CoreML / Vision - Penultimate network layer output?

I have a working CoreML app which loads the InceptionV3 mlmodel supplied by Apple on the Machine Learning page. (https://developer.apple.com/machine-learning/)


I am able to successfully run inference on pixel buffers. However, I am curious whether it is possible to introspect the output of the neural network's penultimate layer for an inference operation to get a feature vector of weights. These weights are quite useful for doing similarity metrics in an abstract, high-dimensional 'feature space'.


Looking at the Vision and CoreML documentation, it does not appear that one can introspect a model for this information. Looking deeper at BNNS in Accelerate.framework, there is a BNNSVectorDescriptor, but I see no way to climb down from CoreML or Vision to the underlying Accelerate operations.


Does one have to edit a model to explicitly mark multiple outputs? If so, how can one edit an existing mlmodel? The file format docs are not particularly insightful (to me).


Thank you!


Answers

There does not seem to be a way to do this with Core ML. However, if you implement the model with Metal Performance Shaders you have full control over all the layers. Apple has a sample project for MPS that uses Inception-v3 that you can look at. But it's definitely not as easy to use as Core ML!

Thank you for trying out our beta version! Although the Core ML API has been kept simple for ease of use, more custom cases can be handled by changing the .mlmodel file.

What you want can be done by modifying the .mlmodel using coremltools. You would need to know a bit about the structure of the mlmodel to carry out this procedure. Please refer to the CoreML specification .proto files.


Depending on your use case there are two possibilities:


If you just need the intermediate output:


Steps to modify the model:

  1. Load the model
  2. Print some info about the last few layers of the neural network
  3. Figure out the shape of the intermediate output blob to which you need access
  4. Remove the last layer(s) that you do not need
  5. Remove the corresponding output descriptions.
  6. Add the desired output in the model description. Make sure the shape is correct.
  7. Change the neural network type from classifier to generic
  8. Save as a new model


Example code to do this, using the InceptionV3 model provided in the Core ML model gallery:


import coremltools

# Load the model spec (a protobuf message) and print which model type it holds
spec = coremltools.utils.load_spec("InceptionV3.mlmodel")
print(spec.WhichOneof('Type'))
nn = spec.neuralNetworkClassifier

# Look at the types, inputs and output names of the last few layers
for i in range(-1, -4, -1):
  layer = nn.layers[i]
  print(" layer index = %d, Type: %s, input: %s, output: %s" % (i, layer.WhichOneof('layer'), layer.input[0], layer.output[0]))

# The input size of the inner product layer is the length of the feature vector
C = nn.layers[-2].innerProduct.inputChannels
print('Input channels for the Inner product layer: ', C)

# Remove the last 2 layers: InnerProduct, Softmax
del nn.layers[-1]
del nn.layers[-1]

# Remove the 2 output descriptions (top class label and class probabilities)
del spec.description.output[-1]
del spec.description.output[-1]

# Add an output description
new_output = spec.description.output.add()
new_output.name = 'flatten_output'  # same as the output name of the intermediate layer we want to access
new_output.shortDescription = 'penultimate layer output, before softmax'
new_output_params = new_output.type.multiArrayType
new_output_params.dataType = coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.Value('DOUBLE')
new_output_params.shape.extend([1, 1, C, 1, 1])

# Change the model type to a generic neuralNetwork:
# carry over the layers and preprocessing from the classifier
layers_copy = nn.layers
preprocessing_copy = nn.preprocessing
spec.neuralNetwork.layers.extend(layers_copy)
spec.neuralNetwork.preprocessing.extend(preprocessing_copy)
print(spec.WhichOneof('Type'))

# Save the modified spec
coremltools.utils.save_spec(spec, "InceptionV3_modified.mlmodel")
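
Once the modified model returns the flatten_output array, feature vectors from two images can be compared with a similarity metric. Here is a minimal sketch of cosine similarity in plain Python. It assumes each output has already been flattened into a list of floats; the function name and the toy vectors are illustrative and not part of Core ML or coremltools.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

# Toy vectors standing in for two flattened 'flatten_output' arrays
features_a = [0.5, 0.0, 1.5, 2.0]
features_b = [1.0, 0.0, 3.0, 4.0]
print(cosine_similarity(features_a, features_b))
```

Values close to 1 mean the two vectors point in nearly the same direction in feature space; values near 0 mean they are nearly orthogonal.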



If you need the intermediate as well as the final classifier output:


Currently, only “dangling” output blobs (those that do not feed into any other layer) are exposed as overall model outputs. To get at an intermediate output, you can add a layer that reads it and exposes a copy as a dangling output.
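
To make the idea of a dangling blob concrete, here is a small sketch that works on a plain list of dictionaries rather than a real model spec. The layer and blob names (other than flatten_output) and the helper function are hypothetical and only mirror the input/output naming that Core ML layers use.

```python
def dangling_outputs(layers):
    """Return output blob names that no layer consumes as an input."""
    consumed = set()
    for layer in layers:
        consumed.update(layer["input"])
    dangling = []
    for layer in layers:
        for name in layer["output"]:
            if name not in consumed:
                dangling.append(name)
    return dangling

# Hypothetical three-layer tail of a classifier (names are illustrative)
layers = [
    {"name": "flatten", "input": ["pool_out"], "output": ["flatten_output"]},
    {"name": "dense", "input": ["flatten_output"], "output": ["dense_output"]},
    {"name": "softmax", "input": ["dense_output"], "output": ["class_probs"]},
]
print(dangling_outputs(layers))  # ['class_probs']

# An identity layer that reads 'flatten_output' creates a second dangling blob
layers.append({"name": "tap", "input": ["flatten_output"], "output": ["my_intermediate_output"]})
print(dangling_outputs(layers))  # ['class_probs', 'my_intermediate_output']
```

In the first call flatten_output is consumed by the next layer, so it is not dangling; after the extra layer is appended, its copy is.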


Steps to modify the mlmodel:


  1. Load the model
  2. Print some info about the last few layers of the neural network
  3. Figure out the shape of the intermediate output blob to which you need access
  4. Add an identity layer to fetch the desired output
  5. Add an output in the model description. Make sure the shape is correct.
  6. Save as a new model


This gives you an mlmodel that returns the original information as well as the new intermediate output blob.


Example code to do this, using the InceptionV3 model provided in the Core ML model gallery:


import coremltools

# Load the model spec (a protobuf message) and print which model type it holds
spec = coremltools.utils.load_spec("InceptionV3.mlmodel")
print(spec.WhichOneof('Type'))
nn = spec.neuralNetworkClassifier

# Look at the types, inputs and output names of the last few layers
for i in range(-1, -4, -1):
  layer = nn.layers[i]
  print(" layer index = %d, Type: %s, input: %s, output: %s" % (i, layer.WhichOneof('layer'), layer.input[0], layer.output[0]))

# The input size of the inner product layer is the length of the feature vector
C = nn.layers[-2].innerProduct.inputChannels
print('Input channels for the Inner product layer: ', C)

# Expose the intermediate output blob through a new dangling layer
new_layer = nn.layers.add()
new_layer.name = 'my_intermediate_output_layer'  # give any name here
new_layer.input.append('flatten_output')  # same as the output name of the intermediate layer we want to access
new_layer.output.append('my_intermediate_output')  # give any name here
# Linear activation f(x) = alpha*x + beta; with alpha = 1 and beta left at its default of 0 this is an identity transformation
new_layer.activation.linear.alpha = 1.0

# Add a new output description
new_output = spec.description.output.add()
new_output.name = 'my_intermediate_output'  # same name as the output of the newly added layer above
new_output.shortDescription = 'Flatten layer output, penultimate layer'
new_output_params = new_output.type.multiArrayType
new_output_params.dataType = coremltools.proto.FeatureTypes_pb2.ArrayFeatureType.ArrayDataType.Value('DOUBLE')
new_output_params.shape.extend([1, 1, C, 1, 1])  # shape should be in order [Seq, Batch, channel, height, width] or [channel, height, width]

# Save the modified spec
coremltools.utils.save_spec(spec, "InceptionV3_modified.mlmodel")


In future updates, we may expose the ability to access output blobs that are not dangling. In that case, there would be no need to add an additional identity layer. Simply adding a new output layer description with the correct name and shape would suffice.