Getting ValueError: Categorical Cross Entropy loss layer input (Identity) must be a softmax layer output.

I am working through the updatable neural network classifier example in the coremltools documentation (https://coremltools.readme.io/docs/updatable-neural-network-classifier-on-mnist-dataset).

I am using the same code, but I get an error saying that coremltools.converters.keras.convert does not exist. I know this can be a coremltools version issue; I am currently on coremltools 6.2, so I converted the model to an mlmodel with coremltools.convert instead, and the conversion succeeded.
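For reference, this is roughly how the conversion looks on my end (keras_model is the tf.keras Sequential model from the tutorial; the variable and file names are mine):

import coremltools as ct

# The old coremltools.converters.keras.convert API was removed;
# the unified converter takes the tf.keras model directly.
mlmodel = ct.convert(
    keras_model,
    convert_to='neuralnetwork',  # NeuralNetworkBuilder works on the neural network backend, not mlprogram
)
mlmodel.save('mnist_classifier.mlmodel')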

But then the make_updatable function fails with an error saying the loss layer input must be a softmax layer output. From the coremltools API reference I found this is because the layer type is softmaxND, while the loss layer expects softmax.

The problem is that when I convert the Keras Sequential model to a Core ML model, the layer names and types change, and the softmax layer becomes softmaxND.
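You can confirm what the converter produced by dumping the layer types straight off the spec, something like this (assuming the model converted as a classifier; use spec.neuralNetwork instead for a plain network):

spec = mlmodel.get_spec()
for layer in spec.neuralNetworkClassifier.layers:
    # WhichOneof('layer') reports the layer type, e.g. 'softmaxND'
    print(layer.name, layer.WhichOneof('layer'))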

Has anyone faced this issue?

If I execute builder.inspect_layers(last=4), I get this output:

[Id: 32], Name: sequential/dense_1/Softmax (Type: softmaxND)
          Updatable: False
          Input blobs: ['sequential/dense_1/MatMul']
          Output blobs: ['Identity']
[Id: 31], Name: sequential/dense_1/MatMul (Type: batchedMatmul)
          Updatable: False
          Input blobs: ['sequential/dense/Relu']
          Output blobs: ['sequential/dense_1/MatMul']
[Id: 30], Name: sequential/dense/Relu (Type: activation)
          Updatable: False
          Input blobs: ['sequential/dense/MatMul']
          Output blobs: ['sequential/dense/Relu']

In the make_updatable function, when I execute builder.set_categorical_cross_entropy_loss(name='lossLayer', input='Identity'), I get this error:

ValueError: Categorical Cross Entropy loss layer input (Identity) must be a softmax layer output.

Replies

Hey, I was having the same issue when converting a PyTorch model (softmaxND). I think I found a workaround that may or may not work for you. I'm not a Keras user, so apologies for my lack of knowledge of that framework. In PyTorch you are encouraged not to add the softmax layer at the end of the model at all; instead you use one of the built-in loss functions, which apply it as part of the loss evaluation for numerical reasons. So it is actually better for me if my original torch model doesn't include that layer. Either way, if you can peel that softmax layer off, you can just add it back in with the builder to make coremltools happy.
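For example, this is the pattern I mean on the torch side (a minimal sketch, layer sizes made up):

import torch.nn as nn

# The model ends in raw logits -- no Softmax layer at all.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
# nn.CrossEntropyLoss applies log-softmax internally, so the
# softmax never has to exist in the exported graph.
criterion = nn.CrossEntropyLoss()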

Here is a snippet of my function.

import coremltools as ct
from coremltools.models.neural_network import SgdParams

def make_updateable(mlmodel, update_layer_names):
    model_spec = mlmodel.get_spec()
    builder = ct.models.neural_network.NeuralNetworkBuilder(spec=model_spec, mode='classifier')
    # Manually add a softmax at the end; 'output' is my model's
    # raw-logits output blob -- substitute your own blob name.
    builder.add_softmax(name='output_prob', input_name='output', output_name='output_prob')
    # update_layer_names: the layers you want trainable on-device
    builder.make_updatable(update_layer_names)
    builder.inspect_layers()            # prints the layer list itself
    print(builder.layers)
    builder.inspect_output_features()

    builder.set_sgd_optimizer(SgdParams(lr=0.01, batch=10))
    builder.set_epochs(10)
    # Point the loss at the softmax we just added, not at the raw output
    builder.set_categorical_cross_entropy_loss(name='lossLayer', input='output_prob')

    # Save the Core ML model
    updateable_model = ct.models.MLModel(model_spec)
    updateable_model.save('classifier_updateable.mlmodel')
    return updateable_model
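And I call it like this (the layer names are from my model; yours will differ):

mlmodel = ct.models.MLModel('classifier.mlmodel')
make_updateable(mlmodel, update_layer_names=['dense_1'])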

Hope that helps.