CoreML not using Neural Engine even though it should

When I run the performance test on a Core ML model, it shows predictions are 834% faster on the Neural Engine than on the GPU.

It also shows that 100% of the model can run on the Neural Engine (performance report screenshot).

GPU only (performance report screenshot).

But when I set the compute units to all:

let config = MLModelConfiguration()
config.computeUnits = .all  // allow CPU, GPU, and Neural Engine

and profile, the trace shows that the Neural Engine isn't used at all. The one exception is loading the model, which takes 25 seconds when the Neural Engine is allowed versus less than a second when it is not (profiler screenshot).

The difference in speed is the difference between an app that is too slow to even release and one with quite reasonable performance. I have a lot of work invested in this, so I am really hoping that I can get it to run on the Neural Engine.

Why isn't it actually running on the Neural Engine when the performance report shows it is supported and the compute units are set to allow the Neural Engine?

Accepted Reply

I figured it out; apparently flexible shapes do not run on the ANE (Apple Neural Engine).

I really wish this were documented; the docs just state to use enumerated shapes for best performance.

But in this case, using flexible shapes is nearly 10 times slower, and I don't understand why they are supported at all with that kind of penalty.

Avoiding flexible shapes would have saved me a lot of trouble, since I now need to refactor inference in shipped products. There is a good chance this is why one of the products I spent six months of my life developing has largely been a flop. Very frustrating.
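For anyone hitting the same wall: these shapes are fixed when the model is converted, not through MLModelConfiguration. Here is a minimal coremltools sketch of converting with enumerated shapes instead of a flexible range (traced_model and the input name "input" are placeholders for your own):

import coremltools as ct

# Enumerated shapes: a fixed list of allowed input sizes. Core ML can
# specialize the model for each one, which keeps it eligible for the ANE.
enumerated = ct.EnumeratedShapes(
    shapes=[[1, 3, 64, 64], [1, 3, 128, 128], [1, 3, 256, 256]],
    default=[1, 3, 64, 64],
)

# traced_model stands in for your traced/scripted PyTorch model.
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input", shape=enumerated)],
    convert_to="mlprogram",
)
mlmodel.save("Model.mlpackage")

At prediction time you then resize each input to whichever of the enumerated sizes fits best.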

  • Hi! Could you please explain what enumerated shapes and flexible shapes mean? To my understanding, you are talking about a CNN, and that we can identify beforehand what shape the data is going to be after each layer. Correct?

    If so, how can I provide such shapes to the Core ML configuration?

  • Enumerated shapes allow you to specify a list of allowed input and output dimensions (e.g., 64, 128, 256). Flexible shapes allow you to specify a range of shape dimensions (e.g., 64–256); a conversion sketch for this case follows these replies.

    The shape of each layer is derived from the input shape.

    Documentation for enumerated and flexible shapes.

  • Hi @3DTOPO. Thanks for posting the question as well as your findings! Do you know whether enumerated and flexible shapes are supported by the GPU? I ran into an issue where some operators are not supported by the GPU when using enumerated and flexible shapes. Thanks in advance!
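For contrast with the enumerated-shapes sketch above, this is roughly how a flexible (range) shape is declared at conversion time; per the findings in this thread, this is the variant that falls off the ANE (same placeholders as before):

import coremltools as ct

# Flexible (range) shapes: any height/width from 64 to 256 is accepted
# at runtime; per this thread's finding, such models do not run on the ANE.
flexible = ct.Shape(
    shape=[
        1,
        3,
        ct.RangeDim(lower_bound=64, upper_bound=256, default=64),
        ct.RangeDim(lower_bound=64, upper_bound=256, default=64),
    ]
)

mlmodel = ct.convert(
    traced_model,  # placeholder for your traced/scripted PyTorch model
    inputs=[ct.TensorType(name="input", shape=flexible)],
    convert_to="mlprogram",
)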
