Dynamic input size strongly impacts Neural Engine performance

We use dynamic input sizes for some use cases. When the compute unit mode is .all, there is a strong difference in execution time if the dynamic input shape doesn't match the optimal shape. If we set the model's optimal input shape to 896x896 but run it with an input shape of 1024x768, execution takes almost twice as long as with an 896x896 input.

For example, a model configured with a preferred input shape of 896x896 can run inference in 66 ms when the input shape is 896x896, but the same model takes 117 ms when the input shape is 1024x768.

In that case, if we want the best inference performance, we have to switch from one model to another depending on the input shape, which is not dynamic at all and wastes memory. Is there a way to reduce the execution time when the shape falls outside the preferred shape range? (For context, a sketch of how we set up the range-flexible model is below.)
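
For context, here is a minimal sketch of how such a range-flexible model might be produced with coremltools; the stand-in network, input name, and bounds are hypothetical, not the actual model from this post.

```python
import torch
import coremltools as ct

# Hypothetical stand-in network; replace with your own model.
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).eval()
traced = torch.jit.trace(model, torch.rand(1, 3, 896, 896))

# Range flexibility: height and width may vary within bounds,
# with 896x896 as the default (preferred) shape.
flexible_shape = ct.Shape(
    shape=(
        1,
        3,
        ct.RangeDim(lower_bound=256, upper_bound=1024, default=896),
        ct.RangeDim(lower_bound=256, upper_bound=1024, default=896),
    )
)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input", shape=flexible_shape)],
    convert_to="mlprogram",
)
mlmodel.save("model_range.mlpackage")
```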


Replies

Hello, this is a valid observation. Support for range flexibility with the Neural Engine does have a few limitations. Would you mind filing a bug report on feedbackassistant.apple.com? One way to work around this issue is to use enumerated flexibility. This lets you enumerate all shapes ahead of time, so Core ML can prepare the model for Neural Engine inference for all of those shapes in advance. Is that possible for your use case?
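
A minimal sketch of enumerated flexibility with coremltools, assuming the same hypothetical stand-in network and input name as above; the two shapes listed are just the ones mentioned in this thread:

```python
import torch
import coremltools as ct

# Hypothetical stand-in network; replace with your own model.
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).eval()
traced = torch.jit.trace(model, torch.rand(1, 3, 896, 896))

# Enumerate every input shape you plan to use at runtime, so Core ML
# can prepare the model for the Neural Engine for each of them ahead of time.
enumerated_shapes = ct.EnumeratedShapes(
    shapes=[
        [1, 3, 896, 896],
        [1, 3, 1024, 768],
    ],
    default=[1, 3, 896, 896],
)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input", shape=enumerated_shapes)],
    convert_to="mlprogram",
)
mlmodel.save("model_enumerated.mlpackage")
```

Note that with enumerated flexibility, inputs at prediction time must exactly match one of the listed shapes; the trade-off versus range flexibility is giving up arbitrary sizes in exchange for shapes the Neural Engine can be prepared for in advance.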