Some models incur considerable memory overhead when running inference on an iOS device, and we've seen crashes caused by the process running out of memory. It would be nice to see the total memory overhead incurred by running inference. A breakdown of memory consumption per compute unit, or even per layer, would be nice, but I'm guessing that isn't even well defined, so an overall figure along the lines of "this is the amount of memory pressure using this model in your app is expected to add" would be great.
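In the meantime, one way to approximate this overhead yourself is to sample the process's physical memory footprint before and after model load and prediction, using the Darwin `task_info` call with the `TASK_VM_INFO` flavor (its `phys_footprint` field is the same figure Xcode's memory gauge reports). A rough sketch, assuming a loaded `MLModel` named `model` and an input named `input` (both hypothetical placeholders):

```swift
import Foundation

// Returns this process's physical memory footprint in bytes
// (the value Xcode's memory gauge shows), or nil on failure.
func currentMemoryFootprint() -> UInt64? {
    var info = task_vm_info_data_t()
    var count = mach_msg_type_number_t(
        MemoryLayout<task_vm_info_data_t>.size / MemoryLayout<integer_t>.size)
    let result = withUnsafeMutablePointer(to: &info) {
        $0.withMemoryRebound(to: integer_t.self, capacity: Int(count)) {
            task_info(mach_task_self_, task_flavor_t(TASK_VM_INFO), $0, &count)
        }
    }
    guard result == KERN_SUCCESS else { return nil }
    return info.phys_footprint
}

// Hypothetical usage around inference:
// let before = currentMemoryFootprint() ?? 0
// let prediction = try model.prediction(input: input)
// let after = currentMemoryFootprint() ?? 0
// print("Inference added roughly \((after - before) / 1_048_576) MB")
```

This only captures memory attributed to your process, so work done out-of-process (e.g. on the Neural Engine) may not show up in the delta, which is part of why a first-party total-overhead figure would be so useful.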