I have an mlprogram of size 127.2 MB. It was created using TensorFlow and then converted to Core ML. When I request a prediction, memory usage shoots up to 2-2.5 GB every time. I've tried the optimization techniques in coremltools, but nothing seems to work; it still shoots up to the same 2-2.5 GB of RAM every time. I've attached a graph showing it doesn't appear to be a leak, since the memory goes back down afterwards.
CoreML model using excessive RAM during prediction
Hi @Michi314,
I am running into a similar issue, where memory usage is around 2 GB for a 100 MB model. Did you find a solution to this?
When instantiating the Core ML model, try passing an MLModelConfiguration object. Play around with the config's computeUnits options. For example:
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: localModelUrl, configuration: config)
In my case .cpuOnly worked best, but it's different for different models, so try the other options as well.
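To compare the options systematically, a sketch like the one below loads the same model once per computeUnits setting so you can profile each configuration's memory behavior (e.g. with Instruments or Xcode's memory gauge). This is illustrative, not from the original post: localModelUrl is assumed to be the compiled model URL from the snippet above, and the runPrediction step is a placeholder for your own prediction call.

```swift
import CoreML

// Candidate compute-unit settings to benchmark. .cpuAndNeuralEngine also
// exists on newer OS versions if your deployment target allows it.
let options: [(name: String, units: MLComputeUnits)] = [
    ("cpuOnly", .cpuOnly),
    ("cpuAndGPU", .cpuAndGPU),
    ("all", .all),
]

for option in options {
    let config = MLModelConfiguration()
    config.computeUnits = option.units
    do {
        // localModelUrl: compiled .mlmodelc URL (assumed, as in the snippet above)
        let model = try MLModel(contentsOf: localModelUrl, configuration: config)
        print("Loaded model with computeUnits = \(option.name)")
        // Run a representative prediction here and observe peak memory
        // in Instruments before moving on to the next configuration.
        _ = model
    } catch {
        print("Failed to load with computeUnits = \(option.name): \(error)")
    }
}
```

Loading fresh for each setting matters because Core ML compiles a separate execution plan per configuration; reusing one instance would only ever exercise the first plan.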