CoreML model using excessive RAM during prediction

I have an mlprogram of size 127.2 MB that was created in TensorFlow and then converted to CoreML. When I request a prediction, memory usage shoots up to 2-2.5 GB every time. I've tried the optimization techniques in coremltools, but nothing seems to work; it still shoots up to the same 2-2.5 GB of RAM every time. I've attached a graph; it doesn't seem to be a leak, as the memory then goes back down.

Hi @Michi314,

I am running into a similar issue, where the memory usage is around 2GB for a 100MB model. Did you find a solution to this?

When instantiating the CoreML model, try passing an MLModelConfiguration object. Play around with the config's computeUnits options. For example:

import CoreML

let config = MLModelConfiguration()
config.computeUnits = .cpuOnly

// init(contentsOf:configuration:) throws, so this needs try
let model = try MLModel(contentsOf: localModelUrl, configuration: config)

In my case .cpuOnly worked best, but it differs from model to model; try the other options as well.
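If you want to compare the options side by side, here is a minimal sketch; it assumes the same localModelUrl as above, and that you run your real prediction inside the loop while watching Xcode's memory gauge or Instruments. Note that .cpuAndNeuralEngine requires iOS 16 / macOS 13 or later.

import CoreML

// Sketch: load the same model with each compute unit option and
// compare the memory footprint of a prediction for each one.
let options: [MLComputeUnits] = [.cpuOnly, .cpuAndGPU, .cpuAndNeuralEngine, .all]

for units in options {
    let config = MLModelConfiguration()
    config.computeUnits = units
    let model = try MLModel(contentsOf: localModelUrl, configuration: config)
    // Run a prediction with your real input here and observe memory usage.
    _ = model
}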

The Core ML runtime needs to allocate intermediate tensors, which can be large depending on the model architecture, the data type, and the compute device.
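On the app side, one thing worth trying (a sketch only; it releases transient autoreleased buffers sooner, but does not lower the peak allocation the runtime itself needs for intermediate tensors) is wrapping each prediction in an autoreleasepool:

import CoreML

// Sketch: an autoreleasepool releases transient buffers as soon as
// the prediction returns, instead of at the end of the run loop turn.
func runPrediction(model: MLModel, input: MLFeatureProvider) throws -> MLFeatureProvider {
    try autoreleasepool {
        try model.prediction(from: input)
    }
}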

It's hard to say more without looking at the actual model. We would appreciate it if you could submit a problem report through Feedback Assistant with the model attached. A few important data points to include in the report are:

  1. The model (.mlpackage, .mlmodel, or .mlmodelc file)
  2. The compute unit (See https://developer.apple.com/documentation/coreml/mlmodelconfiguration/computeunits)
  3. (If the model uses a flexible shape) the shape of the actual input (one way to capture it is sketched below).
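For point 3, here is a minimal sketch of capturing the concrete shape you feed a flexible-shape model. The feature name "input" and the shape used here are hypothetical; substitute your model's actual input name and dimensions.

import CoreML

// Hypothetical example: build the input explicitly so the concrete
// shape handed to the flexible-shape model can be quoted in the report.
let array = try MLMultiArray(shape: [1, 3, 512, 512], dataType: .float32)
let input = try MLDictionaryFeatureProvider(dictionary: ["input": array])
print("input shape:", array.shape)   // e.g. [1, 3, 512, 512]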