Can't apply compression techniques on my CoreML Object Detection model.

import coremltools as ct
from coremltools.models.neural_network import quantization_utils

# placeholder path; point this at your own Create ML object detector
modelPath = "MyObjectDetector.mlmodel"

# load the full-precision model
model_fp32 = ct.models.MLModel(modelPath)

# quantize the weights to 16-bit floats
model_fp16 = quantization_utils.quantize_weights(model_fp32, nbits=16)

model_fp16.save("reduced-model.mlmodel")

I'm testing it with the model from one of Apple's sample code projects (GameBoardDetector), and it works fine: it reduces the model size by half. But there are several problems with my own model (trained in the Create ML app using Full Network):

  1. Quantizing to float 16 does not work (the new file is only about 0.1 MB smaller).
  2. Quantizing below 16 bits causes errors and no file gets created (see the sketch after this list).
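For reference, this is a minimal sketch of the lower-bit call I mean, continuing from the snippet above; nbits=8 is just one example, and the same failure shows up for other values below 16:

# attempt 8-bit weight quantization on the same model
model_8bit = quantization_utils.quantize_weights(model_fp32, nbits=8)
model_8bit.save("reduced-model-8bit.mlmodel")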

Here are the additional metadata and precision of both models.

Working model's additional metadata and precision:

My model's additional metadata and precision:

Thanks for trying this. If you train and save an object detector (OD) with the Create ML app today, with either Full Network or Transfer Learning, it is saved with FP16 weights. The GameBoardDetector model was trained with Transfer Learning but was saved with FP32 weights at the time, which is no longer the case.

That is why you observed a model-size saving in your experiment for the GameBoardDetector.mlmodel that came with the sample app, but not for the re-trained Full Network OD model.
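If you want to double-check this, here is a rough sketch that inspects the weight precision of the convolution layers in a saved model; it assumes the usual Create ML object detector layout (a pipeline with a neural network stage followed by non-maximum suppression) and uses a placeholder path:

import coremltools as ct

# placeholder path; point this at the .mlmodel you want to inspect
spec = ct.utils.load_spec("MyObjectDetector.mlmodel")

# Create ML object detectors are saved as pipelines; walk the sub-models
# and look at the neural network stage.
for sub_model in spec.pipeline.models:
    if sub_model.WhichOneof("Type") != "neuralNetwork":
        continue
    for layer in sub_model.neuralNetwork.layers:
        if layer.WhichOneof("layer") == "convolution":
            weights = layer.convolution.weights
            # A non-empty float16Value means the weights are already FP16,
            # so FP16 "quantization" cannot shrink the file further.
            precision = "fp16" if len(weights.float16Value) > 0 else "fp32"
            print(layer.name, precision)

If the layers already report fp16, the only way to shrink the file further is lower-bit quantization (nbits of 8 or below), which is the case that errors out for you.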

The error you ran into when going to lower bit widths is something we can look into.

Here is the error and full output.

("Failed to load" appears even in successful operation, so seems like it's not the issue)
