PyTorch to CoreML Model inaccuracy

I am currently working on a 2D pose estimator. I developed a PyTorch vision-transformer-based model with 17 joints in COCO format and then converted it to CoreML using coremltools version 6.2.

The model was trained on a custom dataset. However, upon running the converted model on iOS, I observed a significant drop in accuracy. This video (https://youtu.be/EfGFrOZQGtU) shows the outputs of the PyTorch model (on the left) and the CoreML model (on the right).

Could you please confirm if this drop in accuracy is expected and suggest any possible solutions to address this issue? Please note that all preprocessing and post-processing techniques remain consistent between the models.

P.S. While converting, I also got the following warning:

TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):

P.P.S. When we initialize the CoreML model on iOS 17.0, we get this error:

Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
This neural network model does not have a parameter for requested key 'precisionRecallCurves'. Note: only updatable neural network models can provide parameter values and these values are only accessible in the context of an MLUpdateTask completion or progress handler.
Answered by Engineer in 790004022

Hi krissb78,

I watched the video and Core ML seems to be doing ok mostly, so I believe the PyTorch model is translated correctly to Core ML format.

Now, about accuracy: assuming you converted from PyTorch with coremltools, you may try compute_precision=ct.precision.FLOAT32 in your ct.convert call and see if it gets better. If it does, the likely cause is that your model is not fp16-representable.
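To make "fp16-representable" concrete, here is a minimal sketch using only the Python standard library. It round-trips plain floats through IEEE 754 half precision and reports which ones overflow or lose precision. The helper names (to_fp16, fp16_report) and the sample values are hypothetical and only for illustration; they are not taken from the model in this thread.

```python
import struct

def to_fp16(value: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def fp16_report(values):
    """Return (value, fp16 value or None on overflow, relative error) per input."""
    report = []
    for v in values:
        try:
            h = to_fp16(v)
        except OverflowError:
            # Magnitude is too large for a finite half-precision value
            # (the largest finite fp16 value is 65504).
            report.append((v, None, None))
            continue
        rel = abs(h - v) / abs(v) if v != 0 else 0.0
        report.append((v, h, rel))
    return report

# Hypothetical sample values: an exact one, an inexact one,
# one that overflows, and one that underflows to zero.
for row in fp16_report([1.0, 0.1, 70000.0, 1e-8]):
    print(row)
```

If intermediate activations or weights behave like the last two samples, running in fp16 can change the output noticeably, which is the situation the FLOAT32 setting is meant to rule out.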

The tracer warning is from PyTorch. It looks like your model has data-dependent control flow, which may not be fully supported by torch.jit.trace. If so, please try to modify your model so that it does not rely on control flow (e.g. x.numel() == 0 checks for an empty tensor, and most PyTorch ops should handle empty and non-empty tensors alike).
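The effect behind the tracer warning can be reproduced with a small sketch. The module below (Guarded, a hypothetical name) is not the model from this thread; it only has a data-dependent Python branch. torch.jit.trace records whichever branch the example input takes, so the traced module replays that branch for every later input.

```python
import torch

class Guarded(torch.nn.Module):
    def forward(self, x):
        # Data-dependent Python branch: tracing evaluates it once and
        # records only the path taken for the example input.
        if x.sum() > 0:
            return x * 2
        return x - 1

model = Guarded()
example = torch.ones(1, 4)                 # takes the "x * 2" branch
traced = torch.jit.trace(model, example)   # emits a TracerWarning

probe = -3 * torch.ones(1, 4)
eager_out = model(probe)     # eager mode takes the "x - 1" branch
traced_out = traced(probe)   # traced module still computes "x * 2"

print(eager_out)
print(traced_out)
```

For a guard such as x.numel() == 0 on a model that is always traced and run with a fixed, non-empty input shape, the recorded branch may well be the intended one, in which case the warning would not explain an accuracy difference. That is an assumption worth checking against the actual model code.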

If you do find anything wrong with coremltools, please feel free to file an issue at https://github.com/apple/coremltools/issues

Thanks @Engineer for taking a look.

As suggested, we used a FLOAT32 compute_precision, but it didn't help: we still saw jitter in on-device inference compared with PyTorch inference on a server.

One interesting fact: we took the Core ML model (originally converted from PyTorch using coremltools), ran it in Python using coremltools, and didn't encounter the jitter! So our best guess at this stage is that it has something to do with running the Core ML model on the device itself (tested on an iPhone 14 with iOS 17 and on an M2 Mac with macOS Sonoma 14.5, running as an iPad app).
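One way to narrow this down is to feed the same saved frames to both runs and compare the predicted keypoints numerically rather than by eye. The sketch below assumes each frame's output has already been exported as a list of (x, y) pairs, one per joint; the function names and the sample numbers are hypothetical.

```python
import math

def per_joint_error(ref, test):
    """Euclidean distance per joint between two lists of (x, y) keypoints."""
    return [math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in zip(ref, test)]

def jitter(frames):
    """Mean frame-to-frame displacement per joint over a sequence of frames."""
    n_joints = len(frames[0])
    totals = [0.0] * n_joints
    for prev, cur in zip(frames, frames[1:]):
        for j, d in enumerate(per_joint_error(prev, cur)):
            totals[j] += d
    steps = max(len(frames) - 1, 1)
    return [t / steps for t in totals]

# Hypothetical two-joint outputs for the same frame from the two runs.
python_run = [(10.0, 10.0), (20.0, 20.0)]
device_run = [(10.0, 10.0), (23.0, 24.0)]
print(per_joint_error(python_run, device_run))

# Hypothetical three-frame sequence with a single joint.
print(jitter([[(0.0, 0.0)], [(3.0, 4.0)], [(3.0, 4.0)]]))
```

If the per-joint error on identical input is near zero while the live app still jitters, the difference more likely sits in the capture or pre/post-processing path than in model execution; a large error on identical input points at the on-device execution instead.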

Do you think the warning noted above (copied below) has something to do with it?

Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
This neural network model does not have a parameter for requested key 'precisionRecallCurves'. Note: only updatable neural network models can provide parameter values and these values are only accessible in the context of an MLUpdateTask completion or progress handler.
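If the rejected pooling layers turn out to be stride-1 max pools with "same" padding (an assumption; the thread does not show the layer definitions), a kernel of width 9 or 13 can be expressed as two or three stacked pools of width 5, which stays inside the accepted 1-8 range. The pure-Python sketch below checks that equivalence in one dimension; max pooling is separable, so the same argument applies per axis in 2D. The function names are hypothetical.

```python
def max_pool_1d(xs, kernel):
    """Stride-1 max pool with 'same' padding (padding never wins the max)."""
    half = kernel // 2
    n = len(xs)
    return [max(xs[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def cascaded_pool(xs, kernel, times):
    """Apply the same stride-1 max pool several times in a row."""
    for _ in range(times):
        xs = max_pool_1d(xs, kernel)
    return xs

data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]

# Two width-5 pools cover the same window as one width-9 pool,
# and three cover the same window as one width-13 pool.
print(cascaded_pool(data, 5, 2) == max_pool_1d(data, 9))
print(cascaded_pool(data, 5, 3) == max_pool_1d(data, 13))
```

This only holds for max pooling with stride 1 and odd kernel sizes. Whether rewriting the pools this way changes the on-device results is something that would need to be tested; it is not confirmed anywhere in this thread.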

