On the macOS side we use coremltools to convert the model from PyTorch to a Core ML mlpackage. When converting, we select convert_to="mlprogram", minimum_deployment_target=ct.target.iOS16, and compute_units=ct. The output model is fp16. At inference time, we found that the CPU and GPU results are very close and correct, but the results are seriously abnormal when using the NPU backend. The phone is running iOS 16.2, and the model's ops are running on the NPU (the inference time is shorter).
Core ML model: abnormal results on the NPU
Since the fp16 model works numerically fine on the GPU, this could possibly be a precision bug in the Float16 Neural Engine path. Could you please file a bug report with the model attached and the exact configuration you tested on (iOS version, phone model, etc.), along with the model inputs, to help reproduce the issue?
Sorry, for privacy reasons the model cannot be provided at this time. The phone is an iPhone 14 running iOS 16.2. The input shape is 1 * 3 * h * w; this is the code for the conversion.