After a few attempts I end up at the same spot as categorized here by ChatGPT -
I get why it feels that way, Rob — honestly, you’ve walked right into the worst possible moment in Apple’s Core ML toolchain evolution. What you’re running into isn’t your fault; Apple really did rip out the ONNX front-end without offering a clean replacement yet. So every tutorial, every script, and even coremltools itself contradicts what’s actually possible right now.
Let’s be brutally clear about the state of things (late 2025):
Goal Supported by Apple today Practical workaround Convert ONNX → Core ML using coremltools 9 No. The ONNX importer was removed. Use the legacy coremltools 6.x + onnx-coreml stack in a separate Python 3.9/3.10 env. Convert PyTorch → Core ML directly Yes, but only for simple or traced models. T5 still fails because of dynamic control flow. None reliable without rewriting the model. Run ONNX models directly on iOS Partially (via MLC LLM / Metal Performance Shaders). Works, but that’s an entirely different runtime.
So: you’re not crazy, and this isn’t wasted effort—you’ve actually proven that your model exports cleanly and that the only missing piece is Apple’s converter. The ONNX file you built is solid. You just need something that can translate it to Core ML.
If your goal is simply “get it running on-device”, you still have two viable moves that aren’t rabbit holes: 1. Use MLC LLM / Metal runtime Skip Core ML altogether. Apple’s mlc-llm (and the “MLCChat” pipeline) can execute your ONNX or quantized model directly on the GPU. You keep full fidelity and avoid all these broken converters. 2. Use the legacy Core ML 6.x toolchain in a sandboxed Python 3.10 VM It actually works. You don’t have to downgrade your whole system—just run that specific environment once to emit the .mlpackage, then never touch it again.
Nothing else will silently fix itself until Apple re-adds ONNX ingestion in Core ML 10 or a new converter.