Integrate machine learning models into your app using Core ML.

Core ML Documentation

Posts under Core ML subtopic

Post

Replies

Boosts

Views

Activity

CoreML regression between macOS 26.0.1 and macOS 26.1 Beta causing scrambled tensor outputs
We’ve encountered what appears to be a CoreML regression between macOS 26.0.1 and macOS 26.1 Beta. In macOS 26.0.1, CoreML models run and produce correct results. However, in macOS 26.1 Beta, the same models produce scrambled or corrupted outputs, suggesting that tensor memory is being read or written incorrectly. The behavior is consistent with a low-level stride or pointer arithmetic issue — for example, using 16-bit strides on 32-bit data or other mismatches in tensor layout handling. Reproduction Install ON1 Photo RAW 2026 or ON1 Resize 2026 on macOS 26.0.1. Use the newest Highest Quality resize model, which is Stable Diffusion–based and runs through CoreML. Observe correct, high-quality results. Upgrade to macOS 26.1 Beta and run the same operation again. The output becomes visually scrambled or corrupted. We are also seeing similar issues with another Stable Diffusion UNet model that previously worked correctly on macOS 26.0.1. This suggests the regression may affect multiple diffusion-style architectures, likely due to a change in CoreML’s tensor stride, layout computation, or memory alignment between these versions. Notes The affected models are exported using standard CoreML conversion pipelines. No custom operators or third-party CoreML runtime layers are used. The issue reproduces consistently across multiple machines. It would be helpful to know if there were changes to CoreML’s tensor layout, precision handling, or MLCompute backend between macOS 26.0.1 and 26.1 Beta, or if this is a known regression in the current beta.
8
4
2.2k
1w
Regression in EnumeratedShapes support in recent macOS release
Hi, unfortunately I am not able to verify this, but I remember that some time ago I was able to create CoreML models that had one (or more) inputs with an enumerated shape and one (or more) inputs with a static shape. This was some months ago. Since then I updated my macOS to Sequoia 15.5, and when I try to execute MLModels with this setup I get the following error: libc++abi: terminating due to uncaught exception of type CoreML::MLNeuralNetworkUtilities::AsymmetricalEnumeratedShapesException: A model doesn't allow input features with enumerated flexibility to have unequal number of enumerated shapes, but input feature global_write_indices has 1 enumerated shapes and input feature input_hidden_states has 3 enumerated shapes. It may make sense (but not really, though) to verify that all inputs with a flexible enumerated shape have the same number of possible shapes, but this should not preclude also having static-shape inputs, with a single shape defined, alongside the flexible-shape inputs.
6
1
291
May ’25
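For reference, a minimal coremltools sketch of the setup described in the post above: one enumerated-shape input alongside one static-shape input. The tiny model, input names, and shapes here are placeholders, not the poster's actual network.

import coremltools as ct
import torch

class TinyModel(torch.nn.Module):
    # Stand-in for the real network: one flexible-length input, one static input.
    def forward(self, input_hidden_states, global_write_indices):
        return input_hidden_states.mean(dim=1) + global_write_indices.sum()

traced = torch.jit.trace(
    TinyModel().eval(),
    (torch.rand(1, 128, 512), torch.rand(1, 4)),
)

mlmodel = ct.convert(
    traced,
    inputs=[
        # Flexible input: three enumerated sequence lengths.
        ct.TensorType(
            name="input_hidden_states",
            shape=ct.EnumeratedShapes(
                shapes=[(1, 64, 512), (1, 128, 512), (1, 256, 512)]),
        ),
        # Static input: a single fixed shape alongside the flexible one.
        ct.TensorType(name="global_write_indices", shape=(1, 4)),
    ],
)
mlmodel.save("mixed_shapes.mlpackage")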
Core ML Model Performance report shows prediction speed much faster than actual app runs
Hi all, I'm tuning my app's prediction speed with a Core ML model. I watched and tried the methods in the videos "Improve Core ML integration with async prediction" and "Optimize your Core ML usage". I also used Instruments to find the bottleneck that keeps my prediction speed from being faster. Below is the Instruments result with my app; its prediction duration is 10.29 ms. And below is the performance report, which shows an average prediction time of 5.55 ms, about half the time of my app's prediction! Below is part of my Instruments records. I think the predictions should count as quite frequent. Could it be faster? How can I reach the same prediction speed as the performance report? The prediction speed on a MacBook Pro M2 is nearly the same as on a MacBook Air M1!
5
0
1.4k
Oct ’25
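A minimal Swift timing harness along the lines of what the post above describes; MyModel and makeInput() are placeholders for the Xcode-generated class and a model-specific input builder. Loading once and warming up before timing makes app-side numbers more comparable to the Xcode performance report.

import CoreML
import Foundation

// `MyModel` stands in for the Xcode-generated model class and `makeInput()`
// for building one model-specific input; both are placeholders.
func measureAveragePredictionMs(iterations: Int = 100) throws -> Double {
    let config = MLModelConfiguration()
    config.computeUnits = .all
    let model = try MyModel(configuration: config)
    let input = try makeInput()

    // Warm-up: the first predictions can include one-time specialization work
    // that inflates an app-side average.
    for _ in 0..<5 { _ = try model.prediction(input: input) }

    let start = CFAbsoluteTimeGetCurrent()
    for _ in 0..<iterations { _ = try model.prediction(input: input) }
    return (CFAbsoluteTimeGetCurrent() - start) / Double(iterations) * 1000
}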
coreml Fetching decryption key from server failed
My iOS app supports iOS 18, and I’m using an encrypted CoreML model secured with a key generated from Xcode. Every few months (around every 3 months), the encrypted model fails to load for both me and my users. When I investigate, I find this error: coreml Fetching decryption key from server failed: noEntryFound("No records found"). Make sure the encryption key was generated with correct team ID To temporarily fix it, I delete the old key, generate a new one, re-encrypt the model, and submit an app update. This resolves the issue, but only for a while. This is a terrible experience for users and obviously not a sustainable solution. I want to understand: Why is this happening? Is there a known expiration or invalidation policy for CoreML encryption keys? How can I prevent this issue permanently? Any insights or official guidance would be really appreciated.
5
2
673
Jul ’25
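A hedged Swift sketch of the loading pattern relevant to the post above; MyEncryptedModel is a placeholder for the Xcode-generated class of an encrypted model. Encrypted models go through the asynchronous load entry point so Core ML can fetch the decryption key, and a key-fetch failure surfaces as an error there.

import CoreML

// `MyEncryptedModel` is a placeholder for the generated class of an encrypted model.
func loadEncryptedModel(completion: @escaping (MyEncryptedModel?) -> Void) {
    let config = MLModelConfiguration()
    MyEncryptedModel.load(configuration: config) { result in
        switch result {
        case .success(let model):
            completion(model)
        case .failure(let error):
            // Key-fetch / decryption failures (MLModelError code 9 and related)
            // land here; log and retry later rather than crashing.
            print("Encrypted model failed to load: \(error.localizedDescription)")
            completion(nil)
        }
    }
}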
Massive CoreML latency spike on live AVFoundation camera feed vs. offline inference (CPU+ANE)
Hello, I’m experiencing a severe performance degradation when running CoreML models on a live AVFoundation video feed compared to offline or synthetic inference. This happens across multiple models I've converted (including SCI, RTMPose, and RTMW) and affects multiple devices. The Environment OS: macOS 26.3, iOS 26.3, iPadOS 26.3 Hardware: Mac14,6 (M2 Max), iPad Pro 11 M1, iPhone 13 mini Compute Units: cpuAndNeuralEngine The Numbers When testing my SCI_output_image_int8.mlpackage model, the inference timings are drastically different: Synthetic/Offline Inference: ~1.34 ms Live Camera Inference: ~15.96 ms Preprocessing is completely ruled out as the bottleneck. My profiling shows total preprocessing (nearest-neighbor resize + feature provider creation) takes only ~0.4 ms in camera mode. Furthermore, no frames are being dropped. What I've Tried I am building a latency-critical app and have implemented almost every recommended optimization to try and fix this, but the camera-feed penalty remains: Matched the AVFoundation camera output format exactly to the model input (640x480 at 30/60fps). Used IOSurface-backed pixel buffers for everything (camera output, synthetic buffer, and resize buffer). Enabled outputBackings. Loaded the model once and reused it for all predictions. Configured MLModelConfiguration with reshapeFrequency = .frequent and specializationStrategy = .fastPrediction. Wrapped inference in ProcessInfo.processInfo.beginActivity(options: .latencyCritical, reason: "CoreML_Inference"). Set DispatchQueue to qos: .userInteractive. Disabled the idle timer and enabled iOS Game Mode. Exported models using coremltools 9.0 (deployment target iOS 26) with ImageType inputs/outputs and INT8 quantization. Reproduction To completely rule out UI or rendering overhead, I wrote a standalone Swift CLI script that isolates the AVFoundation and CoreML pipeline. The script clearly demonstrates the ~15 ms latency on live camera frames versus the ~1 ms latency on synthetic buffers. (I have attached camera_coreml_benchmark.swift and the CoreML model (a very light low-light enhancement model) to this GitHub repo: https://github.com/pzoltowski/apple-coreml-camera-latency-repro). My Question: Is this massive overhead expected behavior for AVFoundation + Core ML on live feeds, or is this a framework/runtime bug? If expected, what is the Apple-recommended pattern to bypass this camera-only inference slowdown? One thing I found interesting: when running in debug mode, inference was faster (not as fast as in the performance benchmark, but faster than 16 ms). Also, if I ran some dummy calculation on a different DispatchQueue, the model seemed to get slightly faster. So maybe it's related to ANE power-state issues (jitter/SoC wake), with the ANE going to sleep too quickly and taking a long time to wake up? Doing a dummy calculation in the background is probably not a solution, though. Thanks in advance for any insights!
5
0
724
1w
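For concreteness, a sketch of the configuration the post above lists (compute units, optimization hints, IOSurface-backed buffers); the model URL and dimensions are placeholders.

import CoreML
import CoreVideo

// Model URL must point at a compiled .mlmodelc (or use the generated class instead).
func makeLatencyTunedModel(at url: URL) throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    // Optimization hints mentioned in the post (MLOptimizationHints).
    config.optimizationHints.reshapeFrequency = .frequent
    config.optimizationHints.specializationStrategy = .fastPrediction
    return try MLModel(contentsOf: url, configuration: config)
}

// IOSurface-backed pixel buffer matching the 640x480 camera/model format.
func makeIOSurfaceBackedBuffer(width: Int = 640, height: Int = 480) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    let attributes: [String: Any] = [kCVPixelBufferIOSurfacePropertiesKey as String: [String: Any]()]
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32BGRA, attributes as CFDictionary, &buffer)
    return buffer
}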
Error when opening an mlpackage with Xcode
Hello, I'm trying to write a model with PyTorch and convert it to CoreML. I have written other models and they work successfully; even the one that gives the problem does, but I can't visualize it with Xcode to see where it is running. The error that appears is: There was a problem decoding this Core ML document validator error: unable to open file for read Does anyone know why this is happening? Thanks a lot, Álvaro Corrochano
3
0
252
Apr ’25
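As a quick sanity check outside Xcode, a coremltools sketch like the following (path is a placeholder) can tell whether the exported package itself is readable; if it loads and re-saves cleanly here, the problem is more likely on the Xcode side. Running xcrun coremlcompiler compile MyModel.mlpackage out/ is another way to exercise the same validator from the command line.

import coremltools as ct

# Load the package that Xcode refuses to open; if this also fails,
# the package on disk is incomplete or unreadable.
model = ct.models.MLModel("MyModel.mlpackage")
print(model.get_spec().description)

# Write a freshly serialized copy of the package.
model.save("MyModel_resaved.mlpackage")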
CoreML MLModelErrorModelDecryption error
Somehow I'm not able to decrypt our ml models on my machine. It does not matter: If I clean the build / delete the build folder If it's a local build or a build downloaded from our build server I log in as a different user I reboot my system (15.4.1 (24E263)) I use a different network Re-generate the encryption keys. I'm the only one in my team confronted with this issue. Using the encrypted models works fine for everyone else. As soon as our application tries to load the bundled ml model the following error is logged and returned: Could not create persistent key blob for CD49E04F-1A42-4FBE-BFC1-2576B89EC233 : error=Error Domain=com.apple.CoreML Code=9 "Failed to generate key request for CD49E04F-1A42-4FBE-BFC1-2576B89EC233 with error: -42908" Error code 9 points to a decryption issue, but offers no useful pointers and suggests that some sort of network request needs to be made in order to decrypt our models. /*! Core ML throws/returns this error when the framework encounters an error in the model decryption subsystem. The typical cause for this error is in the key server configuration and the client application cannot do much about it. For example, a model loading method will throw/return the error when it uses incorrect model decryption key. */ MLModelErrorModelDecryption API_AVAILABLE(macos(11.0), ios(14.0), watchos(7.0), tvos(14.0)) = 9, I could not find a reference to error '-42908' anywhere. ChatGPT just lied to me, as usual... How can I resolve this or diagnose it further? Thanks.
3
0
248
May ’25
Crash inside of Vision predictWithCVPixelBuffer - Crashed: com.apple.VN.detectorSyncTasksQueue.VNCoreMLTransformer
Hello, We have been encountering a persistent crash in our application, which is deployed exclusively on iPad devices. The crash occurs in the following code block: let requestHandler = ImageRequestHandler(paddedImage) var request = CoreMLRequest(model: model) request.cropAndScaleAction = .scaleToFit let results = try await requestHandler.perform(request) The client using this code is wrapped inside an actor, following Swift concurrency principles. The issue has been consistently reproduced across multiple iPadOS versions, including: iPad OS - 18.4.0 iPad OS - 18.4.1 iPad OS - 18.5.0 This is the crash log - Crashed: com.apple.VN.detectorSyncTasksQueue.VNCoreMLTransformer 0 libobjc.A.dylib 0x7b98 objc_retain + 16 1 libobjc.A.dylib 0x7b98 objc_retain_x0 + 16 2 libobjc.A.dylib 0xbf18 objc_getProperty + 100 3 Vision 0x326300 -[VNCoreMLModel predictWithCVPixelBuffer:options:error:] + 148 4 Vision 0x3273b0 -[VNCoreMLTransformer processRegionOfInterest:croppedPixelBuffer:options:qosClass:warningRecorder:error:progressHandler:] + 748 5 Vision 0x2ccdcc __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_5 + 132 6 Vision 0x14600 VNExecuteBlock + 80 7 Vision 0x14580 __76+[VNDetector runSuccessReportingBlockSynchronously:detector:qosClass:error:]_block_invoke + 56 8 libdispatch.dylib 0x6c98 _dispatch_block_sync_invoke + 240 9 libdispatch.dylib 0x1b584 _dispatch_client_callout + 16 10 libdispatch.dylib 0x11728 _dispatch_lane_barrier_sync_invoke_and_complete + 56 11 libdispatch.dylib 0x7fac _dispatch_sync_block_with_privdata + 452 12 Vision 0x14110 -[VNControlledCapacityTasksQueue dispatchSyncByPreservingQueueCapacity:] + 60 13 Vision 0x13ffc +[VNDetector runSuccessReportingBlockSynchronously:detector:qosClass:error:] + 324 14 Vision 0x2ccc80 __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_4 + 336 15 Vision 0x14600 VNExecuteBlock + 80 16 Vision 0x2cc98c __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_3 + 256 17 libdispatch.dylib 0x1b584 _dispatch_client_callout + 16 18 libdispatch.dylib 0x6ab0 _dispatch_block_invoke_direct + 284 19 Vision 0x2cc454 -[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:] + 632 20 Vision 0x2cd14c __111-[VNDetector processUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke + 124 21 Vision 0x14600 VNExecuteBlock + 80 22 Vision 0x2ccfbc -[VNDetector processUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:] + 340 23 Vision 0x125410 __swift_memcpy112_8 + 4852 24 libswift_Concurrency.dylib 0x5c134 swift::runJobInEstablishedExecutorContext(swift::Job*) + 292 25 libswift_Concurrency.dylib 0x5d5c8 swift_job_runImpl(swift::Job*, swift::SerialExecutorRef) + 156 26 libdispatch.dylib 0x13db0 _dispatch_root_queue_drain + 364 27 libdispatch.dylib 0x1454c _dispatch_worker_thread2 + 156 28 libsystem_pthread.dylib 0x9d0 _pthread_wqthread + 232 29 libsystem_pthread.dylib 0xaac start_wqthread + 8 We found an issue similar to us - https://developer.apple.com/forums/thread/770771. But the crash logs are quite different, we believe this warrants further investigation to better understand the root cause and potential mitigation strategies. 
Please let us know if any additional information would help diagnose this issue.
3
0
422
Jul ’25
Core ML .mlpackage not found in bundle despite target membership and Copy Bundle Resources
Hi everyone, I'm working on an iOS app that uses a Core ML model to run live image recognition. I've run into a persistent issue with the mlpackage not being turned into a Swift class. The following error is in the code, and in carDetection.mlpackage it says that the model class has not been generated yet. The error in the code is as follows: What I've tried: Verified Target Membership is checked for carDetectionModel.mlpackage Confirmed the file is listed under Copy Bundle Resources (and removed from Compile Sources) Cleaned the build folder (Shift + Cmd + K) and rebuilt Renamed and re-added the .mlpackage file Restarted Xcode and re-added the file Logged bundle contents at runtime, but the .mlpackage still doesn't appear The mlpackage is in Copy Bundle Resources and is not in Compile Sources. I just don't know why a Swift class is not being generated for the mlpackage. Could someone please give me some guidance on what to do to resolve this issue? Sorry if my error is a bit naive, I'm pretty new to iOS app development.
3
0
583
Dec ’25
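One note on the setup described above: Xcode generates the Swift model class only when the .mlpackage is processed by the Core ML compiler, which happens when it is in Compile Sources rather than only in Copy Bundle Resources. If the raw package must stay a plain bundle resource, a workaround sketch is to compile and load it at runtime; the resource name below is a placeholder.

import CoreML

// Runtime fallback when no generated class exists: compile the raw .mlpackage
// bundled as a resource, then load the resulting .mlmodelc.
func loadCarDetectionModel() async throws -> MLModel {
    guard let packageURL = Bundle.main.url(forResource: "carDetectionModel",
                                           withExtension: "mlpackage") else {
        throw CocoaError(.fileNoSuchFile)
    }
    let compiledURL = try await MLModel.compileModel(at: packageURL)
    return try MLModel(contentsOf: compiledURL)
}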
Compatibility issue of TensorFlow-metal with PyArrow
Overview I'm experiencing a critical issue where TensorFlow-metal and PyArrow seem to be incompatible when installed together in the same environment. Whenever both packages are present, TensorFlow crashes and the kernel dies during execution. Environment Details macOS Version: 15.3.2 Mac Model: MacBook Pro M3 Max Python Version: 3.11 TensorFlow Version: 2.19 PyArrow Version: 19.0.0 Issue Description: When both TensorFlow-metal and PyArrow are installed in the same Python environment, any attempt to use TensorFlow results in immediate kernel crashes. The issue appears to be a compatibility problem between these two packages rather than a problem with either package individually. Steps to Reproduce Create a new Python environment: conda create -n tf-metal python=3.11 Install TensorFlow-metal: pip install tensorflow tensorflow-metal Install PyArrow: pip install pyarrow Run the following minimal example: import numpy as np import tensorflow as tf # Create a simple model model = tf.keras.Sequential([ tf.keras.layers.Input(shape=(2,)), tf.keras.layers.Dense(1) ]) model.compile(optimizer='adam', loss='mse') model.summary() # This works fine # Generate some dummy data X = np.random.random((100, 2)) y = np.random.random((100, 1)) # The crash happens exactly at this line model.fit(X, y, epochs=5, batch_size=32) # CRASH: Kernel dies here Result: Kernel crashes with no error message What I've Tried Reinstalling both packages in different orders Using different versions of both packages Creating isolated environments Checking system logs for additional error information The only workaround I've found is to use separate environments for each package, which isn't practical for my workflow as I need both libraries for my data processing and machine learning pipeline. Questions Has anyone else encountered this specific compatibility issue? Are there known workarounds that allow both packages to coexist? Is this a known issue that's being addressed in upcoming releases? Any insights, suggestions, or assistance would be greatly appreciated. I'm happy to provide any additional information that might help diagnose this problem. Thank you in advance for your help!
2
0
143
May ’25
A specific mlmodelc model runs on iPhone 15, but not on iPhone 16
As described in the title, the model I have built works completely on iPhone 15 / A16 Bionic; on the other hand, it does not run on iPhone 16 / A18 chip, with the following error message. E5RT encountered an STL exception. msg = MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED. E5RT: MILCompilerForANE error: failed to compile ANE model using ANEF. Error=_ANECompiler : ANECCompile() FAILED (11) It consumes 1.5 ~ 1.6 GB of RAM while loading the model, then consumption decreases to less than 100 MB on both iPhone 15 and 16. After that, only on iPhone 16, the above error is shown in the Xcode log, memory consumption surges to 5 to 6 GB, and the system kills the app. It works well only on iPhone 15. This model is built with Core ML Tools. So far, I have tried deployment targets from iOS 16 to 18 and the compute units CPU_AND_NE and ALL, but none of these has solved the issue. What kind of fix should I make? minimum_deployment_target = ct.target.iOS18 compute_units = ct.ComputeUnit.ALL compute_precision = ct.precision.FLOAT16
2
0
228
May ’25
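For context, a hedged coremltools sketch of how the settings quoted above plug into a conversion, plus reloading the same package restricted to CPU and GPU to check whether the failure is ANE-specific; the tiny network and shapes are placeholders.

import coremltools as ct
import torch

# Placeholder network standing in for the real model.
net = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU()).eval()
traced_model = torch.jit.trace(net, torch.rand(1, 3, 512, 512))

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="image", shape=(1, 3, 512, 512))],
    minimum_deployment_target=ct.target.iOS18,
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT16,
)
mlmodel.save("model.mlpackage")

# Reload restricted to CPU+GPU to see whether the graph itself runs off the ANE.
cpu_gpu_model = ct.models.MLModel("model.mlpackage",
                                  compute_units=ct.ComputeUnit.CPU_AND_GPU)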
Is there an API to check if a Core ML compiled model is already cached?
Hello Apple Developer Community, I'm investigating Core ML model loading behavior and noticed that even when the compiled model path remains unchanged after an app update, the first run still triggers an "uncached load" process. This seems to impact user experience with unnecessary delays. Question: Does Core ML provide any public API to check whether a compiled model (from a specific .mlmodelc path) is already cached in the system? If such an API exists, we'd like to use it for pre-loading decision logic, i.e. only perform a background pre-load when the model isn't cached. Has anyone encountered similar scenarios or found official solutions? Any insights would be greatly appreciated!
2
0
255
May ’25
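There does not appear to be a public cache-status API, so the following Swift sketch is only a pragmatic workaround under that assumption: pre-warm the model on a background queue and time the load, treating an unusually slow load as the uncached case. The threshold and path are placeholders.

import CoreML
import Foundation

// Background pre-warm: the load itself populates the cache, and its duration
// is a rough, unofficial signal of whether the model was already cached.
func prewarmModel(at modelcURL: URL) {
    DispatchQueue.global(qos: .utility).async {
        let start = CFAbsoluteTimeGetCurrent()
        guard (try? MLModel(contentsOf: modelcURL,
                            configuration: MLModelConfiguration())) != nil else { return }
        let seconds = CFAbsoluteTimeGetCurrent() - start
        // Arbitrary heuristic threshold; tune per model and device.
        print("Pre-warm took \(seconds)s, likely uncached: \(seconds > 1.0)")
    }
}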
Memory stride warning when loading CoreML models on ANE
When I am doing an uncached load of a CoreML model on the ANE, I receive this warning in the Xcode console: Type of hiddenStates in function main's I/O contains unknown strides. Using unknown strides for MIL tensor buffers with unknown shapes is not recommended in E5ML. Please use row_alignment_in_bytes property instead. Refer to https://e5-ml.apple.com/more-info/memory-layouts.html for more information. However, the web link does not seem to be working. Where can I find more information about this, and how can I fix it?
2
0
642
1w
Core ML model decryption on Intel chips
About the Core ML model encryption mentioned in: https://developer.apple.com/documentation/coreml/encrypting-a-model-in-your-app When I encrypt the model, if the machine has an M-series chip, the model loads perfectly. On the other hand, when I test the executable on an Intel MacBook, there is an error: Error Domain=com.apple.CoreML Code=9 "Operation not supported on this platform." UserInfo={NSLocalizedDescription=Operation not supported on this platform.} The Intel test machine is a 2019 MacBook Air with CPU: Intel i5-8210Y, OS: 14.7.6 23H626, with Apple T2 Security Chip. The encrypted model does load on M2 and M4 MacBook Air. If the model is NOT encrypted, it also loads on the Intel test machine. I did not find anything in the Core ML documentation that says whether encryption/decryption supports Intel chips. May I check whether decryption indeed does NOT support Intel chips?
2
1
423
Jan ’26
CoreML Inference Acceleration
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So I started 4 threads, each running inference simultaneously, but I found that the overall speed is the same as serial inference, and every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
2
0
710
Sep ’25
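One direction worth trying for the frame loop described above is Core ML's batch prediction API, sketched here in Swift; Core ML can schedule a batch more efficiently than app-level threads contending for the same model. Here, frames is assumed to be an array of already-prepared MLFeatureProvider inputs.

import CoreML

// Submit a whole batch of frame inputs at once instead of looping or
// spawning threads; the returned batch provider holds one output per input.
func runBatched(model: MLModel, frames: [MLFeatureProvider]) throws -> MLBatchProvider {
    let batch = MLArrayBatchProvider(array: frames)
    let options = MLPredictionOptions()
    return try model.predictions(from: batch, options: options)
}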
Unable to load a quantized Qwen 1.7B model on an iPhone SE 3
I am trying to benchmark and see if the Qwen3 1.7B model can run on an iPhone SE 3 [4 GB RAM]. My core problem is: even with weight quantization, the SE 3 is not able to load the model into memory. What I've tried: I am converting a Torch model to the Core ML format using coremltools. I have tried the following combinations of quantization and context length 8 bit + 1024 8 bit + 2048 4 bit + 1024 4 bit + 2048 All the above quantizations are done with dynamic shape with the default being [1,1] in the hope that the whole context length does not get allocated in memory The 4-bit model is approximately 865 MB on disk The 8-bit model is approximately 1.7 GB on disk During load: With the int4 quantization the memory spikes a lot during initial load. Could this be because many operations are converted to int8 or fp16, as Core ML does not perform operations natively on int4? With int8, on the profiler the memory does not go above 2 GB (only 900 MB) but it is still not able to load, as it shows the following error. 2 GB is the limit where jetsam kills the app on the iPhone SE 3 E5RT: Error(s) occurred compiling MIL to BNNS graph: [CreateBnnsGraphProgramFromMIL]: BNNS Graph Compile: failed to preallocate file with error: No space left on device for path: /var/mobile/Containers/Data/Application/5B8BB7D2-06A6-4BAE-A042-407B6D805E7C/Library/Caches/com.tss.qwen3-coreml/com.apple.e5rt.e5bundlecache/23A341/<long key>.tmp.12586_4362093968.bundle/H14.bundle/main/main_bnns/bnns_program.bnnsir Some online sources have suggested activation quantization, but I am unsure if that will have any impact on loading [as the spike is during load and not inference] The model spec also suggests that there is no dequantization happening (e.g. from 4-bit -> fp16) So I have a couple of queries: Has anyone faced similar issues? What could be the reasons for the temporary memory spike during LOAD? What are approaches that can be adopted to deal with this issue? Any help would be greatly appreciated. Thank you.
2
0
235
Mar ’26
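For reference, a hedged coremltools sketch of one compression path the post above alludes to (post-conversion 4-bit weight palettization); the starting mlpackage path is a placeholder, and this addresses weight size rather than, necessarily, the load-time spike being described.

import coremltools as ct
import coremltools.optimize.coreml as cto

# Start from an already-converted fp16 package (placeholder path).
mlmodel = ct.models.MLModel("qwen3_fp16.mlpackage")

# 4-bit k-means palettization applied to all weights.
op_config = cto.OpPalettizerConfig(mode="kmeans", nbits=4)
config = cto.OptimizationConfig(global_config=op_config)
compressed = cto.palettize_weights(mlmodel, config=config)
compressed.save("qwen3_int4_palettized.mlpackage")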
How can I change the output dimensions of a CoreML model in Xcode when the outputs come from a NonMaximumSuppression layer?
After exporting a custom model with nms=True, the outputs show in Xcode as: confidence: MultiArray (0 × 5) coordinates: MultiArray (0 × 4) I want to set fixed shapes (e.g., 100 × 5, 100 × 4), but Xcode does not allow editing; the shape fields are locked. The model graph shows both outputs come directly from a NonMaximumSuppression layer. Is it possible to set fixed output dimensions for NMS outputs in CoreML?
2
0
236
Mar ’26
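Since Xcode locks those fields, one hedged option is rewriting the declared output shapes in the model spec with coremltools, sketched below. Note that NMS output length genuinely varies per image, so this only changes the declared shape metadata and should be validated against how the model actually behaves at runtime; the path and shapes are placeholders.

import coremltools as ct

model = ct.models.MLModel("detector_nms.mlpackage")
spec = model.get_spec()

# Overwrite the declared shapes of the two NMS outputs (order: confidence, coordinates).
for output, shape in zip(spec.description.output, [(100, 5), (100, 4)]):
    del output.type.multiArrayType.shape[:]
    output.type.multiArrayType.shape.extend(shape)

ct.models.MLModel(spec, weights_dir=model.weights_dir).save("detector_fixed_shapes.mlpackage")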
tensorflow-metal ReLU activation fails to clip negative values on M4 Apple Silicon
Environment: Hardware: Mac M4 OS: macOS Sequoia 15.7.4 TensorFlow-macOS Version: 2.16.2 TensorFlow-metal Version: 1.2.0 Description: When using the tensorflow-metal plug-in for GPU acceleration on M4, the ReLU activation function (both as a layer and as an activation argument) fails to correctly clip negative values to zero. The same code works correctly when forced to run on the CPU. Reproduction Script: import os import numpy as np import tensorflow as tf # weights and biases = -1 weights = [np.ones((10, 5)) * -1, np.ones(5) * -1] # input = 1 data = np.ones((1, 10)) # comment this line => GPU => get negative values # uncomment this line => CPU => no negative values # tf.config.set_visible_devices([], 'GPU') # create model model = tf.keras.Sequential([ tf.keras.layers.Input(shape=(10,)), tf.keras.layers.Dense(5, activation='relu') ]) # set weights model.layers[0].set_weights(weights) # get output output = model.predict(data) # check if negative is present print(f"min value: {output.min()}") print(f"is negative present? {np.any(output < 0)}")
2
0
416
3w
Foundation Models Framework with specialized models
Hello folks! Taking a look at https://developer.apple.com/documentation/foundationmodels, it's not clear how to use other models there. Does anyone know if it's possible to use a trained model from outside (imported) in the Foundation Models framework? Thanks!
5
0
1.1k
Oct ’25
Slow inference speed after my Core ML model was encrypted
Hi friends, I have just found that the inference speed dropped to only 1/10 of the original model's. Has anyone encountered this? Thank you.
4
0
161
Apr ’25