@property (assign,nonatomic) long long experimentalMLE5EngineUsage; //@synthesize experimentalMLE5EngineUsage=_experimentalMLE5EngineUsage - In the implementation block
What is it, and why would disabling it fix NMS for an MLProgram?
Is there any way to signal this flag from model metadata? Is there any way to set or disable it from a global, system-level scope?
The issue is extremely easy to reproduce, but I do not know how to investigate the drastic regression that appears when toggling this flag:
let config = MLModelConfiguration()
config.setValue(1, forKey: "experimentalMLE5EngineUsage")
Core ML
Integrate machine learning models into your app using Core ML.
I have a question regarding hybrid execution for deep learning models on Apple's Neural Engine and CPU. I am aware that setting the precision of some layers to 32-bit allows hybrid execution across both the Neural Engine and the CPU. However, I would like to know if it is possible to achieve the same with 16-bit precision.
Is there any specific configuration or workaround to enable hybrid execution in this case? Any guidance or documentation references would be greatly appreciated.
Thank you!
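For reference, the 32-bit mechanism mentioned above is usually expressed at conversion time. The following is a minimal sketch, assuming coremltools with an ML Program target. `make_fp16_selector`, `convert_mixed_precision`, and the op types passed in are hypothetical names for illustration. Where each op actually runs (Neural Engine or CPU) is decided by Core ML at load time, so this only controls precision, not placement.

```python
from types import SimpleNamespace


def make_fp16_selector(keep_fp32_op_types):
    """Return an op selector: True means "cast this op to float16"."""
    keep = frozenset(keep_fp32_op_types)

    def selector(op):
        # coremltools passes each MIL operation; only op.op_type is used here.
        return op.op_type not in keep

    return selector


def convert_mixed_precision(traced_model, inputs, keep_fp32_op_types):
    # Imported lazily so the selector above can be tried without coremltools.
    import coremltools as ct

    return ct.convert(
        traced_model,
        inputs=inputs,
        convert_to="mlprogram",
        compute_precision=ct.transform.FP16ComputePrecision(
            op_selector=make_fp16_selector(keep_fp32_op_types)
        ),
    )


# Stand-in ops to show what the selector decides (not real MIL ops).
selector = make_fp16_selector({"softmax"})
print(selector(SimpleNamespace(op_type="conv")))     # True: cast to float16
print(selector(SimpleNamespace(op_type="softmax")))  # False: stays float32
```

Whether an all-16-bit model can be split across the Neural Engine and CPU in the same way is the open question in this post; the sketch does not answer it.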
Hi everyone,
I'm working on an iOS app that uses VisionKit and I'm exploring the .visualLookUp feature. Specifically, I want to extract the detailed information that Visual Look Up provides after identifying an object in an image (e.g., if the object is a flower, retrieve its name; if it’s a clothing tag, get the tag's content).
The Core ML developer guide recommends saving reusable compiled Core ML models to a permanent location to avoid unnecessary rebuilds when creating a Core ML model instance.
However, there is no location that remains consistent across app updates, since each update changes the UUID associated with the app’s resources path:
/var/mobile/Containers/Data/Application/<UUID>/Library/Application Support/
As a result, Core ML rebuilds models even if they are unchanged and located in the same relative directory within the app’s file structure.
I am trying to create a Pipeline with 3 sub-models: a Feature Vectorizer -> a NN regressor converted from PyTorch -> a Feature Extractor (to convert the output tensor to a Double value).
The pipeline works fine when I use just a Vectorizer and an Extractor. This is the code:
vectorizer = models.feature_vectorizer.create_feature_vectorizer(
input_features=["windSpeed", "theoreticalPowerCurve", "windDirection"], # Multiple input features
preProc_spec = vectorizer[0]
extractor = models.array_feature_extractor.create_array_feature_extractor(
input_features=[("input",datatypes.Array(3,))], # Multiple input features
extract_indices = 1
pipeline_network = pipeline.PipelineRegressor (
input_features = ["windSpeed", "theoreticalPowerCurve", "windDirection"],
This model works OK. I also created a regression NN using PyTorch and converted it to Core ML:
import torch
import torch.nn as nn

class TurbinePowerModel(nn.Module):
    def __init__(self):
        super().__init__()  # required before registering submodules
        self.linear1 = nn.Linear(3, 4)
        self.activation1 = nn.ReLU()
        #self.linear2 = nn.Linear(5, 4)
        #self.activation2 = nn.ReLU()
        self.output = nn.Linear(4, 1)

    def forward(self, x):
        #x = F.normalize(x, dim = 0)
        x = self.linear1(x)
        x = self.activation1(x)
        # x = self.linear2(x)
        # x = self.activation2(x)
        x = self.output(x)
        return x

    def forward_inference(self, windSpeed, theoreticalPowerCurve, windDirection):
        # Pack the three scalar features in the order the network expects.
        input_tensor = torch.tensor([windSpeed,
                                     theoreticalPowerCurve,
                                     windDirection], dtype=torch.float32)
        return self.forward(input_tensor)
model = torch.load('TurbinePowerRegression-1layer.pt', weights_only=False)
import coremltools as ct
import pandas as pd
from sklearn.preprocessing import StandardScaler
df = pd.read_csv('T1_clean.csv',delimiter=';')
X = df[['WindSpeed','TheoreticalPowerCurve','WindDirection']]
y = df[['ActivePower']]
scaler = StandardScaler()
X = scaler.fit_transform(X)
y = scaler.fit_transform(y)
X_tensor = torch.tensor(X, dtype=torch.float32)
y_tensor = torch.tensor(y, dtype=torch.float32)
traced_model = torch.jit.trace(model, X_tensor[0])
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input", shape=X_tensor[0].shape)],
    classifier_config=None  # Optional, for classification tasks
)
This model has a Multiarray(Float 32 3) as input and a Multiarray(Float32 1) as output.
When I try to include it in the middle of the pipeline (adjusting the output and input types of the other models accordingly), the process runs OK, but I get the following error when opening the generated model in Xcode:
What is missing from the models? How can I set or adjust this metadata properly?
I have set SWIFT_UPCOMING_FEATURE_EXISTENTIAL_ANY at Build Settings > Swift Compiler - Upcoming Features to true to support this existential any proposal.
The following errors then appear in the MLModel class, but this is an auto-generated file, so I don't know how to deal with them.
Use of protocol 'MLFeatureProvider' as a type must be written 'any MLFeatureProvider'
Use of protocol 'Error' as a type must be written 'any Error'
Xcode 16.0
Xcode 16.1 Beta 2
What I tried
Delete cache of DerivedData and regenerate MLModel class files
I also tried using DepthAnythingV2SmallF16P6.mlpackage to verify if there is a problem with my mlmodel
I tried the above after setting up Swift 6 in Xcode
I also used coremlc to generate the MLModel class files with Swift 6 specified on the command line.
I use SoundAnalysis to analyze background sounds and have enabled background permissions. It worked well on previous iOS versions, but in the new iOS 18 beta a warning appears and sound analysis stops.
Warning List:
Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU work from background)
[Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted); code=7 status=-1
Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
CoreML prediction failed with Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 0 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 0 in pipeline, NSUnderlyingError=0x30330e910 {Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 1 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 1 in pipeline, NSUnderlyingError=0x303307840 {Error Domain=com.apple.CoreML Code=0 "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1)." UserInfo={NSLocalizedDescription=Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).}}}}}
In our app we use Core ML. Ever since macOS 15.x was released, we have been getting a large number of crashes like this:
Incident Identifier: 424041c3-884b-4e50-bb5a-429a83c3e1c8
CrashReporter Key: B914246B-1291-4D44-984D-EDF84B52310E
Hardware Model: Mac14,12
Process: <REMOVED> [1509]
Path: /Applications/<REMOVED>
Identifier: com.<REMOVED>
Version: <REMOVED>
Code Type: arm64
Parent Process: launchd [1]
Date/Time: 2024-11-13T13:23:06.999Z
Launch Time: 2024-11-13T13:22:19Z
OS Version: Mac OS X 15.1.0 (24B83)
Report Version: 104
Exception Type: SIGABRT
Exception Codes: #0 at 0x189042600
Crashed Thread: 36
Thread 36 Crashed:
0 libsystem_kernel.dylib 0x0000000189042600 __pthread_kill + 8
1 libsystem_c.dylib 0x0000000188f87908 abort + 124
2 libsystem_c.dylib 0x0000000188f86c1c __assert_rtn + 280
3 Metal 0x0000000193fdd870 MTLReportFailure.cold.1 + 44
4 Metal 0x0000000193fb9198 MTLReportFailure + 444
5 MetalPerformanceShadersGraph 0x0000000222f78c80 -[MPSGraphExecutable initWithMPSGraphPackageAtURL:compilationDescriptor:] + 296
6 Espresso 0x00000001a290ae3c E5RT::SharedResourceFactory::GetMPSGraphExecutable(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, NSDictionary*) + 932
43 CoreML 0x0000000192d263bc -[MLModelAsset modelWithConfiguration:error:] + 120
44 CoreML 0x0000000192da96d0 +[MLModel modelWithContentsOfURL:configuration:error:] + 176
45 <REMOVED> 0x000000010497b758 -[<REMOVED> <REMOVED>] (<REMOVED>)
No similar crashes on macOS 12-14!
Any clue what is causing this?
Thanks! :)
I used the multifunction models feature introduced in iOS 18 to merge three VAE Encoder models with different resolutions into a single model. However, loading this merged model on iOS causes a crash with the error EXC_BAD_ACCESS (code=1, address=0x0). In contrast, merging VAE Decoder models using the same method does not result in crashes. Additionally, merging only two VAE Decoder models with different resolutions also leads to a crash when loaded on iOS. As for the Stable Diffusion Unet model, merging two or even three models does not cause any crashes, and it successfully generates images as expected.
I use the following code to load the model:
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine
config.functionName = "test"
try MLModel(contentsOf: url, configuration: config)
I am using the Depth Anything V2 model provided by Apple on the developer website. On my iPhone 15 Pro, if I choose all or cpuAndNeuralEngine, it gets stuck loading the model.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU // works normally when the Neural Engine is not used
let model = try await DepthModel.load(configuration: config)
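With the Neural Engine enabled, it fails with the following error:
E5RT encountered an STL exception. msg = MILCompilerForANE error: failed to compile ANE model using ANEF. Error=Couldn't communicate with a helper application..
E5RT: MILCompilerForANE error: failed to compile ANE model using ANEF. Error=Couldn't communicate with a helper application. (11)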
We use MLModel in our app with two file formats: mlmodel and mlpackage. We find that models in the mlmodel format occasionally crash when the model object is released. The majority of these crashes (over 85%) occur on iOS 16.x. Here is the crash stack:
Exception Type: SIGTRAP
Exception Codes: TRAP_BRKPT at 0x1b48e855c
Crashed Thread: 5
Thread 5 Crashed:
0 libdispatch.dylib 0x00000001b48e855c _dispatch_semaphore_dispose.cold.1 + 40
1 libdispatch.dylib 0x00000001b48b2b28 _dispatch_semaphore_signal_slow
2 libdispatch.dylib 0x00000001b48b0e58 _dispatch_dispose + 208
3 AppleNeuralEngine 0x00000001ef07b51c -[_ANEProgramForEvaluation .cxx_destruct] + 32
4 libobjc.A.dylib 0x00000001a67ed4a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
5 libobjc.A.dylib 0x00000001a67f221c objc_destructInstance + 80
6 libobjc.A.dylib 0x00000001a67fb9d0 _objc_rootDealloc + 80
7 AppleNeuralEngine 0x00000001ef079e04 -[_ANEProgramForEvaluation dealloc] + 72
8 AppleNeuralEngine 0x00000001ef07ca70 -[_ANEModel .cxx_destruct] + 44
9 libobjc.A.dylib 0x00000001a67ed4a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
10 libobjc.A.dylib 0x00000001a67f221c objc_destructInstance + 80
11 libobjc.A.dylib 0x00000001a67fb9d0 _objc_rootDealloc + 80
12 AppleNeuralEngine 0x00000001ef07bd7c -[_ANEModel dealloc] + 136
13 CoreFoundation 0x00000001ad4563cc cow_cleanup + 168
14 CoreFoundation 0x00000001ad49044c -[__NSDictionaryM dealloc] + 148
15 Espresso 0x00000001bb19c7a4 Espresso::ANERuntimeEngine::compiler::reset() + 1340
16 Espresso 0x00000001bb19cac8 Espresso::ANERuntimeEngine::compiler::~compiler() + 108
17 Espresso 0x00000001bacd69e4 std::__1::__shared_weak_count::__release_shared() + 84
18 Espresso 0x00000001ba944d00 std::__1::__hash_table<std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::__unordered_map_hasher<Espresso::platform, std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::hash<Espresso::platform>, std::__1::equal_to<Espresso::platform>, true>, std::__1::__unordered_map_equal<Espresso::platform, std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::equal_to<Espresso::platform>, std::__1::hash<Espresso::platform>, true>, std::__1::allocator<std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>>>::__deallocate_node(std::__1::__hash_node_base<std::__1::__hash_node<std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, void*>*>*) + 40
19 Espresso 0x00000001ba8ea640 std::__1::__hash_table<std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::__unordered_map_hasher<Espresso::platform, std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::hash<Espresso::platform>, std::__1::equal_to<Espresso::platform>, true>, std::__1::__unordered_map_equal<Espresso::platform, std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>, std::__1::equal_to<Espresso::platform>, std::__1::hash<Espresso::platform>, true>, std::__1::allocator<std::__1::__hash_value_type<Espresso::platform, std::__1::shared_ptr<Espresso::net_compiler>>>>::~__hash_table() + 28
20 Espresso 0x00000001ba8e5750 Espresso::net::~net() + 396
21 Espresso 0x00000001bacd69e4 std::__1::__shared_weak_count::__release_shared() + 84
22 Espresso 0x00000001bad750e4 std::__1::__vector_base<std::__1::shared_ptr<Espresso::net>, std::__1::allocator<std::__1::shared_ptr<Espresso::net>>>::clear() + 52
23 Espresso 0x00000001ba902448 std::__1::__vector_base<std::__1::shared_ptr<Espresso::net>, std::__1::allocator<std::__1::shared_ptr<Espresso::net>>>::~__vector_base() + 36
24 Espresso 0x00000001ba8ed99c std::__1::unique_ptr<EspressoLight::espresso_plan::priv_t, std::__1::default_delete<EspressoLight::espresso_plan::priv_t>>::reset(EspressoLight::espresso_plan::priv_t*) + 188
25 Espresso 0x00000001ba95b7fc EspressoLight::espresso_plan::~espresso_plan() + 72
26 Espresso 0x00000001ba902078 EspressoLight::espresso_plan::~espresso_plan() + 16
27 Espresso 0x00000001ba8e690c espresso_plan_destroy + 372
28 CoreML 0x00000001c48c45cc -[MLNeuralNetworkEngine _deallocContextAndPlan] + 40
29 CoreML 0x00000001c48c43bc -[MLNeuralNetworkEngine dealloc] + 40
30 libobjc.A.dylib 0x00000001a67ed4a4 object_cxxDestructFromClass(objc_object*, objc_class*) + 116
31 libobjc.A.dylib 0x00000001a67f221c objc_destructInstance + 80
32 libobjc.A.dylib 0x00000001a67fb9d0 _objc_rootDealloc + 80
~~~~ Our code that releases the MLModel object ~~~~
Moreover, we use a synchronization mechanism to ensure that the release of the MLModel and the data processing of the model (by calling [model predictionFromFeatures]) do not occur simultaneously. What could be the possible causes of the problem, and how can we prevent it from happening? Any advice would be appreciated.
I've adapted fast.ai's Cat vs Dog model into a model that distinguishes lemons from limes, and it all works fine in a notebook.
I am now looking to convert this model to Core ML for my iOS app using TorchScript and Apple's official guidelines for coremltools.
The model converts, but I cannot see the Preview tab in Xcode. Has anyone tried converting a model like this to Core ML? I guess my input types do not match what coremltools expects for the preview, but I am stuck. Here is my code:
import torch
import coremltools as ct
from fastai.vision.all import *
import json
from torchvision import transforms
# Load your Fastai model (replace with your actual path)
learn = load_learner('lemonmodel.pkl')
# Example input image (you can use any image from your dataset)
input_image = PILImage.create('example.jpg')
# Preprocess the image (assuming you used these transforms during training)
to_tensor = transforms.ToTensor()
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
input_tensor = to_tensor(input_image)
input_tensor = normalize(input_tensor) # Apply normalization
# Add a batch dimension
input_tensor = input_tensor.unsqueeze(0)
# Ensure float32 type
input_tensor = input_tensor.float()
# Trace the model
trace = torch.jit.trace(learn.model, input_tensor)
# Define the Core ML input type (considering your model's input shape)
_input = ct.ImageType(
bias=[-0.485/0.229, -0.456/0.224, -0.406/0.225],
# Convert the model to Core ML format
mlmodel = ct.convert(
minimum_deployment_target=ct.target.iOS14 # Optional, set deployment target
# Set model type as 'imageClassifier' for the Preview tab
mlmodel.type = 'imageClassifier'
# Correct structure for preview parameters (assuming two classes: 'lemon' and 'lime')
labels_json = {
    "imageClassifier": {
        "labels": ["lemon", "lime"],
        "input": {
            "shape": list(input_tensor.shape),  # Provide the actual input shape
            "mean": [0.485, 0.456, 0.406],  # Match normalization mean
            "std": [0.229, 0.224, 0.225]  # Match normalization std
        },
        "output": {
            "shape": [1, 2]  # Output shape for your model (2 classes)
        }
    }
}
# Setting up the metadata with correct 'preview' params
mlmodel.user_defined_metadata['com.apple.coreml.model.preview.params'] = json.dumps(labels_json)
# Save the model as .mlmodel
My model is :
Input batch shape: torch.Size([32, 3, 192, 192])
Labels batch shape: torch.Size([32])
Validation Loss: None, Validation Metric: None
Predictions shape: torch.Size([63, 2])
Targets shape: torch.Size([63])
Code for the model :
searches = 'lemon','lime'
path = Path('lemon_or_not')
for o in searches:
    dest = (path/o)
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'{o} photo'))
    resize_images(path/o, max_size=400, dest=path/o)
dls = DataBlock(
blocks=(ImageBlock, CategoryBlock),
splitter=RandomSplitter(valid_pct=0.2, seed=42),
item_tfms=[Resize(192, method='squish')]
).dataloaders(path, bs=32)
learn = vision_learner(dls, resnet18, metrics=error_rate)
is_lemon,_,probs = learn.predict(PILImage.create('lemon.jpg'))
print(f"This is a: {is_lemon}.")
print(f"Probability it's a lemon: {probs[0]:.4f}")
This is a: lemon.
Probability it's a lemon: 1.0000
I am stuck as to why it doesn't show the Preview tab.
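One detail in the conversion code that can be checked by hand is the image preprocessing. The `bias` values in the post equal -mean/std, while the matching `scale` is not visible in the snippet. The sketch below assumes that `ct.ImageType` applies `scale * pixel + bias` to raw 0-255 pixel values with a single scalar `scale`, so the per-channel std can only be approximated by its average. `image_type_scale_bias` is a hypothetical helper name.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def image_type_scale_bias(mean, std):
    """Approximate torchvision Normalize with a scalar scale and per-channel bias.

    torchvision:  y = (x / 255 - mean_c) / std_c
    ImageType:    y = scale * x + bias_c   (x is the raw 0-255 pixel value)
    """
    avg_std = sum(std) / len(std)  # a single scale is shared by all channels
    scale = 1.0 / (255.0 * avg_std)
    bias = [-m / s for m, s in zip(mean, std)]
    return scale, bias


scale, bias = image_type_scale_bias(IMAGENET_MEAN, IMAGENET_STD)
print(scale, bias)
```

These two values would be passed as `scale=` and `bias=` when constructing the image input. This only addresses preprocessing; it does not by itself explain the missing Preview tab.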
I'm working on a cross-platform AI app. It is a CMake project. The inference part should be built as a separate library on Windows and macOS. On macOS it should be built with Objective-C and Core ML.
Here are my steps, roughly:
Create an Xcode project for Core ML inference and build it as a static library. Models are compiled to ".mlmodelc", and the code is compiled to a binary ".a" library.
Create a CMake project for the app, and use the ".a" library built by Xcode.
Run the app.
I initialize the Core ML model like this (just for demonstration):
#import "det.h" // the model header generated by Xcode
NSError *error = nil; // receives any initialization failure
auto url = [[NSURL alloc] initFileURLWithPath:[NSString stringWithFormat:@"%@/%@", dir, @"det.mlmodelc"]];
auto model = [[det alloc] initWithContentsOfURL:url error:&error]; // no error
The URL is valid, and the initialization doesn't report any error. However, when I try to run inference using code like this:
auto cvPixelBuffer = createCVPixelBuffer(960, 960); // util function
auto preds = [model predictionFromImage:cvPixelBuffer error:NULL];
The output preds is nil, and I get these errors:
2024-12-10 14:52:37.678201+0800 望言OCR[50204:5615023] [e5rt] E5RT encountered unknown exception.
2024-12-10 14:52:37.678237+0800 望言OCR[50204:5615023] [coreml] E5RT: E5RT encountered an unknown exception. (11)
2024-12-10 14:52:37.870739+0800 望言OCR[50204:5615023] H11ANEDevice::H11ANEDeviceOpen kH11ANEUserClientCommand_DeviceOpen call failed result=0xe00002e2
2024-12-10 14:52:37.870758+0800 望言OCR[50204:5615023] Device Open failed - status=0xe00002e2
2024-12-10 14:52:37.870760+0800 望言OCR[50204:5615023] (Single-ANE System) Critical Error: Could not open the only H11ANE device
2024-12-10 14:52:37.870769+0800 望言OCR[50204:5615023] H11ANEDeviceOpen failed: 0x17
2024-12-10 14:52:37.870845+0800 望言OCR[50204:5615023] H11ANEDevice::H11ANEDeviceOpen kH11ANEUserClientCommand_DeviceOpen call failed result=0xe00002e2
2024-12-10 14:52:37.870848+0800 望言OCR[50204:5615023] Device Open failed - status=0xe00002e2
2024-12-10 14:52:37.870849+0800 望言OCR[50204:5615023] (Single-ANE System) Critical Error: Could not open the only H11ANE device
2024-12-10 14:52:37.870853+0800 望言OCR[50204:5615023] H11ANEDeviceOpen failed: 0x17
2024-12-10 14:52:37.870857+0800 望言OCR[50204:5615023] [common] start: ANEDeviceOpen() failed : ret=23 :
It seems that Core ML failed to find the ANE device. Is there anything that needs to be done before using a Core ML model as a library in a CMake or other non-Xcode project?
By the way, code like the above works in a native Xcode app with Core ML (I tested this before). So I guess I missed some environment initialization in my non-Xcode project?
As shown in the course, I created the PyTorch sample model and want to export/convert it to a Core ML model for iOS using coremltools. The input is a 224x224 image and the output is an image classification (3 different classes).
I am using coremltools with this code:
import coremltools as ct
modelml = ct.convert(
I have working iOS app code that runs correctly with another model, which was created using Microsoft Azure Vision.
The PyTorch exported model is loaded and a prediction is performed, but I am getting this error:
Foundation.MonoTouchException: Objective-C exception thrown. Name: NSInvalidArgumentException Reason: -[VNCoreMLFeatureValueObservation identifier]: unrecognized selector sent to instance 0x2805dd3b0
When I check the exported model with Xcode and compare it with another model which is working with the sample iOS App code (created and exported from Microsoft Azure) I can see that the input (for image classification using the device camera) seems ok and is equal, but the output is totally different. (see screenshots)
The working model has two outputs:
loss => Dictionary (String => Double)
classLabel => String
My model exported with coremltools has just one output:
MultiArray(Float32) (name var_1620, I think this is the last feature layer output of the EfficientNetB2)
How do I change my model or my coremltools export to get the correct output for the prediction?
I read the coreml documentation (https://coremltools.readme.io/docs/pytorch-conversion) and tried some GitHub samples.
But I never get the correct output.
How do I export the PyTorch model so that the output is correct and the prediction will work ?
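The two outputs of the working model (a label string plus a label-to-probability dictionary) are what coremltools produces when the model is converted as a classifier. A minimal sketch, assuming a traced model whose final layer already outputs class probabilities (for example via a softmax); `convert_image_classifier`, the input name "image", and the labels in the usage comment are hypothetical.

```python
def convert_image_classifier(traced_model, class_labels, input_shape=(1, 3, 224, 224)):
    """Convert a traced PyTorch classifier so Core ML emits a label and probabilities."""
    # Imported lazily; requires coremltools to be installed.
    import coremltools as ct

    return ct.convert(
        traced_model,
        inputs=[ct.ImageType(name="image", shape=input_shape)],
        # Marks the model as a classifier and attaches the label list.
        classifier_config=ct.ClassifierConfig(list(class_labels)),
        convert_to="mlprogram",
    )


# Usage (hypothetical labels and file name):
# mlmodel = convert_image_classifier(traced, ["classA", "classB", "classC"])
# mlmodel.save("Classifier.mlpackage")
```

With a classifier output, Vision should return classification observations rather than `VNCoreMLFeatureValueObservation`, which is the type named in the exception above. Any image scaling or normalization the PyTorch model expects still has to be supplied on the `ImageType`.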
My app was rejected because of this error below but I cannot find any documentation on a key related to Image Playground. My app is set to minimum of 18.2 already.
Rejection Message:
The UIRequiredDeviceCapabilities key in the Info.plist is set in such a way that the app will not install on iPhone running iOS 18.1.1
Next Steps
To resolve this issue, check the UIRequiredDeviceCapabilities key to verify that it contains only the attributes required for the app features or the attributes that must not be present on the device. Attributes specified by a dictionary should be set to true if they are required and false if they must not be present on the device.
Learn more about the UIRequiredDeviceCapabilities key.
I am exploring real-time object detection, and its replacement/overlay with another shape, on live video streams for an iOS app using Core ML and Vision frameworks. My target is to achieve high-speed, real-time detection without noticeable latency, similar to what’s possible with PageFault handling and Associative Caching in OS, but applied to video processing.
Given that this requires consistent, real-time model inference, I’m curious about how well the Neural Engine or GPU can handle such tasks on A-series chips in iPhones versus M-series chips (specifically M1 Pro and possibly M4) in MacBooks. Here are a few specific points I’d like insight on:
Hardware Suitability: How feasible is it to perform real-time object detection with Core ML on the Neural Engine (i.e., can it maintain low latency)? Would the M-series chips (e.g., M1 Pro or newer) offer a tangible benefit for this type of task compared to the A-series in mobile devices? Which A- and M-series chips would be the minimum feasible recommendation for such a task?
Performance Expectations: For continuous, live video object detection, what would be the expected frame rate or latency using an optimized Core ML model? Has anyone benchmarked such applications, and is the M-series required to achieve smooth, real-time processing?
Differences Across Apple Hardware: How does performance scale between the A-series Neural Engine and M-series GPU and Neural Engine? Is the M-series vastly superior for real-time Core ML tasks like object detection on live video feeds?
If anyone has attempted live object detection on these chips, any insights on real-time performance, limitations, or optimizations would be highly appreciated.
Please refer: Apple APIs
Thank you in advance for your help!
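The frame-rate question above can be framed as simple budget arithmetic before any benchmarking. This is a minimal sketch, assuming one inference runs at a time and frames that arrive during an inference are dropped; the latencies used are made-up placeholders, not measurements of any chip.

```python
import math


def frame_budget_ms(camera_fps):
    """Time available per frame if every camera frame is to be processed."""
    return 1000.0 / camera_fps


def sustained_fps(inference_ms, camera_fps):
    """Frames per second actually processed when late frames are dropped."""
    # Number of camera frame intervals consumed by one inference (at least 1).
    frames_per_inference = max(1, math.ceil(inference_ms * camera_fps / 1000.0))
    return camera_fps / frames_per_inference


print(frame_budget_ms(30))      # about 33.3 ms per frame at 30 fps
print(sustained_fps(20.0, 30))  # 30.0: inference fits inside the budget
print(sustained_fps(50.0, 30))  # 15.0: every second frame is dropped
```

Measuring the real per-frame inference time on the target device and plugging it in gives a first estimate of whether a given chip keeps up with the camera.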
When building the MLModel, it is configured to use the NPU. The GPU appears to be used during inference, but the crash occurs during compilation.
The stack is as follows:
We are experiencing a major issue with the native .version1 of the SoundAnalysis framework in iOS 18, which has left all our users without recordings. Our core feature relies heavily on sound analysis in the background, and it worked flawlessly in prior iOS versions. However, in the new iOS 18, sound analysis stops working in the background, triggering a critical warning.
Details of the issue:
We are using SoundAnalysis to analyze background sounds and have enabled the necessary background permissions.
We are using the latest Xcode
A warning now appears, and sound analysis fails in the background. Below is the warning message we are encountering:
Warning Message:
Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU work from background)
[Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted); code=7 status=-1
Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
CoreML prediction failed with Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 0 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 0 in pipeline, NSUnderlyingError=0x30330e910 {Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 1 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 1 in pipeline, NSUnderlyingError=0x303307840 {Error Domain=com.apple.CoreML Code=0 "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1)." UserInfo={NSLocalizedDescription=Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).}}}}}
We urgently need guidance or a fix for this, as our application’s main functionality is severely impacted by this background permission error. Please let us know the next steps or if this is a known issue with iOS 18.
I'm trying to set up Facebook AI's "Segment Anything" MLModel to compare its performance and efficacy on-device against the Vision library's Foreground Instance Mask Request.
The Vision request accepts any reasonably-sized image for processing, and then has a method to produce an output at the same resolution as the input image. Conversely, the MLModel for Segment Anything accepts a 1024x1024 image for inference and outputs a 1024x1024 image for output.
What is the best way to work with non-square images, such as 4:3 camera photos? I can basically think of 3 methods for accomplishing this:
Scale the image to 1024x1024, ignoring aspect ratio, then inversely scale the output back to the original size. However, I have a big concern that squashing the content will result in poor inference results.
Scale the image, preserving its aspect ratio so its minimum dimension is 1024, then run the model multiple times on a sliding 1024x1024 window and then aggregating the results. My main concern here is the complexity of de-duping the output, when each run could make different outputs based on how objects are cropped.
Fit the image within 1024x1024 and pad with black pixels to make a square. I'm not sure if the border will muck up the inference.
Anyway, this seems like it must be a well-solved problem in ML, but I'm having difficulty finding an authoritative best practice.
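The third option (fit and pad) mostly comes down to bookkeeping: remember the scale and padding so the 1024x1024 output can be cropped and mapped back. A minimal sketch of that geometry, assuming centered padding; `letterbox` and `to_original` are hypothetical helper names, and the 4032x3024 size is just an example 4:3 photo.

```python
def letterbox(width, height, target=1024):
    """Geometry for fitting an image inside a target x target square with padding."""
    scale = target / max(width, height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Center the scaled image; the remainder is filled with padding pixels.
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)


def to_original(x, y, scale, pad):
    """Map a point in the padded square back to original image coordinates."""
    return (x - pad[0]) / scale, (y - pad[1]) / scale


scale, size, pad = letterbox(4032, 3024)
print(size, pad)                          # (1024, 768) (0, 128)
print(to_original(512, 512, scale, pad))  # roughly the image center
```

To recover a full-resolution mask, the output would be cropped to the `size` rectangle at offset `pad` and then resized by `1 / scale`. Whether the padded border affects inference quality still has to be checked empirically for the specific model.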
I'm finding that the model gives very jagged edges. This may be due to the output resolution: Grayscale16Half 518 × 392.
I have tried to re-convert this model on Colab but have not had much luck, as this is very much out of my comfort zone. Has anyone else dealt with this? The model would be perfect if I could just overcome this issue.