Hi all,
I am trying to build a new iOS app by following https://developer.apple.com/videos/play/wwdc2024/10163/?time=67
When I try to remove all of the legacy VN calls I get errors. I would appreciate it if someone could help me get up to speed with the new Vision API.
"On the latest iOS 18 beta 2, the OCR API,the Translate App and Live Text performs very poorly in recognizing Japanese."
I have several CoreML models that I've set up to run in sequence where one of the outputs from each model is passed as one of the inputs to the next.
For the most part, there is very little overhead between each sub-model "chunk".
However, a couple of the models (e.g., the first two) spend a noticeable amount of time in "Prepare Neural Engine Request". From Instruments, it seems this time is spent doing some sort of model loading.
Given that I'm calling these models in sequence and in a fixed order, is there some way to reduce or amortize this cost? Thanks!
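For context, the chaining is essentially the following (a Python sketch with placeholder package names, feature names, and shapes; on-device I make the equivalent Core ML prediction calls):
import coremltools as ct
import numpy as np

# Sketch: each chunk's output dictionary feeds the next chunk's input
# (package names, feature names, and the shape are placeholders).
chunks = [ct.models.MLModel(p) for p in ("chunk1.mlpackage", "chunk2.mlpackage")]

features = {"x": np.zeros((1, 128), dtype=np.float32)}
for chunk in chunks:
    out = chunk.predict(features)
    features = {"x": out["y"]}  # one output becomes the next chunk's input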
I was trying the latest coremltools 8.0b1 beta on the macOS 15 beta, intending to try the new stateful models API in Core ML.
But the conversion would always fail with the error:
/AppleInternal/Library/BuildRoots/<snip>/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:162: failed assertion `Error: the minimum deployment target for macOS is 14.0.0'
Here's a minimal repro, which works fine with both the stable version of coremltools (7.2) and the beta version (8.0b1) on macOS Sonoma 14.5, but fails with both versions on the macOS 15.0 beta with Xcode 16.0 beta. That suggests this most likely isn't an issue with coremltools, but with the native compilation toolchain.
from collections import OrderedDict
import coremltools as ct
import numpy as np
import torch
import torch.nn as nn
class ResidualAttentionBlock(nn.Module):
def __init__(self, d_model: int, n_head: int, attn_mask: torch.Tensor = None):
super().__init__()
self.attn = nn.MultiheadAttention(d_model, n_head)
self.ln_1 = nn.LayerNorm(d_model)
self.mlp = nn.Sequential(
OrderedDict(
[
("c_fc", nn.Linear(d_model, d_model * 4)),
("gelu", nn.GELU()),
("c_proj", nn.Linear(d_model * 4, d_model)),
]
)
)
self.ln_2 = nn.LayerNorm(d_model)
self.attn_mask = attn_mask
def attention(self, x: torch.Tensor):
self.attn_mask = (
self.attn_mask.to(dtype=x.dtype, device=x.device)
if self.attn_mask is not None
else None
)
return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
def forward(self, x: torch.Tensor):
x = x + self.attention(self.ln_1(x))
x = x + self.mlp(self.ln_2(x))
return x
class Transformer(nn.Module):
def __init__(
self, width: int, layers: int, heads: int, attn_mask: torch.Tensor = None
):
super().__init__()
self.width = width
self.layers = layers
self.resblocks = nn.Sequential(
*[ResidualAttentionBlock(width, heads, attn_mask) for _ in range(layers)]
)
def forward(self, x: torch.Tensor):
return self.resblocks(x)
transformer = Transformer(width=512, layers=12, heads=8)
emb_tokens = torch.rand((1, 512))
ct_model = ct.convert(
torch.jit.trace(transformer.eval(), emb_tokens),
convert_to="mlprogram",
minimum_deployment_target=ct.target.macOS14,
inputs=[ct.TensorType(name="embIn", shape=[1, 512])],
outputs=[ct.TensorType(name="embOutput", dtype=np.float32)],
)
I want to try a Core ML model that accepts image input at any resolution.
So I wrote the model following the Core ML Tools "Set the Range for Each Dimension" sample code, modified as below:
import coremltools as ct
import torch

# Trace the model with random input.
example_input = torch.rand(1, 3, 50, 50)
traced_model = torch.jit.trace(model.eval(), example_input)
# Set the input_shape to use RangeDim for each dimension.
input_shape = ct.Shape(shape=(1,
3,
ct.RangeDim(lower_bound=25, upper_bound=1920, default=45),
ct.RangeDim(lower_bound=25, upper_bound=1920, default=45)))
scale = 1/(0.226*255.0)
bias = [- 0.485/(0.229) , - 0.456/(0.224), - 0.406/(0.225)]
# Convert the model with input_shape.
mlmodel = ct.convert(traced_model,
inputs=[ct.ImageType(shape=input_shape, name="input", scale=scale, bias=bias)],
outputs=[ct.TensorType(name="output")],
convert_to="mlprogram",
)
# Save the Core ML model
mlmodel.save("image_resize_model.mlpackage")
It converts OK, but when I run a prediction with an image I get the error below:
You will not be able to run predict() on this Core ML model. Underlying exception message was: {
NSLocalizedDescription = "Failed to build the model execution plan using a model architecture file '/private/var/folders/8z/vtz02xrj781dxvz1v750skz40000gp/T/model-small.mlmodelc/model.mil' with error code: -7.";
}
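For reference, the prediction call is roughly the following (a sketch; "test.jpg" is just a placeholder image within the allowed size range):
from PIL import Image

# Sketch: "test.jpg" is a placeholder; its dimensions fall inside the
# declared 25-1920 range, and "input"/"output" match the names set above.
img = Image.open("test.jpg").convert("RGB")
result = mlmodel.predict({"input": img})
print(result["output"].shape)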
Where did I go wrong?
Hi,
I want to create a real-time sports analytics app that takes camera input and records basketball stats. I want to use pose estimation and object classification to record things such as dribbles, when the ball leaves a player's hands, etc.
Is it possible to have a Core ML model that performs pose estimation on people but also does simple object detection on other classes (e.g., ball, hoop)?
Thanks
The session "Deploy machine learning and AI models on-device with Core ML" says the performance report can show which unit each op runs on and why it cannot run on the Neural Engine.
I tested my model and the report shows a gray checkmark at the Neural Engine, indicating it can run on the Neural Engine. However, it's not executing on the Neural Engine but on the CPU. Why is this happening?
I have a couple of models that I want to migrate to .mlpackage but cannot find the resources for this session:
https://developer.apple.com/videos/play/wwdc2024/10159/
The video talks about modifications and optimizations at 21:10, but I can't even see the demo's dependencies in the video.
Thanks
Hi all,
I'm trying to build scam detection into a Message Filter extension powered by Core ML. I find the ML predictions reliable, and a solution for text frauds and scams is sorely needed.
I was able to create a trained MLModel and deploy it in the app. It works in my container app, but when I try to use and initialize the model in the Message Filter extension, I get an error:
initialization of text classifier model with model data failed
I have tried putting the model in the container app, in the extension, and even in a shared framework used by both, but to no avail. Every time I invoke the code to initialize my model from the extension, I am met with the same error.
Here's my code for initializing the model:
do {
let model = try Ace_v24_6(configuration: .init())
let output = try model.prediction(text: text)
guard !output.label.isEmpty else {
return nil
}
return MessagePrediction(rawValue: output.label)
} catch {
return nil
}
My question is: Is it impossible to use CoreML in MessageFilters?
Cheers
Hi there,
I am trying to create a Message Filter app that uses a trained text classification model to predict scam texts (which are common in my country and constantly evolving).
However, when I try to use the MLModel in the MessageFilterExtension class, I'm getting:
initialization of text classifier model with model data failed
Here's how I initialize my MLModel, which was created using Create ML:
do {
let model = try MyModel(configuration: .init())
let output = try model.prediction(text: text)
guard !output.label.isEmpty else {
return nil
}
return MessagePrediction(rawValue: output.label)
} catch {
return nil
}
Is it impossible to use CoreML in Message Filter extensions?
Thank you
I use SoundAnalysis to analyze background sounds and have enabled background permissions. It worked well on previous iOS versions, but on the new iOS 18 beta a warning appears and sound analysis stops.
Warning list:
Execution of the command buffer was aborted due to an error during execution. Insufficient Permission (to submit GPU work from background)
[Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": Insufficient Permission (to submit GPU work from background) (00000006:kIOGPUCommandBufferCallbackErrorBackgroundExecutionNotPermitted); code=7 status=-1
Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
CoreML prediction failed with Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 0 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 0 in pipeline, NSUnderlyingError=0x30330e910 {Error Domain=com.apple.CoreML Code=0 "Failed to evaluate model 1 in pipeline" UserInfo={NSLocalizedDescription=Failed to evaluate model 1 in pipeline, NSUnderlyingError=0x303307840 {Error Domain=com.apple.CoreML Code=0 "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1)." UserInfo={NSLocalizedDescription=Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).}}}}}
For some reason, YDF does not work with the ARM processor; there is an issue with a mutex and destruction.
I get an error when trying to generate a Core ML performance report; the message says:
The data couldn't be written because it isn't in the correct format.
Here is the code to replicate the issue:
import numpy as np
import coremltools as ct
from coremltools.converters.mil import Builder as mb
import coremltools.converters.mil as mil
w = np.random.normal(size=(256, 128, 1))
wemb = np.random.normal(size=(1, 32000, 128)) # .astype(np.float16)
rope_emb = np.random.normal(size=(1, 2048, 128))
shapes = [(1, seqlen) for seqlen in (32, 64)]
enum_shape = mil.input_types.EnumeratedShapes(shapes=shapes)
fixed_shape = (1, 128)
max_length = 2048
dtype = np.float32
@mb.program(
input_specs=[
# mb.TensorSpec(enum_shape.symbolic_shape, dtype=mil.input_types.types.int32),
mb.TensorSpec(enum_shape.symbolic_shape, dtype=mil.input_types.types.int32),
],
opset_version=mil.builder.AvailableTarget.iOS17,
)
def flex_like(input_ids):
indices = mb.fill_like(ref_tensor=input_ids, value=np.array(1, dtype=np.int32))
causal_mask = np.expand_dims(
np.triu(np.full((max_length, max_length), -np.inf, dtype=dtype), 1),
axis=0,
)
mask = mb.gather(
x=causal_mask,
indices=indices,
axis=2,
batch_dims=1,
name="mask_gather_0",
)
# mask = mb.gather(
# x=mask, indices=indices, axis=1, batch_dims=1, name="mask_gather_1"
# )
rope = mb.gather(x=rope_emb.astype(dtype), indices=indices, axis=1, batch_dims=1, name="rope")
hidden_states = mb.gather(x=wemb.astype(dtype), indices=input_ids, axis=1, batch_dims=1, name="embedding")
return (
hidden_states,
mask,
rope,
)
cml_flex_like = ct.convert(
flex_like,
compute_units=ct.ComputeUnit.ALL,
compute_precision=ct.precision.FLOAT32,
minimum_deployment_target=ct.target.iOS17,
inputs=[
ct.TensorType(name="input_ids", shape=enum_shape),
],
)
cml_flex_like.save("flex_like_32")
If I remove the hidden states from the return it does work, and it also works if I keep the hidden states but remove both mask and rope; i.e., the report is generated for programs with either of these returns:
return (
# hidden_states,
mask,
rope,
)
and
return (
hidden_states,
# mask,
# rope,
)
It also works if I use a static shape instead of an EnumeratedShape
I'm using macOS 15.0 and Xcode 16.0
Edit 1:
Forgot to mention that although the performance report fails, the model is still able to make predictions
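For example, a prediction like the following still succeeds (a sketch; the zero-filled input is just a placeholder matching one of the enumerated shapes):
# Sketch: the zero-filled input matches the (1, 32) enumerated shape.
out = cml_flex_like.predict({"input_ids": np.zeros((1, 32), dtype=np.int32)})
print({name: value.shape for name, value in out.items()})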
I have a model that uses 'flatten', and when I converted it to a Core ML model and ran it in Xcode with an iPhone XR, I noticed that 'flatten' was automatically converted to 'reshape'. However, the NPU does not support 'reshape'.
However, when I took the ResNet50 model from Apple's model gallery and profiled it in Xcode on the same iPhone XR, I can see the 'flatten' operator, and it runs on the NPU.
On the other hand, when I used the following code to convert ResNet50 from PyTorch and ran it through Xcode's performance report, the 'flatten' operation was converted to 'reshape', which then ran on the CPU.
So how do I keep the 'flatten' operator when converting to a Core ML model?
Environment: coremltools 7.1, iPhone XR, iOS 17.5.1
from torchvision import models
import coremltools as ct
import numpy as np
import torch
import torch.nn as nn
network_name = "my_resnet50"
torch_model = models.resnet50(pretrained=True)
torch_model.eval()
width = 224
height = 224
example_input = torch.rand(1, 3, height, width)
traced_model = torch.jit.trace(torch_model, (example_input))
model = ct.convert(
traced_model,
convert_to = "neuralnetwork",
inputs=[
ct.TensorType(
name = "data",
shape = example_input.shape,
dtype = np.float32
)
],
outputs = [
ct.TensorType(
name = "output",
dtype = np.float32
)
],
compute_units = ct.ComputeUnit.CPU_AND_NE,
minimum_deployment_target = ct.target.iOS14,
)
model.save("my_resnet.mlmodel")
ResNet50 on Resnet50.mlmodel
My conversion of ResNet50
Hi, I'd like to use an MLX model that is already in the MLX Community in my app. I understand I first need to convert it to Core ML format.
Is there an easy way to do that considering MLX is an Apple project?
It would be good if this were easier; then I'd be more motivated to use MLX to train my models.
Thanks
Richard
The DINO v1/v2 models are particularly interesting to me as they produce embeddings for the detected objects rather than ordinary classification indexes.
That makes them much more useful than the CNN-based models.
I would like to prepare some of the models posted on Hugging Face to run on Apple Silicon, but it seems that the default conversion with TorchScript will not work. The other default conversions I've looked at so far also don't work, and conversion based on an example input doesn't capture enough of the model.
I know that some have managed to convert it, as I have a demo with a Core ML model that seems to work, but I would like to know how to do the conversion myself.
Has anyone managed to convert any of the DINOv2 models?
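For context, the default TorchScript-based conversion I'm referring to is roughly this (a sketch; it assumes the dinov2_vits14 checkpoint loaded from torch.hub rather than the Hugging Face repos, and this is the approach that does not work for me):
import coremltools as ct
import torch

# Load a DINOv2 backbone (assumption: the dinov2_vits14 checkpoint via torch.hub).
model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

# DINOv2 uses a patch size of 14, so the spatial dims must be multiples of 14.
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    convert_to="mlprogram",
    inputs=[ct.TensorType(name="image", shape=example.shape)],
)
mlmodel.save("dinov2_vits14.mlpackage")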
I am testing the new scaled dot product attention CoreML op on macOS 15 beta 1. Based on the session video I was expecting to see a speedup when running on GPU however I see roughly equivalent performance to the same model on macOS 14.
I ran tests with two models:
one that simply repeats y = sdpa(y, k, v) 50 times (a sketch of this one is below)
gpt2 124M converted from nanoGPT (the only change is not returning loss from the forward method)
I converted both models using coremltools 8.0b1 with minimum deployment targets of macOS 14 and also macOS 15. In Xcode, I can see that the new op was used for the macOS 15 target. Running on macOS 15 both target models take the same time, and that time matches the runtime on macOS 14.
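For reference, the first test model is essentially the following (a sketch; the tensor shapes are placeholders, not the exact ones I used):
import coremltools as ct
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepeatedSDPA(nn.Module):
    # Repeats scaled dot product attention 50 times, feeding each output
    # back in as the query.
    def forward(self, q, k, v):
        y = q
        for _ in range(50):
            y = F.scaled_dot_product_attention(y, k, v)
        return y

# Placeholder shapes: (batch, heads, sequence, head_dim).
q = torch.rand(1, 8, 256, 64)
k = torch.rand(1, 8, 256, 64)
v = torch.rand(1, 8, 256, 64)
traced = torch.jit.trace(RepeatedSDPA().eval(), (q, k, v))

mlmodel = ct.convert(
    traced,
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.macOS15,  # picks up the new sdpa op
    inputs=[
        ct.TensorType(name="q", shape=q.shape),
        ct.TensorType(name="k", shape=k.shape),
        ct.TensorType(name="v", shape=v.shape),
    ],
)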
Should I be seeing performance improvements?
Here is an app using the Core ML API with the ML package format. It works fine on iOS 17, but it crashes when calling [MLModel modelWithContentsOfURL:] to load the model on iOS 18. It seems an exception is raised: "Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)". Is this a bug in the iOS 18 beta, and will it be fixed in the future?
The stack is as below:
Exception Codes: #0 at 0x1e9280254
Crashed Thread: 49
Application Specific Information:
*** Terminating app due to uncaught exception 'NSGenericException', reason: 'Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)'
Last Exception Backtrace:
0 CoreFoundation 0x0000000199466418 __exceptionPreprocess + 164
1 libobjc.A.dylib 0x00000001967cde88 objc_exception_throw + 76
2 CoreFoundation 0x0000000199560794 -[NSException initWithCoder:]
3 CoreML 0x00000001b4fcfa8c -[MLE5ProgramLibraryOnDeviceAOTCompilationImpl createProgramLibraryHandleWithRespecialization:error:] + 1584
4 CoreML 0x00000001b4fcf3cc -[MLE5ProgramLibrary _programLibraryHandleWithForceRespecialization:error:] + 96
5 CoreML 0x00000001b4fc23d8 __44-[MLE5ProgramLibrary prepareAndReturnError:]_block_invoke + 60
6 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
7 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
8 CoreML 0x00000001b4fc3e98 -[MLE5ProgramLibrary prepareAndReturnError:] + 220
9 CoreML 0x00000001b4fc3bc0 -[MLE5Engine initWithContainer:configuration:error:] + 220
10 CoreML 0x00000001b4fc3888 +[MLE5Engine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 344
11 CoreML 0x00000001b4faf53c +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 364
12 CoreML 0x00000001b4faedd4 +[MLLoader _loadModelFromArchive:configuration:modelVersion:compilerVersion:loaderEvent:useUpdatableModelLoaders:loadingClasses:error:] + 540
13 CoreML 0x00000001b4f9b900 +[MLLoader _loadWithModelLoaderFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 424
14 CoreML 0x00000001b4faaeac +[MLLoader _loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 460
15 CoreML 0x00000001b4fb0428 +[MLLoader _loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 240
16 CoreML 0x00000001b4fb00c4 +[MLLoader loadModelFromAssetAtURL:configuration:error:] + 104
17 CoreML 0x00000001b5314118 -[MLModelAssetResourceFactoryOnDiskImpl modelWithConfiguration:error:] + 116
18 CoreML 0x00000001b5418cc0 __60-[MLModelAssetResourceFactory modelWithConfiguration:error:]_block_invoke + 72
19 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
20 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
21 CoreML 0x00000001b5418b94 -[MLModelAssetResourceFactory modelWithConfiguration:error:] + 276
22 CoreML 0x00000001b542919c -[MLModelAssetModelVendor modelWithConfiguration:error:] + 152
23 CoreML 0x00000001b5380ce4 -[MLModelAsset modelWithConfiguration:error:] + 112
24 CoreML 0x00000001b4fb0b3c +[MLModel modelWithContentsOfURL:configuration:error:] + 168
I'm trying to convert a TensorFlow model, which I didn't create and know approximately nothing about, to Core ML so that I can use it in some functional tests. I can't tell you much about the model, but you can read about it on the blog from the team that created it: https://research.google/blog/improving-mobile-app-accessibility-with-icon-detection/
I can't convert this model to a TensorFlow Lite model because it uses a few full TensorFlow operations (which I could work around) and it exceeds the 4-tensor output limit (which I can't, AFAIK). So instead, I'm trying to convert the model to CoreML so that I can run it on-device.
The issue I'm running into is that every approach fails in different ways. If I load the model with tf.saved_model.load and pass that as the first parameter to the convert call, it says
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <tensorflow.python.trackable.autotrackable.AutoTrackable object at 0x30d90c250>
If I pass model.signatures['serving_default'] as the first parameter to convert, I get
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got ConcreteFunction [...a page or two of info about the function here...]
If I try to wrap it in a Keras layer using the instructions provided in the converter, it fails because a sequential model can't have multiple outputs.
If I try to use a tf.keras.layers.TFSMLayer to load the model, it fails because there are multiple tags, and there's no way to specify tags when constructing the layer. (It tells me that I need to add 'tags' to load the model, but if I do that, it tells me that tags isn't a valid parameter to the call.)
If I load the model with tf.saved_model.load and specify a single tag, then re-save it in a different location with tf.saved_model.save to generate a new model with only a single tag, then do
input_layer = tf.keras.Input(shape=(768, 768, 3), dtype="int8")
layer = tf.keras.layers.TFSMLayer("./serve_model", call_endpoint='serving_default')
outputs = layer(input_layer)
model = tf.keras.Model(input_layer, outputs)
I get
AttributeError: 'Functional' object has no attribute '_get_save_spec'
At one point, I also tried this:
class LayerFromSavedModel(tf.keras.layers.Layer):
def __init__(self):
super(LayerFromSavedModel, self).__init__()
self.vars = legacy_model.variables
def call(self, inputs):
return legacy_model.signatures['serving_default'](inputs)
input = tf.keras.Input(shape=(3000, 3000, 3))
model = tf.keras.Model(input, LayerFromSavedModel()(input))
and saw a similar failure.
I've run out of ideas here. Is there simply no support whatsoever in the converter for importing a TensorFlow 2 SavedModel into CoreML, or am I missing something fundamental?
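For completeness, here is a minimal sketch of the path-based form of the call, handing the converter a SavedModel directory path directly (the error messages above list SavedModel as an accepted format; the paths here are placeholders, and a multi-tag SavedModel presumably still needs the single-tag re-save described above):
import coremltools as ct

# Sketch: pass the SavedModel directory path straight to the converter
# instead of a loaded object. "./single_tag_saved_model" is a placeholder.
mlmodel = ct.convert(
    "./single_tag_saved_model",
    source="tensorflow",
    convert_to="mlprogram",
)
mlmodel.save("icon_detector.mlpackage")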
We are currently working on implementing a baby cry detection model in the frontend of our app but have encountered some challenges with the mel spectrogram transformation.
Our mel spectrogram class, developed in Python, uses librosa to generate mel spectrograms (librosa.feature.melspectrogram and librosa.power_to_db). While we have successfully exported the model to a .mlmodel file, the results we obtain in Swift differ significantly from those generated by our Python code.
Could this discrepancy be due to the use of librosa in Python, which might not be directly compatible with Swift? Or should the transformation process be inherently consistent once exported to a .mlmodel file?
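For reference, our Python side looks roughly like the following (a sketch; the parameter values are placeholders, but whatever values are actually used have to be mirrored exactly in any Swift re-implementation of the preprocessing):
import librosa
import numpy as np

def mel_spectrogram_db(audio: np.ndarray, sr: int = 16000) -> np.ndarray:
    # The exact parameter values (sr, n_fft, hop_length, n_mels, power, and
    # librosa's centering/padding defaults) determine the output; any mismatch
    # on the Swift side produces very different spectrograms. Values here are
    # placeholders, not the ones used in our model.
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_fft=1024, hop_length=256, n_mels=64, power=2.0
    )
    return librosa.power_to_db(mel, ref=np.max)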