I have several CoreML models that I've set up to run in sequence where one of the outputs from each model is passed as one of the inputs to the next.
For the most part, there is very little overhead in between each sub-model "chunk":
However, a couple of the models (e.g. the first two above) spend a noticeable amount of time in "Prepare Neural Engine Request". From Instruments, it seems this time is spent doing some sort of model loading.
Given that I'm calling these models in sequence and in a fixed order, is there some way to reduce or amortize this cost? Thanks!
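For context, the pipeline looks roughly like this Python sketch with coremltools (the paths and the "input"/"output" feature names are placeholders, not my real ones); the models are loaded once up front, which is why I had hoped the preparation cost would only be paid once:

import coremltools as ct

# Placeholder paths; each chunk is loaded once and the instances reused.
model_paths = ["chunk0.mlpackage", "chunk1.mlpackage", "chunk2.mlpackage"]
models = [ct.models.MLModel(path) for path in model_paths]

def run_pipeline(x):
    # Feed one model's output forward as the next model's input.
    # "input"/"output" are placeholder feature names.
    for model in models:
        x = model.predict({"input": x})["output"]
    return x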
Core ML
Integrate machine learning models into your app using Core ML.
Posts under Core ML tag
I was trying the latest coremltools-8.0b1 beta on macOS 15 beta, with the intent of trying the new stateful models API in Core ML.
But the conversion would always fail with the error:
/AppleInternal/Library/BuildRoots/<snip>/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphExecutable.mm:162: failed assertion `Error: the minimum deployment target for macOS is 14.0.0'
Here's a minimal repro, which works fine with both the stable version of coremltools (7.2) and the beta version (8.0b1) on macOS Sonoma 14.5, but fails with both versions of coremltools on macOS 15.0 beta with Xcode 16.0 beta. This suggests the issue most likely isn't in coremltools itself, but in the native compilation toolchain.
from collections import OrderedDict

import coremltools as ct
import numpy as np
import torch
import torch.nn as nn


class ResidualAttentionBlock(nn.Module):
    def __init__(self, d_model: int, n_head: int, attn_mask: torch.Tensor = None):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_head)
        self.ln_1 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            OrderedDict(
                [
                    ("c_fc", nn.Linear(d_model, d_model * 4)),
                    ("gelu", nn.GELU()),
                    ("c_proj", nn.Linear(d_model * 4, d_model)),
                ]
            )
        )
        self.ln_2 = nn.LayerNorm(d_model)
        self.attn_mask = attn_mask

    def attention(self, x: torch.Tensor):
        self.attn_mask = (
            self.attn_mask.to(dtype=x.dtype, device=x.device)
            if self.attn_mask is not None
            else None
        )
        return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]

    def forward(self, x: torch.Tensor):
        x = x + self.attention(self.ln_1(x))
        x = x + self.mlp(self.ln_2(x))
        return x


class Transformer(nn.Module):
    def __init__(
        self, width: int, layers: int, heads: int, attn_mask: torch.Tensor = None
    ):
        super().__init__()
        self.width = width
        self.layers = layers
        self.resblocks = nn.Sequential(
            *[ResidualAttentionBlock(width, heads, attn_mask) for _ in range(layers)]
        )

    def forward(self, x: torch.Tensor):
        return self.resblocks(x)


transformer = Transformer(width=512, layers=12, heads=8)
emb_tokens = torch.rand((1, 512))

ct_model = ct.convert(
    torch.jit.trace(transformer.eval(), emb_tokens),
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.macOS14,
    inputs=[ct.TensorType(name="embIn", shape=[1, 512])],
    outputs=[ct.TensorType(name="embOutput", dtype=np.float32)],
)
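One workaround I plan to try, though I haven't verified it helps on the beta: asking the converter to load the model with CPU only, in the hope that the load-time validation then skips the Metal/MPSGraph path that raises the assertion:

ct_model = ct.convert(
    torch.jit.trace(transformer.eval(), emb_tokens),
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.macOS14,
    compute_units=ct.ComputeUnit.CPU_ONLY,  # untested guess: avoid the MPSGraph load path
    inputs=[ct.TensorType(name="embIn", shape=[1, 512])],
    outputs=[ct.TensorType(name="embOutput", dtype=np.float32)],
)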
I want to try an any resolution image input Core ML model.
So I wrote the model following the Core ML Tools "Set the Range for Each Dimension" sample code, modified as below:
# Trace the model with random input.
example_input = torch.rand(1, 3, 50, 50)
traced_model = torch.jit.trace(model.eval(), example_input)

# Set the input_shape to use RangeDim for each dimension.
input_shape = ct.Shape(shape=(1,
                              3,
                              ct.RangeDim(lower_bound=25, upper_bound=1920, default=45),
                              ct.RangeDim(lower_bound=25, upper_bound=1920, default=45)))

scale = 1 / (0.226 * 255.0)
bias = [-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225]

# Convert the model with input_shape.
mlmodel = ct.convert(traced_model,
                     inputs=[ct.ImageType(shape=input_shape, name="input", scale=scale, bias=bias)],
                     outputs=[ct.TensorType(name="output")],
                     convert_to="mlprogram",
                     )

# Save the Core ML model.
mlmodel.save("image_resize_model.mlpackage")
It converts OK, but when I run predict() with an image, I get the error below:
You will not be able to run predict() on this Core ML model. Underlying exception message was: {
NSLocalizedDescription = "Failed to build the model execution plan using a model architecture file '/private/var/folders/8z/vtz02xrj781dxvz1v750skz40000gp/T/model-small.mlmodelc/model.mil' with error code: -7.";
}
Where did I go wrong?
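In case it's relevant, a fallback I'm considering is a fixed set of sizes via EnumeratedShapes instead of a full RangeDim. This is just a sketch reusing traced_model, scale, and bias from above; the three sizes are only examples:

# Fallback sketch: enumerate a few concrete input sizes instead of a range.
input_shape = ct.EnumeratedShapes(
    shapes=[(1, 3, 256, 256), (1, 3, 512, 512), (1, 3, 1024, 1024)],
    default=(1, 3, 256, 256),
)
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(shape=input_shape, name="input", scale=scale, bias=bias)],
    outputs=[ct.TensorType(name="output")],
    convert_to="mlprogram",
)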
I'm looking for a solution to take a picture or point the camera at a piece of clothing and match that image with an image the user has stored in my app.
I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures that are stored in the database, I don't think I can use pre-trained Core ML models.
I would like the matching to be done on device if possible, instead of going through an external service. Such a service would probably describe the item based on what the AI sees, but then I couldn't match that description against the images stored in the app.
Does anyone know if this is possible with frameworks such as Vision or VisionKit?
The WWDC session "Deploy machine learning and AI models on-device with Core ML" says the performance report can show which compute unit each op runs on, and why an op cannot run on the Neural Engine.
I tested my model and the report shows a gray checkmark at the Neural Engine, indicating it can run on the Neural Engine. However, it's not executing on the Neural Engine but on the CPU. Why is this happening?
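In case my loading code matters: I believe this is equivalent to the default, but here is a Python sketch of loading with an explicit compute-unit preference (the path is a placeholder, and as far as I know this only expresses a preference rather than forcing the Neural Engine):

import coremltools as ct

# Placeholder path; CPU_AND_NE restricts scheduling to the CPU and
# Neural Engine, but the runtime still decides per-op where to execute.
model = ct.models.MLModel("MyModel.mlpackage",
                          compute_units=ct.ComputeUnit.CPU_AND_NE)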
I have a couple of models that I want to migrate to .mlpackage, but I cannot find the resources for the session:
https://developer.apple.com/videos/play/wwdc2024/10159/
At 21:10, the video talks about modifications and optimizations, but the dependencies of the demo aren't visible anywhere in the video.
Thanks
I have created a Hand Action Classification project following the guidelines, but the Create ML tool reports a very cryptic "Unexpected Error". The feature extraction phase is fine; the error occurs during model training, after the tool reports completion of the first five training iterations and as it moves on to report the next ten.
The project is small, with 3 training classes and 346 items.
I have tried to vary the frame rate and action duration, with all augmentations unset, but the error still persists.
Can you please confirm how I may get further error diagnostic information so that I can determine why Create ML is unable to work with my training data?
macOS is Sonoma 14.5 on an iMac (24-inch, M1, 2021). Create ML is version 5.0 (121.4).
Hi there,
I am trying to create a Message Filter app that uses a trained Text Classification model to detect scam texts (a common and constantly evolving problem in my country).
However, when I try to use the MLModel in the MessageFilterExtension class, I'm getting
initialization of text classifier model with model data failed
Here's how I initialize my MLModel that is created using Create ML.
do {
    let model = try MyModel(configuration: .init())
    let output = try model.prediction(text: text)
    guard !output.label.isEmpty else {
        return nil
    }
    return MessagePrediction(rawValue: output.label)
} catch {
    return nil
}
Is it impossible to use CoreML in Message Filter extensions?
Thank you
Trouble with Core ML Object Tracking for Spherical Objects Using WWDC Sample Code and Object Capture
Hi everyone,
I'm working with Core ML for object tracking and have successfully trained a couple of models. However, when I try to use the reference object file in the object tracking sample code from the WWDC video, it doesn't work.
I'm training the model on a single-color plastic spherical object. Could this be the reason for the issue? I also attempted to use USDZ 3D assets that resemble the real object. Do these need to be trained with the Object Capture app to work properly?
Speaking of the Object Capture app, my experience hasn't been great. When I scanned my spherical object, the result was far from a sphere—it looked more like a mushy dough.
Any insights or suggestions would be greatly appreciated!
Hi everyone,
I was wondering how accurate the Hand Classification ML is. For example: is it possible to distinguish the different letters of the sign language alphabet, or is it only capable of recognizing simple poses like a thumbs-up?
I get an error when trying to generate a Core ML performance report; the message says
The data couldn't be written because it isn't in the correct format.
Here is the code to replicate the issue
import numpy as np
import coremltools as ct
from coremltools.converters.mil import Builder as mb
import coremltools.converters.mil as mil

w = np.random.normal(size=(256, 128, 1))
wemb = np.random.normal(size=(1, 32000, 128))  # .astype(np.float16)
rope_emb = np.random.normal(size=(1, 2048, 128))

shapes = [(1, seqlen) for seqlen in (32, 64)]
enum_shape = mil.input_types.EnumeratedShapes(shapes=shapes)
fixed_shape = (1, 128)
max_length = 2048
dtype = np.float32


@mb.program(
    input_specs=[
        # mb.TensorSpec(enum_shape.symbolic_shape, dtype=mil.input_types.types.int32),
        mb.TensorSpec(enum_shape.symbolic_shape, dtype=mil.input_types.types.int32),
    ],
    opset_version=mil.builder.AvailableTarget.iOS17,
)
def flex_like(input_ids):
    indices = mb.fill_like(ref_tensor=input_ids, value=np.array(1, dtype=np.int32))
    causal_mask = np.expand_dims(
        np.triu(np.full((max_length, max_length), -np.inf, dtype=dtype), 1),
        axis=0,
    )
    mask = mb.gather(
        x=causal_mask,
        indices=indices,
        axis=2,
        batch_dims=1,
        name="mask_gather_0",
    )
    # mask = mb.gather(
    #     x=mask, indices=indices, axis=1, batch_dims=1, name="mask_gather_1"
    # )
    rope = mb.gather(x=rope_emb.astype(dtype), indices=indices, axis=1, batch_dims=1, name="rope")
    hidden_states = mb.gather(x=wemb.astype(dtype), indices=input_ids, axis=1, batch_dims=1, name="embedding")
    return (
        hidden_states,
        mask,
        rope,
    )


cml_flex_like = ct.convert(
    flex_like,
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT32,
    minimum_deployment_target=ct.target.iOS17,
    inputs=[
        ct.TensorType(name="input_ids", shape=enum_shape),
    ],
)

cml_flex_like.save("flex_like_32")
If I remove the hidden states from the return, it does work. It also works if I keep the hidden states but remove both mask and rope; i.e., the report is generated for both programs with either of these returns:
return (
    # hidden_states,
    mask,
    rope,
)

and

return (
    hidden_states,
    # mask,
    # rope,
)
It also works if I use a static shape instead of EnumeratedShapes.
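For completeness, the working static-shape conversion is roughly this (a sketch; flex_like_static is my shorthand for the same program body built against fixed_shape instead of enum_shape):

# Sketch: the same program built with the fixed (1, 128) input spec
# generates a performance report without errors.
cml_static = ct.convert(
    flex_like_static,  # assumed: flex_like rebuilt with mb.TensorSpec(fixed_shape, ...)
    compute_units=ct.ComputeUnit.ALL,
    compute_precision=ct.precision.FLOAT32,
    minimum_deployment_target=ct.target.iOS17,
    inputs=[ct.TensorType(name="input_ids", shape=fixed_shape)],
)
cml_static.save("static_like_128")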
I'm using macOS 15.0 and Xcode 16.0
Edit 1:
Forgot to mention that although the performance report fails, the model is still able to make predictions.
I have a model that uses ‘flatten’, and when I converted it to a Core ML model and ran it on Xcode with an iPhone XR, I noticed that ‘flatten’ was automatically converted to ‘reshape’. However, the NPU does not support ‘reshape’.
However, when I took the ResNet50 model from Apple's models page and profiled it in Xcode with the same iPhone XR, I can see the 'flatten' operator, and it runs on the NPU.
On the other hand, when I used the following code to convert ResNet50 from PyTorch and ran it through Xcode's performance report, the 'flatten' operation was converted to 'reshape', which then ran on the CPU.
So how can I keep the 'flatten' operator when converting to a Core ML model?
coremltools 7.1
iPhone XR
iOS 17.5.1
from torchvision import models
import coremltools as ct
import numpy as np  # needed for the np.float32 dtypes below
import torch
import torch.nn as nn

network_name = "my_resnet50"
torch_model = models.resnet50(pretrained=True)
torch_model.eval()

width = 224
height = 224
example_input = torch.rand(1, 3, height, width)
traced_model = torch.jit.trace(torch_model, (example_input))

model = ct.convert(
    traced_model,
    convert_to="neuralnetwork",
    inputs=[
        ct.TensorType(
            name="data",
            shape=example_input.shape,
            dtype=np.float32,
        )
    ],
    outputs=[
        ct.TensorType(
            name="output",
            dtype=np.float32,
        )
    ],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    minimum_deployment_target=ct.target.iOS14,
)
model.save("my_resnet.mlmodel")
ResNet50 on Resnet50.mlmodel
My conversion of ResNet50
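One variant I haven't fully tested yet is converting to an ML Program instead of the older neuralnetwork backend, in case the newer op set keeps the op mapping closer to the original:

# Untested variant: mlprogram requires iOS 15+ and may map flatten differently.
model_mlprogram = ct.convert(
    traced_model,
    convert_to="mlprogram",
    inputs=[ct.TensorType(name="data", shape=example_input.shape, dtype=np.float32)],
    outputs=[ct.TensorType(name="output", dtype=np.float32)],
    compute_units=ct.ComputeUnit.CPU_AND_NE,
    minimum_deployment_target=ct.target.iOS15,
)
model_mlprogram.save("my_resnet.mlpackage")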
I am testing the new scaled dot product attention CoreML op on macOS 15 beta 1. Based on the session video I was expecting to see a speedup when running on GPU however I see roughly equivalent performance to the same model on macOS 14.
I ran tests with two models:
one that simply repeats y = sdpa(y, k, v) 50 times (rough sketch below)
gpt2 124M converted from nanoGPT (the only change is not returning loss from the forward method)
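The repeated-SDPA model is roughly this PyTorch sketch (the shapes and the module wrapper are my reconstruction, not the exact code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class RepeatedSDPA(nn.Module):
    # Reconstruction of the first test model: apply scaled dot product
    # attention 50 times, feeding the output back in as the query.
    def forward(self, q, k, v):
        y = q
        for _ in range(50):
            y = F.scaled_dot_product_attention(y, k, v)
        return y

example = torch.rand(1, 8, 128, 64)  # assumed (batch, heads, seq, head_dim)
traced = torch.jit.trace(RepeatedSDPA().eval(), (example, example, example))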
I converted both models using coremltools 8.0b1 with minimum deployment targets of macOS 14 and also macOS 15. In Xcode, I can see that the new op was used for the macOS 15 target. Running on macOS 15 both target models take the same time, and that time matches the runtime on macOS 14.
Should I be seeing performance improvements?
Here is an app using the Core ML API with the ML package format. It works fine on iOS 17, but it crashes when calling [MLModel modelWithContentsOfURL:] to load the model on iOS 18. It seems an exception is raised: "Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)". Is this a bug in the iOS 18 beta, and will it be fixed in a future version?
The stack is as below:
Exception Codes: #0 at 0x1e9280254
Crashed Thread: 49
Application Specific Information:
*** Terminating app due to uncaught exception 'NSGenericException', reason: 'Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)'
Last Exception Backtrace:
0 CoreFoundation 0x0000000199466418 __exceptionPreprocess + 164
1 libobjc.A.dylib 0x00000001967cde88 objc_exception_throw + 76
2 CoreFoundation 0x0000000199560794 -[NSException initWithCoder:]
3 CoreML 0x00000001b4fcfa8c -[MLE5ProgramLibraryOnDeviceAOTCompilationImpl createProgramLibraryHandleWithRespecialization:error:] + 1584
4 CoreML 0x00000001b4fcf3cc -[MLE5ProgramLibrary _programLibraryHandleWithForceRespecialization:error:] + 96
5 CoreML 0x00000001b4fc23d8 __44-[MLE5ProgramLibrary prepareAndReturnError:]_block_invoke + 60
6 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
7 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
8 CoreML 0x00000001b4fc3e98 -[MLE5ProgramLibrary prepareAndReturnError:] + 220
9 CoreML 0x00000001b4fc3bc0 -[MLE5Engine initWithContainer:configuration:error:] + 220
10 CoreML 0x00000001b4fc3888 +[MLE5Engine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 344
11 CoreML 0x00000001b4faf53c +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 364
12 CoreML 0x00000001b4faedd4 +[MLLoader _loadModelFromArchive:configuration:modelVersion:compilerVersion:loaderEvent:useUpdatableModelLoaders:loadingClasses:error:] + 540
13 CoreML 0x00000001b4f9b900 +[MLLoader _loadWithModelLoaderFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 424
14 CoreML 0x00000001b4faaeac +[MLLoader _loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 460
15 CoreML 0x00000001b4fb0428 +[MLLoader _loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 240
16 CoreML 0x00000001b4fb00c4 +[MLLoader loadModelFromAssetAtURL:configuration:error:] + 104
17 CoreML 0x00000001b5314118 -[MLModelAssetResourceFactoryOnDiskImpl modelWithConfiguration:error:] + 116
18 CoreML 0x00000001b5418cc0 __60-[MLModelAssetResourceFactory modelWithConfiguration:error:]_block_invoke + 72
19 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
20 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
21 CoreML 0x00000001b5418b94 -[MLModelAssetResourceFactory modelWithConfiguration:error:] + 276
22 CoreML 0x00000001b542919c -[MLModelAssetModelVendor modelWithConfiguration:error:] + 152
23 CoreML 0x00000001b5380ce4 -[MLModelAsset modelWithConfiguration:error:] + 112
24 CoreML 0x00000001b4fb0b3c +[MLModel modelWithContentsOfURL:configuration:error:] + 168
I created a model that classifies certain objects using yolov8. I noticed that the model is not working properly in my application. While the model works fine in Xcode preview, in the application it either returns the same result with 99% accuracy for each classification or does not provide any result.
In Preview it looks like this:
Predictions:
extension CameraVC: AVCapturePhotoCaptureDelegate {
    func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?) {
        guard let data = photo.fileDataRepresentation() else {
            return
        }
        guard let image = UIImage(data: data) else {
            return
        }
        guard let cgImage = image.cgImage else {
            fatalError("Unable to create CGImage")
        }
        let handler = VNImageRequestHandler(cgImage: cgImage, orientation: CGImagePropertyOrientation(image.imageOrientation))
        DispatchQueue.global(qos: .userInitiated).async {
            do {
                try handler.perform([self.viewModel.detectionRequest])
            } catch {
                fatalError("Failed to perform detection: \(error)")
            }
        }
    }
}

// On the view model (referenced above as viewModel.detectionRequest):
lazy var detectionRequest: VNCoreMLRequest = {
    do {
        let model = try VNCoreMLModel(for: bestv720().model)
        let request = VNCoreMLRequest(model: model) { [weak self] request, error in
            self?.processDetections(for: request, error: error)
        }
        request.imageCropAndScaleOption = .centerCrop
        return request
    } catch {
        fatalError("Failed to load Vision ML model: \(error)")
    }
}()
This is where I print the recognized objects:
func processDetections(for request: VNRequest, error: Error?) {
    DispatchQueue.main.async {
        guard let results = request.results as? [VNRecognizedObjectObservation] else {
            return
        }
        var label = ""
        var all_results: [String] = []
        var all_confidence: [VNConfidence] = []
        var true_results: [String] = []
        var true_confidence: [VNConfidence] = []
        for result in results {
            // Walk the candidate labels of each observation (not the
            // observation count, which can run past the labels array).
            for candidate in result.labels {
                all_results.append(candidate.identifier)
                all_confidence.append(candidate.confidence)
                if candidate.confidence > 0.7 {
                    true_results.append(candidate.identifier)
                    true_confidence.append(candidate.confidence)
                }
            }
            label = result.labels[0].identifier
        }
        print("True Results ", true_results)
        print("True Confidence ", true_confidence)
        self.output?.updateView(label: label)
    }
}
I converted the model like this:
from ultralytics import YOLO
model = YOLO(model_path)
model.export(format='coreml', nms=True, imgsz=[720,1280])
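To narrow down whether the problem is in the conversion or in my app code, a sanity check I can run is comparing the original weights against the exported package on the same image (a sketch; the file names are placeholders, and as far as I know ultralytics can run the exported Core ML package directly on macOS):

from PIL import Image
from ultralytics import YOLO

img = Image.open("test.jpg")  # placeholder test image

# Original PyTorch weights (placeholder name).
pt_model = YOLO("best.pt")
print(pt_model.predict(img)[0].boxes)

# Exported Core ML package (placeholder name).
cm_model = YOLO("best.mlpackage")
print(cm_model.predict(img)[0].boxes)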
The What’s New in Create ML session at WWDC24 went into great depth on time-series forecasting models (beginning at 15:14) and mentioned these new models, capabilities, and tools for iOS 18. So far, all I can find is API documentation; I don’t see any other WWDC24 session covering these new time-series forecasting Create ML features.
Is there more substance/documentation on how to use these with Create ML? Maybe I am looking in the wrong place, but I am fairly new to ML.
Are there any food truck / donut shop demo/sample code like in the video?
It would be of great interest to get ahead of the curve on this in business applications that could take advantage of it with inventory/ordering data.
I wanted to try the new Object Tracking, so I captured an object and uploaded it to Create ML in an Object Tracking project. I added the file and set the viewing angle to "upright". Afterwards, I started the training. After some time (around 2%), it stops and I receive the following error message: "There isn't enough space." at "1921.blob". I tried it multiple times.
From https://www.apple.com/newsroom/2024/06/introducing-apple-intelligence-for-iphone-ipad-and-mac/:
Powered by Apple Intelligence, Siri becomes more deeply integrated into the system experience. With richer language-understanding capabilities, Siri is more natural, more contextually relevant, and more personal, with the ability to simplify and accelerate everyday tasks.
From https://developer.apple.com/apple-intelligence/:
Siri is more natural, more personal, and more deeply integrated into the system. Apple Intelligence provides Siri with enhanced action capabilities, and developers can take advantage of pre-defined and pre-trained App Intents across a range of domains to not only give Siri the ability to take actions in your app, but to make your app’s actions more discoverable in places like Spotlight, the Shortcuts app, Control Center, and more. SiriKit adopters will benefit from Siri’s enhanced conversational capabilities with no additional work. And with App Entities, Siri can understand content from your app and provide users with information from your app from anywhere in the system.
Based on this, as well as the video at https://developer.apple.com/videos/play/wwdc2024/10133/ , my understanding is that in order for Siri to be able to execute tasks in applications, those applications must implement the Siri Intents API.
Can someone at Apple please clarify: will it be possible for Siri or some other aspect of Apple Intelligence / Core ML / Create ML to take actions in applications which do not support these APIs (e.g. web apps, Citrix apps, legacy apps)?
Thank you!
I am currently working on a 2D pose estimator. I developed a PyTorch vision-transformer-based model with 17 joints in COCO format, and then converted it to Core ML using coremltools version 6.2.
The model was trained on a custom dataset. However, upon running the converted model on iOS, I observed a significant drop in accuracy. You can see it in this video (https://youtu.be/EfGFrOZQGtU) that demonstrates the outputs of the PyTorch model (on the left) and the CoreML model (on the right).
Could you please confirm if this drop in accuracy is expected and suggest any possible solutions to address this issue? Please note that all preprocessing and post-processing techniques remain consistent between the models.
P.S. While converting, I also got the following warning:
TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
P.P.S. When we initialize the CoreML model on iOS 17.0, we get this error:
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (9), must be [1-8] or 20.
Validation failure: Invalid Pool kernel width (13), must be [1-8] or 20.
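One check I still plan to run is converting with float32 compute precision, to rule out the default fp16 precision of ML Programs as the source of the accuracy drop (a sketch; traced_model, the input name, and the shape are placeholders for our real values):

import coremltools as ct

# Sketch: force float32 precision for all ops in the converted ML Program.
mlmodel_fp32 = ct.convert(
    traced_model,  # assumed: the jit-traced pose estimation model
    convert_to="mlprogram",
    compute_precision=ct.precision.FLOAT32,
    inputs=[ct.TensorType(name="input", shape=(1, 3, 256, 192))],  # placeholder shape
)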
This neural network model does not have a parameter for requested key 'precisionRecallCurves'. Note: only updatable neural network models can provide parameter values and these values are only accessible in the context of an MLUpdateTask completion or progress handler.
I made a model using PyTorch and then converted it into an .mlmodel file. Next, I downloaded the sample at https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture, which worked! But when I changed the model to the one I made, the camera worked, but no predictions were shown. Please help!