Xcode Version: Version 15.2 (15C500b)
com.github.apple.coremltools.source: torch==1.12.1
com.github.apple.coremltools.version: 7.2
Compute: Mixed (Float16, Int32)
Storage: Float16
The input to the mlpackage is MultiArray (Float16 1 × 1 × 544 × 960)
The flexibility is: 1 × 1 × 544 × 960 | 1 × 1 × 384 × 640 | 1 × 1 × 736 × 1280 | 1 × 1 × 1088 × 1920
I tested this on iPhone XR, iPhone 11, iPhone 12, iPhone 13, and iPhone 14. On all devices except the iPhone 11, the model runs correctly on the NPU. However, on the iPhone 11, the model runs on the CPU instead.
Here is the CoreMLTools conversion code I used:
mlmodel = ct.convert(
    trace,
    inputs=[ct.TensorType(shape=input_shape, name="input", dtype=np.float16)],
    outputs=[ct.TensorType(name="output", dtype=np.float16, shape=output_shape)],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS16,
)
When I detect objects in real time with the camera using my ML model on Android, I get varying confidence values such as 75, 40, 30, or 95, not consistently in the 95-100 range. But when I use the same model on iOS, the confidence is above 95 in every case. What do you think could be the reason for this?
When I try to run basically any CoreML model using MLPredictionOptions.outputBackings, inference throws the following error:
2024-09-11 15:36:00.184740-0600 run_demo[4260:64822] [coreml] Unrecognized ANE execution priority (null)
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Unrecognized ANE execution priority (null)'
*** First throw call stack:
(
0 CoreFoundation 0x000000019812cec0 __exceptionPreprocess + 176
1 libobjc.A.dylib 0x0000000197c12cd8 objc_exception_throw + 88
2 CoreFoundation 0x000000019812cdb0 +[NSException exceptionWithName:reason:userInfo:] + 0
3 CoreML 0x00000001a1bf6504 _ZN12_GLOBAL__N_141espressoPlanPriorityFromPredictionOptionsEP19MLPredictionOptions + 264
4 CoreML 0x00000001a1bf68c0 -[MLNeuralNetworkEngine _matchEngineToOptions:error:] + 236
5 CoreML 0x00000001a1be254c __62-[MLNeuralNetworkEngine predictionFromFeatures:options:error:]_block_invoke + 68
6 libdispatch.dylib 0x0000000197e20658 _dispatch_client_callout + 20
7 libdispatch.dylib 0x0000000197e2fcd8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
8 CoreML 0x00000001a1be2450 -[MLNeuralNetworkEngine predictionFromFeatures:options:error:] + 304
9 CoreML 0x00000001a1c9e118 -[MLDelegateModel _predictionFromFeatures:usingState:options:error:] + 776
10 CoreML 0x00000001a1c9e4a4 -[MLDelegateModel predictionFromFeatures:options:error:] + 136
11 libMLBackend_coreml.dylib 0x00000001002f19f0 _ZN6CoreML8runModelENS_5ModelERNSt3__16vectorIPvNS1_9allocatorIS3_EEEES7_ + 904
12 libMLBackend_coreml.dylib 0x00000001002c56e8 _ZZN8ModelImp9runCoremlEPN2ML7Backend17ModelIoBindingImpEENKUlvE_clEv + 120
13 libMLBackend_coreml.dylib 0x00000001002c1e40 _ZNSt3__110__function6__funcIZN2ML4Util10WorkThread11runInThreadENS_8functionIFvvEEEEUlvE_NS_9allocatorIS8_EES6_EclEv + 40
14 libMLBackend_coreml.dylib 0x00000001002bc3a4 _ZZN2ML4Util10WorkThreadC1EvENKUlvE_clEv + 160
15 libMLBackend_coreml.dylib 0x00000001002bc244 _ZNSt3__114__thread_proxyB7v160006INS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEEZN2ML4Util10WorkThreadC1EvEUlvE_EEEEEPvSC_ + 52
16 libsystem_pthread.dylib 0x0000000197fd32e4 _pthread_start + 136
17 libsystem_pthread.dylib 0x0000000197fce0fc thread_start + 8
)
libc++abi: terminating due to uncaught exception of type NSException
Interestingly, if I don't use MLPredictionOptions to set pre-allocated output backings, then inference appears to run as expected.
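For reference, the failing pattern is presumably something like the following (a minimal sketch of my own; the model, input provider, output name, and buffer shape are placeholders, since the call site isn't shown above):
import CoreML

// Hypothetical sketch: names ("output", inputFeatures) and the backing shape are
// placeholders, not taken from the original post.
func predictWithBacking(model: MLModel, inputFeatures: MLFeatureProvider) throws -> MLFeatureProvider {
    // Pre-allocate a buffer that matches the model's output shape and dtype.
    let backing = try MLMultiArray(shape: [1, 1000], dataType: .float16)

    let options = MLPredictionOptions()
    // Setting outputBackings is the step that appears to trigger the
    // "Unrecognized ANE execution priority (null)" exception on this beta build.
    options.outputBackings = ["output": backing]

    return try model.prediction(from: inputFeatures, options: options)
}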
A similar issue seems to have been discussed and fixed here: https://developer.apple.com/forums/thread/761649; however, I'm still seeing this issue on a beta build that I downloaded today (Sept 11, 2024).
Will this be fixed? Any advice would be greatly appreciated.
Thanks
Recently, deep learning models have been getting larger, and sometimes loading them has become a bottleneck. I download a Core ML model in .mlpackage format from the internet and need to use compileModelAtURL to convert the .mlpackage into an .mlmodelc, then call modelWithContentsOfURL to turn the .mlmodelc into a handle. Generally, generating a handle with modelWithContentsOfURL is very slow. I noticed from WWDC 2023 that it is possible to cache the compiled results (see https://developer.apple.com/videos/play/wwdc2023/10049/?time=677, which states "This compilation includes further optimizations for the specific compute device and outputs an artifact that the compute device can run. Once complete, Core ML caches these artifacts to be used for subsequent model loads."). However, I couldn't find anything in the documentation about how this caching works or how to control it.
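For context, a minimal Swift sketch of the load path in question, under my own assumptions (the directory and file names are placeholders): compiling the downloaded .mlpackage once and keeping the resulting .mlmodelc in a permanent location avoids recompiling on every launch, although that is separate from the per-device specialization cache mentioned in the WWDC session.
import CoreML

// Hypothetical sketch: compile a downloaded .mlpackage once, keep the compiled
// .mlmodelc in a permanent location, and load from there on later launches.
func loadModel(packageURL: URL) async throws -> MLModel {
    let cacheDir = try FileManager.default.url(for: .applicationSupportDirectory,
                                               in: .userDomainMask,
                                               appropriateFor: nil,
                                               create: true)
    let compiledURL = cacheDir.appendingPathComponent("MyModel.mlmodelc")  // placeholder name

    if !FileManager.default.fileExists(atPath: compiledURL.path) {
        // Compilation output lands in a temporary directory; move it somewhere permanent.
        let tempCompiledURL = try await MLModel.compileModel(at: packageURL)
        try FileManager.default.moveItem(at: tempCompiledURL, to: compiledURL)
    }

    return try MLModel(contentsOf: compiledURL)
}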
We have code that crashed. The crash stack is as follows:
Thread 26 Crashed:
0 CoreFoundation 0x0000000198b0569c CFRelease + 44
1 CoreFoundation 0x0000000198b12334 __CFBasicHashRehash + 1172
2 CoreFoundation 0x0000000198b015dc __CFBasicHashAddValue + 100
3 CoreFoundation 0x0000000198b232e4 CFDictionarySetValue + 208
4 Foundation 0x00000001979b0378 _getStringAtMarker + 464
5 Foundation 0x00000001979b016c _NSXPCSerializationStringForObject + 56
6 Foundation 0x00000001979cec4c __44-[NSXPCDecoder _decodeArrayOfObjectsForKey:]_block_invoke + 52
7 Foundation 0x00000001979ceb90 _NSXPCSerializationIterateArrayObject + 208
8 Foundation 0x00000001979cda7c -[NSXPCDecoder _decodeArrayOfObjectsForKey:] + 240
9 Foundation 0x00000001979cd1bc -[NSDictionary(NSDictionary) initWithCoder:] + 176
10 Foundation 0x00000001979ae6e8 _decodeObject + 1264
11 Foundation 0x00000001979cec4c __44-[NSXPCDecoder _decodeArrayOfObjectsForKey:]_block_invoke + 52
12 Foundation 0x00000001979ceb90 _NSXPCSerializationIterateArrayObject + 208
13 Foundation 0x00000001979cda7c -[NSXPCDecoder _decodeArrayOfObjectsForKey:] + 240
14 Foundation 0x00000001979cd1a4 -[NSDictionary(NSDictionary) initWithCoder:] + 152
15 Foundation 0x00000001979ae6e8 _decodeObject + 1264
16 Foundation 0x00000001979ad030 -[NSXPCDecoder _decodeObjectOfClasses:atObject:] + 148
17 Foundation 0x0000000197a0a7f0 _NSXPCSerializationDecodeTypedObjCValuesFromArray + 892
18 Foundation 0x0000000197a0a1f8 _NSXPCSerializationDecodeInvocationArgumentArray + 412
19 Foundation 0x0000000197a0866c -[NSXPCDecoder __decodeXPCObject:allowingSimpleMessageSend:outInvocation:outArguments:outArgumentsMaxCount:outMethodSignature:outSelector:isReply:replySelector:] + 700
20 Foundation 0x0000000197a61078 -[NSXPCDecoder _decodeReplyFromXPCObject:forSelector:] + 76
21 Foundation 0x0000000197a5f690 -[NSXPCConnection _decodeAndInvokeReplyBlockWithEvent:sequence:replyInfo:] + 252
22 Foundation 0x0000000197a63664 __88-[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:]_block_invoke_5 + 188
23 Foundation 0x0000000197a08058 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] + 2244
24 CoreFoundation 0x0000000198b19d88 ___forwarding___ + 1016
25 CoreFoundation 0x0000000198b198d0 _CF_forwarding_prep_0 + 96
26 AppleNeuralEngine 0x00000001e912ab1c -[_ANEDaemonConnection loadModel:sandboxExtension:options:qos:withReply:] + 332
27 AppleNeuralEngine 0x00000001e912a674 __44-[_ANEClient doLoadModel:options:qos:error:]_block_invoke + 360
28 libdispatch.dylib 0x00000001a0a21dd4 _dispatch_client_callout + 20
29 libdispatch.dylib 0x00000001a0a312c4 _dispatch_lane_barrier_sync_invoke_and_complete + 56
30 AppleNeuralEngine 0x00000001e9129ef0 -[_ANEClient doLoadModel:options:qos:error:] + 500
31 Espresso 0x00000001a7e02034 Espresso::ANERuntimeEngine::compiler::build_segment(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, Espresso::net_compiler_segment_based::segment_t const&) + 3736
32 Espresso 0x00000001a7e010cc Espresso::net_compiler_segment_based::build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 384
33 Espresso 0x00000001a7df02a4 Espresso::ANERuntimeEngine::compiler::build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 120
34 Espresso 0x00000001a7e1b3a4 Espresso::net::__build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 360
35 Espresso 0x00000001a7e178e0 Espresso::abstract_context::compute_batch_sync(void (std::__1::shared_ptr<Espresso::abstract_batch> const&) block_pointer) + 112
36 Espresso 0x00000001a7e198b8 EspressoLight::espresso_plan::prepare_compiler_if_needed() + 3208
37 Espresso 0x00000001a7e183f4 EspressoLight::espresso_plan::prepare() + 1712
38 Espresso 0x00000001a7da8e78 espresso_plan_build_with_options + 300
39 Espresso 0x00000001a7da8d30 espresso_plan_build + 44
40 CoreML 0x00000001b346645c -[MLNeuralNetworkEngine rebuildPlan:error:] + 536
41 CoreML 0x00000001b3464294 -[MLNeuralNetworkEngine _setupContextAndPlanWithConfiguration:usingCPU:reshapeWithContainer:error:] + 3132
42 CoreML 0x00000001b34797a0 -[MLNeuralNetworkEngine initWithContainer:configuration:error:] + 196
43 CoreML 0x00000001b347962c +[MLNeuralNetworkEngine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 164
44 CoreML 0x00000001b34792a0 +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 144
45 CoreML 0x00000001b3478c64 +[MLLoader _loadModelFromArchive:configuration:modelVersion:compilerVersion:loaderEvent:useUpdatableModelLoaders:loadingClasses:error:] + 532
46 CoreML 0x00000001b34650c8 +[MLLoader _loadWithModelLoaderFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 424
47 CoreML 0x00000001b3474bc8 +[MLLoader _loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 460
48 CoreML 0x00000001b347a024 +[MLLoader _loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 244
49 CoreML 0x00000001b3479cbc +[MLLoader loadModelFromAssetAtURL:configuration:error:] + 104
50 CoreML 0x00000001b347ac2c -[MLModelAsset load:] + 564
51 CoreML 0x00000001b347a9c4 -[MLModelAsset modelWithError:] + 24
52 CoreML 0x00000001b347a7b4 +[MLModel modelWithContentsOfURL:configuration:error:] + 172
53 CoreML 0x00000001b37afbc4 +[MLModel modelWithContentsOfURL:error:] + 76
Core code
MLModel *model = nil;
NSError *error = nil;
@try
{
    model = [MLModel modelWithContentsOfURL:modelURL error:&error];
}
@catch (NSException *exception)
{
    model = nil;
    return Ret_OperationErr_InvalidInit;
}
Two questions:
What does this stack mean?
I added @try/@catch, so why does it still crash?
Hello everyone,
I have a PyTorch model that outputs an image. I converted this model to CoreML using coremltools, and the resulting CoreML model can be used in my iOS project to perform inference using the MLModel's prediction function, which returns a result of type CVPixelBuffer.
I want to avoid allocating memory every time I call the prediction function. Instead, I would like to use a pre-allocated buffer. I noticed that MLModel provides an overloaded prediction function that accepts an MLPredictionOptions object. This object has an outputBackings member, which allows me to pass a pre-allocated CVPixelBuffer.
However, when I attempt to do this, I encounter the following error:
Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported.
Could someone point out what I might be doing wrong? How can I make MLModel use my pre-allocated CVPixelBuffer instead of creating a new one each time?
Here is the Python code I used to convert the PyTorch model to CoreML, where I specified the color_layout as coremltools.colorlayout.BGR:
import torch
import coremltools as ct

def export_ml(model, resolution="640x360"):
    ml_path = "model.mlpackage"
    print("exporting ml model")
    width, height = map(int, resolution.split('x'))
    img0 = torch.randn(1, 3, height, width)
    img1 = torch.randn(1, 3, height, width)
    traced_model = torch.jit.trace(model, (img0, img1))
    input_shape = ct.Shape(shape=(1, 3, height, width))
    # Note: input_type_img0 / input_type_img1 are not defined in this snippet;
    # presumably they are ct.ImageType inputs built from input_shape.
    output_type_img = ct.ImageType(name="out", scale=1.0, bias=[0, 0, 0], color_layout=ct.colorlayout.BGR)
    ml_model = ct.convert(
        traced_model,
        inputs=[input_type_img0, input_type_img1],
        outputs=[output_type_img]
    )
    ml_model.save(ml_path)
Here is the Swift code in my iOS project that calls the MLModel's prediction function:
func prediction(image1: CVPixelBuffer, image2: CVPixelBuffer, model: MLModel) -> CVPixelBuffer? {
    let options = MLPredictionOptions()
    guard let outputBuffer = outputBacking else {
        fatalError("Failed to create CVPixelBuffer.")
    }
    options.outputBackings = ["out": outputBuffer]

    // Perform the prediction
    guard let prediction = try? model.prediction(from: RifeInput(img0: image1, img1: image2), options: options) else {
        Log.i("Failed to perform prediction")
        return nil
    }

    // Extract the result
    guard let cvPixelBuffer = prediction.featureValue(for: "out")?.imageBufferValue else {
        Log.i("Failed to get results from the model")
        return nil
    }
    return cvPixelBuffer
}
Here is the code I used to create the outputBacking:
let attributes: [String: Any] = [
    kCVPixelBufferCGImageCompatibilityKey as String: true,
    kCVPixelBufferCGBitmapContextCompatibilityKey as String: true,
    kCVPixelBufferWidthKey as String: Int(640),
    kCVPixelBufferHeightKey as String: Int(360),
    kCVPixelBufferIOSurfacePropertiesKey as String: [:]
]

let status = CVPixelBufferCreate(kCFAllocatorDefault, 640, 360, kCVPixelFormatType_32BGRA, attributes as CFDictionary, &outputBacking)
guard let outputBuffer = outputBacking else {
    fatalError("Failed to create CVPixelBuffer.")
}
Any help or guidance would be greatly appreciated!
Thank you!
We have an application that receives a message (through MQTT) from an external system to snap a photo, runs a Core ML Vision request on the image, and then sends the results back. The customer has hundreds of devices, and recently on a couple of them (iPhone 13 Pros) the customer encountered an issue in which the devices were not responding in time. There was no crash; just some individual inferences were slowed down. Each device performs thousands of requests per day. Looking at the device logs before and after the request, I noticed that the system loads the following:
default 2024-09-04 13:18:31.310401 -0400 ProcessName Processing image for reference: ***
default 2024-09-04 13:18:31.403606 -0400 ProcessName Found matching service: H1xANELoadBalancer
default 2024-09-04 13:18:31.403646 -0400 ProcessName Found matching service: H11ANEIn
default 2024-09-04 13:18:31.403661 -0400 ProcessName Found ANE device :1
default 2024-09-04 13:18:31.403681 -0400 ProcessName Total num of devices 1
default 2024-09-04 13:18:31.403681 -0400 ProcessName (Single-ANE System) Opening H11ANE device at index 0
default 2024-09-04 13:18:31.403681 -0400 ProcessName H11ANEDevice::H11ANEDeviceOpen, usage type: 1
In a good scenario (above), these actions are performed very quickly (in a split second). The app doesn't do anything else until the Core ML inference result is returned. In the bad scenario (below), there is a delay of about 4 seconds between the app handing control to the Vision request and getting the response back (leading to timeouts for the customer):
default 2024-09-04 13:19:08.777468 -0400 ProcessName Processing image for reference: ZZZ
default 2024-09-04 13:19:12.199758 -0400 ProcessName Found matching service: H1xANELoadBalancer
default 2024-09-04 13:19:12.199800 -0400 ProcessName Found matching service: H11ANEIn
default 2024-09-04 13:19:12.199812 -0400 ProcessName Found ANE device :1
default 2024-09-04 13:19:12.199832 -0400 ProcessName Total num of devices 1
default 2024-09-04 13:19:12.199834 -0400 ProcessName (Single-ANE System) Opening H11ANE device at index 0
default 2024-09-04 13:19:12.199834 -0400 ProcessName H11ANEDevice::H11ANEDeviceOpen, usage type: 1
The logs are in order; I haven't removed anything. The code is fairly simple: it just runs a Vision request without doing much else. Has anyone encountered this before?
When I use CoreML to infer a w8a8 model on iPhone 15 (iOS 18 beta 8), the model uses CPU inference instead of ANE, which results in slower inference speed. The model I am using is from the coremltools documentation, which indicates that on iOS 17, quantized models can run on ANE properly and achieve faster speeds. How can I make the quantized model run correctly on ANE to achieve the desired inference speed?
To reproduce this issue, you can download the Weight & Activation quantized model from the following link: https://apple.github.io/coremltools/docs-guides/source/opt-quantization-perf.html.
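For reference, a minimal sketch of explicitly requesting the Neural Engine when loading the model (my own illustration, not code from the linked documentation; the function and model URL are placeholders):
import CoreML

// Hypothetical sketch: restrict execution to CPU + Neural Engine so a silent
// fallback to CPU-only inference is easier to spot while profiling.
func loadQuantizedModel(at url: URL) throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine
    return try MLModel(contentsOf: url, configuration: config)
}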
I was watching the WWDC 2024 session "Deploy machine learning and AI models on-device with Core ML" (https://developer.apple.com/videos/play/wwdc2024/10161/), and the speaker showed a UI where he was running on-device LLMs / foundation models. I was wondering whether this UI is open source, so I could download it and play around with an app similar to what was shown.
com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/anodv4_drop6_fp16.H14G.espresso.hwx
This code raises the error:
func imageToHeadBox(image: CVPixelBuffer) async throws -> [CGRect] {
    let request: DetectFaceRectanglesRequest = DetectFaceRectanglesRequest()
    let faceResult: [FaceObservation] = try await request.perform(on: image)
    let faceBoxs: [CGRect] = faceResult.map { face in
        let faceBoundingBox: CGRect = face.boundingBox.cgRect
        return faceBoundingBox
    }
    return faceBoxs
}
func testMLTensor() {
    let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self)
    let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self)
    for _ in 0...50 {
        let t = Date()
        let x = (t1 * t2)
        print("MLTensor", t.timeIntervalSinceNow * 1000, "ms")
    }
}

testMLTensor()
The above code took more time than expected, especially in the early stage of iteration.
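A side note on the measurement itself (my own sketch, not from the original post): MLTensor operations are evaluated asynchronously, so timing only the multiplication expression may largely measure dispatch and one-time graph setup. Awaiting the materialized result, and ignoring the first few warm-up iterations, gives a clearer picture of the steady-state cost:
import CoreML
import Foundation

// Hypothetical sketch: await the materialized result so the timing covers the
// actual computation, not just dispatching the MLTensor operation.
func timeMLTensorMatmul() async {
    let t1 = MLTensor(shape: [2000, 1], scalars: (0..<2000).map { _ in Float.random(in: 0...10) }, scalarType: Float.self)
    let t2 = MLTensor(shape: [1, 3000], scalars: (0..<3000).map { _ in Float.random(in: 0...10) }, scalarType: Float.self)

    for i in 0...50 {
        let start = Date()
        let product = t1 * t2
        // Materialize the result; without this, the work may not have completed yet.
        _ = await product.shapedArray(of: Float.self)
        let elapsedMs = -start.timeIntervalSinceNow * 1000
        print("iteration \(i): \(elapsedMs) ms")
    }
}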
Hello,
My app works well on iOS 17 and previous iOS 18 beta versions, but it crashes on the latest iOS 18 Beta 5 when calling the model's predictionFromFeatures.
The call stack of the crash is as follows:
*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Unrecognized ANE execution priority MLANEExecutionPriority_Unspecified'
Last Exception Backtrace:
0 CoreFoundation 0x000000019bd6408c __exceptionPreprocess + 164
1 libobjc.A.dylib 0x000000019906b2e4 objc_exception_throw + 88
2 CoreFoundation 0x000000019be5f648 -[NSException initWithCoder:]
3 CoreML 0x00000001b7507340 -[MLE5ExecutionStream _setANEExecutionPriorityWithOptions:] + 248
4 CoreML 0x00000001b7508374 -[MLE5ExecutionStream _prepareForInputFeatures:options:error:] + 248
5 CoreML 0x00000001b7507ddc -[MLE5ExecutionStream executeForInputFeatures:options:error:] + 68
6 CoreML 0x00000001b74ce5c4 -[MLE5Engine _predictionFromFeatures:stream:options:error:] + 80
7 CoreML 0x00000001b74ce7fc -[MLE5Engine _predictionFromFeatures:options:error:] + 208
8 CoreML 0x00000001b74cf110 -[MLE5Engine _predictionFromFeatures:usingState:options:error:] + 400
9 CoreML 0x00000001b74cf270 -[MLE5Engine predictionFromFeatures:options:error:] + 96
10 CoreML 0x00000001b74ab264 -[MLDelegateModel _predictionFromFeatures:usingState:options:error:] + 684
11 CoreML 0x00000001b70991bc -[MLDelegateModel predictionFromFeatures:options:error:] + 124
My model is an .mlpackage file. The source code is as below:
// model
MLModel *_model;
......
// model init
MLModelConfiguration *config = [[MLModelConfiguration alloc] init];
config.computeUnits = MLComputeUnitsCPUAndNeuralEngine;
_model = [MLModel modelWithContentsOfURL:compileUrl configuration:config error:&error];
.....
// model prediction
MLPredictionOptions *option = [[MLPredictionOptions alloc] init];
id<MLFeatureProvider> outFeatures = [_model predictionFromFeatures:_modelInput options:option error:&error];
Is there anything wrong? Any advice would be appreciated.
Hi, I have been noticing some strange issues with using Core ML models in my app. I am using the whisper.cpp implementation, which has a Core ML option. This speeds up transcribing compared with Metal.
However, every time I use it, the app size in iPhone Settings -> General -> Storage increases, specifically the "Documents and Data" part, while the bundle size stays constant. The size of the app seems to grow by the size of the Core ML model, and after a few reloads it can increase to over 3-4 GB!
I thought that maybe the Core ML model (which is in the bundle) is being saved to a file, but I can't see where. I have tried Instruments and Xcode, plus lots of printing out of the cache and temp directories, deleting the caches, etc., but with no effect.
I have downloaded the app container from the iPhone via Xcode and inspected it. There are some files stored in the cache, but only a few KB, and even though the value in Settings -> Storage shows a few GB, the container is only a few MB.
Can someone please help or give me some guidance on how to figure out why "Documents and Data" keeps increasing? Where could this storage be that is not in the container downloaded from Xcode?
This is the repo I am using: https://github.com/ggerganov/whisper.cpp. Both the SwiftUI app and the Objective-C app exhibit the same behavior when using Core ML.
Thanks in advance for any help; I am totally baffled by this behaviour.
I'm trying to cast the error thrown by TranslationSession.translations(from:) as Translation.TranslationError. However, the app crashes at runtime whenever Translation.TranslationError is used in the project.
Environment:
iOS Version: 18.1 beta
Xcode Version: 16 beta
dyld[14615]: Symbol not found: _$s11Translation0A5ErrorVMa
Referenced from: <3426152D-A738-30C1-8F06-47D2C6A1B75B> /private/var/containers/Bundle/Application/043A25BC-E53E-4B28-B71A-C21F77C0D76D/TranslationAPI.app/TranslationAPI.debug.dylib
Expected in: /System/Library/Frameworks/Translation.framework/Translation
I wanted to deploy some ViT models on an iPhone. I referred to https://machinelearning.apple.com/research/vision-transformers for deployment and wrote a simple demo based on the code from https://github.com/apple/ml-vision-transformers-ane. However, I found that the uncached load time on the phone is very long. According to the blog, the input is already aligned to 64 bytes, but the speed is still very slow. Is there any way to speed it up? This is my test case:
import torch
import coremltools as ct
import math
from torch import nn


class SelfAttn(torch.nn.Module):
    def __init__(self, window_size, num_heads, dim, dim_out):
        super().__init__()
        self.window_size = window_size
        self.num_heads = num_heads
        self.dim = dim
        self.dim_out = dim_out
        self.q_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )
        self.k_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )
        self.v_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )

    def forward(self, x):
        B, HW, C = x.shape
        image_shape = (B, C, self.window_size, self.window_size)
        x_2d = x.permute((0, 2, 1)).reshape(image_shape)  # BCHW
        x_flat = torch.unsqueeze(x.permute((0, 2, 1)), 2)  # BC1L
        q, k, v_2d = self.q_proj(x_flat), self.k_proj(x_flat), self.v_proj(x_2d)
        mh_q = torch.split(q, self.dim_out // self.num_heads, dim=1)  # BC1L
        mh_v = torch.split(
            v_2d.reshape(B, -1, x_flat.shape[2], x_flat.shape[3]), self.dim_out // self.num_heads, dim=1
        )
        mh_k = torch.split(
            torch.permute(k, (0, 3, 2, 1)), self.dim_out // self.num_heads, dim=3
        )
        scale_factor = 1 / math.sqrt(mh_q[0].size(1))
        attn_weights = [
            torch.einsum("bchq, bkhc->bkhq", qi, ki) * scale_factor
            for qi, ki in zip(mh_q, mh_k)
        ]
        attn_weights = [
            torch.softmax(aw, dim=1) for aw in attn_weights
        ]  # softmax applied on channel "C"
        mh_x = [torch.einsum("bkhq,bchk->bchq", wi, vi) for wi, vi in zip(attn_weights, mh_v)]
        x = torch.cat(mh_x, dim=1)
        return x


window_size = 8
path_batch = 1024
emb_dim = 96
emb_dim_out = 96
x = torch.rand(path_batch, window_size * window_size, emb_dim)
qkv_layer = SelfAttn(window_size, 1, emb_dim, emb_dim_out)
jit = torch.jit.trace(qkv_layer, (x))

mlmod_fixed_shape = ct.convert(
    jit,
    inputs=[
        ct.TensorType("x", x.shape),
    ],
    convert_to="mlprogram",
)
mlmodel_path = "test_ane.mlpackage"
mlmod_fixed_shape.save(mlmodel_path)
The uncached load took nearly 36 seconds, and it was just a single matrix multiplication.
In the macOS 15 beta, PyTorch's grid_sample function is not executing as expected on the Apple Neural Engine on a MacBook Pro M2.
Please find below a Python code snippet that demonstrates the problem:
import torch
import coremltools as ct
import torch.nn as nn
import torch.nn.functional as F


class PytorchGridSample(torch.nn.Module):
    def __init__(self, grids):
        super(PytorchGridSample, self).__init__()
        self.upsample1 = nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1)
        self.upsample2 = nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1)
        self.upsample3 = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)
        self.upsample4 = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
        self.upsample5 = nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1)
        self.grids = grids

    def forward(self, x):
        x = self.upsample1(x)
        x = F.grid_sample(x, self.grids[0], padding_mode='reflection', align_corners=False)
        x = self.upsample2(x)
        x = F.grid_sample(x, self.grids[1], padding_mode='reflection', align_corners=False)
        x = self.upsample3(x)
        x = F.grid_sample(x, self.grids[2], padding_mode='reflection', align_corners=False)
        x = self.upsample4(x)
        x = F.grid_sample(x, self.grids[3], padding_mode='reflection', align_corners=False)
        x = self.upsample5(x)
        x = F.grid_sample(x, self.grids[4], padding_mode='reflection', align_corners=False)
        return x


def convert_to_coreml(model, input_):
    traced_model = torch.jit.trace(model, example_inputs=input_, strict=False)
    coreml_model = ct.converters.convert(
        traced_model,
        inputs=[ct.TensorType(shape=input_.shape)],
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.macOS14,
        compute_units=ct.ComputeUnit.ALL
    )
    return coreml_model


def main(pt_model, input_):
    coreml_model = convert_to_coreml(pt_model, input_)
    coreml_model.save("grid_sample.mlpackage")


if __name__ == "__main__":
    input_tensor = torch.randn(1, 512, 4, 4)
    grids = [torch.randn(1, 2*i, 2*i, 2) for i in [4, 8, 16, 32, 64, 128]]
    pt_model = PytorchGridSample(grids)
    main(pt_model, input_tensor)
After I upgraded to macOS 15 Beta 4 (M1, 16 GB), the sampling speed of Apple's ml-stable-diffusion was about 40% slower than on macOS 14.
And when I recompile and run with Xcode 16, the following error appears:
loc("EpicPhoto/Unet.mlmodelc/model.mil":2748:12): error: invalid axis: 4294967296, axis must be in range -|rank| <= axis < |rank|
Assertion failed: (0 && "failed to infer output types"), function _inferJITOutputTypes, file GPUBaseOps.mm, line 339.
I checked the macOS 15 release notes and saw that a problem with Core ML models running slowly was listed as fixed, but it doesn't seem to be fixed:
Fixed: Inference time for large Core ML models is slower than expected on a subset of M-series SOCs (e.g. M1, M1 max) on macOS. (129682801)