Apple Developer Forums

The CoreML MultiArray Float16 input is not supported for running on the NPU, and this issue only occurs on the iPhone 11.

Xcode Version: Version 15.2 (15C500b)
com.github.apple.coremltools.source: torch==1.12.1
com.github.apple.coremltools.version: 7.2
Compute: Mixed (Float16, Int32)
Storage: Float16

The input to the mlpackage is MultiArray (Float16 1 × 1 × 544 × 960).
The flexibility is: 1 × 1 × 544 × 960 | 1 × 1 × 384 × 640 | 1 × 1 × 736 × 1280 | 1 × 1 × 1088 × 1920

I tested this on iPhone XR, iPhone 11, iPhone 12, iPhone 13, and iPhone 14. On all devices except the iPhone 11, the model runs correctly on the NPU. However, on the iPhone 11, the model runs on the CPU instead.

Here is the CoreMLTools conversion code I used:

mlmodel = ct.convert(
    trace,
    inputs=[ct.TensorType(shape=input_shape, name="input", dtype=np.float16)],
    outputs=[ct.TensorType(name="output", dtype=np.float16, shape=output_shape)],
    convert_to='mlprogram',
    minimum_deployment_target=ct.target.iOS16
)

Machine Learning & AI Core ML iPhone iOS Core ML

3

0

500

Sep ’24

Core ML Models

I want my model's confidence scores to behave consistently across platforms. When I detect objects from the real-time camera with the ML model on Android, I get varied confidence values such as 75, 40, 30, and 95, not values confined to the 95 to 100 range. But when I use the same model on iOS, the confidence is above 95 in every case. What do you think the reason could be?

Machine Learning & AI Core ML

0

319

Sep ’24

CoreML crash on macOS 15.0 (24A335)

When I try to run basically any CoreML model using MLPredictionOptions.outputBackings, inference throws the following error:

2024-09-11 15:36:00.184740-0600 run_demo[4260:64822] [coreml] Unrecognized ANE execution priority (null)
2024-09-11 15:36:00.185380-0600 run_demo[4260:64822] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Unrecognized ANE execution priority (null)'
*** First throw call stack:
(
0 CoreFoundation 0x000000019812cec0 __exceptionPreprocess + 176
1 libobjc.A.dylib 0x0000000197c12cd8 objc_exception_throw + 88
2 CoreFoundation 0x000000019812cdb0 +[NSException exceptionWithName:reason:userInfo:] + 0
3 CoreML 0x00000001a1bf6504 _ZN12_GLOBAL__N_141espressoPlanPriorityFromPredictionOptionsEP19MLPredictionOptions + 264
4 CoreML 0x00000001a1bf68c0 -[MLNeuralNetworkEngine _matchEngineToOptions:error:] + 236
5 CoreML 0x00000001a1be254c __62-[MLNeuralNetworkEngine predictionFromFeatures:options:error:]_block_invoke + 68
6 libdispatch.dylib 0x0000000197e20658 _dispatch_client_callout + 20
7 libdispatch.dylib 0x0000000197e2fcd8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
8 CoreML 0x00000001a1be2450 -[MLNeuralNetworkEngine predictionFromFeatures:options:error:] + 304
9 CoreML 0x00000001a1c9e118 -[MLDelegateModel _predictionFromFeatures:usingState:options:error:] + 776
10 CoreML 0x00000001a1c9e4a4 -[MLDelegateModel predictionFromFeatures:options:error:] + 136
11 libMLBackend_coreml.dylib 0x00000001002f19f0 _ZN6CoreML8runModelENS_5ModelERNSt3__16vectorIPvNS1_9allocatorIS3_EEEES7_ + 904
12 libMLBackend_coreml.dylib 0x00000001002c56e8 _ZZN8ModelImp9runCoremlEPN2ML7Backend17ModelIoBindingImpEENKUlvE_clEv + 120
13 libMLBackend_coreml.dylib 0x00000001002c1e40 _ZNSt3__110__function6__funcIZN2ML4Util10WorkThread11runInThreadENS_8functionIFvvEEEEUlvE_NS_9allocatorIS8_EES6_EclEv + 40
14 libMLBackend_coreml.dylib 0x00000001002bc3a4 _ZZN2ML4Util10WorkThreadC1EvENKUlvE_clEv + 160
15 libMLBackend_coreml.dylib 0x00000001002bc244 _ZNSt3__114__thread_proxyB7v160006INS_5tupleIJNS_10unique_ptrINS_15__thread_structENS_14default_deleteIS3_EEEEZN2ML4Util10WorkThreadC1EvEUlvE_EEEEEPvSC_ + 52
16 libsystem_pthread.dylib 0x0000000197fd32e4 _pthread_start + 136
17 libsystem_pthread.dylib 0x0000000197fce0fc thread_start + 8
)
libc++abi: terminating due to uncaught exception of type NSException

Interestingly, if I don't use MLPredictionOptions to set pre-allocated output backings, then inference appears to run as expected. A similar issue seems to have been discussed and fixed here: https://developer.apple.com/forums/thread/761649. However, I'm seeing this issue on a beta build that I downloaded today (Sept 11 2024). Will this be fixed? Any advice would be greatly appreciated. Thanks

Machine Learning & AI Core ML Beta macOS Core ML

2

0

705

Sep ’24

How to speed up modelWithContentsOfURL?

Recently, deep learning models have been getting larger, and sometimes loading them has become a bottleneck. I download a Core ML model in .mlpackage format from the internet and need to use compileModelAtURL to convert the .mlpackage into an .mlmodelc, then call modelWithContentsOfURL to turn the .mlmodelc into a model handle. Generating the handle with modelWithContentsOfURL is generally very slow. I noticed from WWDC 2023 that it is possible to cache the compiled results (see https://developer.apple.com/videos/play/wwdc2023/10049/?time=677, which states "This compilation includes further optimizations for the specific compute device and outputs an artifact that the compute device can run. Once complete, Core ML caches these artifacts to be used for subsequent model loads."). However, I couldn't find anything in the documentation about how to use this caching.

Machine Learning & AI Core ML

1

0

337

Aug ’24

Crash when calling modelWithContentsOfURL on iOS 16+

We have code that crashes. The crash stack is as follows:

Thread 26 Crashed:
0 CoreFoundation 0x0000000198b0569c CFRelease + 44
1 CoreFoundation 0x0000000198b12334 __CFBasicHashRehash + 1172
2 CoreFoundation 0x0000000198b015dc __CFBasicHashAddValue + 100
3 CoreFoundation 0x0000000198b232e4 CFDictionarySetValue + 208
4 Foundation 0x00000001979b0378 _getStringAtMarker + 464
5 Foundation 0x00000001979b016c _NSXPCSerializationStringForObject + 56
6 Foundation 0x00000001979cec4c __44-[NSXPCDecoder _decodeArrayOfObjectsForKey:]_block_invoke + 52
7 Foundation 0x00000001979ceb90 _NSXPCSerializationIterateArrayObject + 208
8 Foundation 0x00000001979cda7c -[NSXPCDecoder _decodeArrayOfObjectsForKey:] + 240
9 Foundation 0x00000001979cd1bc -[NSDictionary(NSDictionary) initWithCoder:] + 176
10 Foundation 0x00000001979ae6e8 _decodeObject + 1264
11 Foundation 0x00000001979cec4c __44-[NSXPCDecoder _decodeArrayOfObjectsForKey:]_block_invoke + 52
12 Foundation 0x00000001979ceb90 _NSXPCSerializationIterateArrayObject + 208
13 Foundation 0x00000001979cda7c -[NSXPCDecoder _decodeArrayOfObjectsForKey:] + 240
14 Foundation 0x00000001979cd1a4 -[NSDictionary(NSDictionary) initWithCoder:] + 152
15 Foundation 0x00000001979ae6e8 _decodeObject + 1264
16 Foundation 0x00000001979ad030 -[NSXPCDecoder _decodeObjectOfClasses:atObject:] + 148
17 Foundation 0x0000000197a0a7f0 _NSXPCSerializationDecodeTypedObjCValuesFromArray + 892
18 Foundation 0x0000000197a0a1f8 _NSXPCSerializationDecodeInvocationArgumentArray + 412
19 Foundation 0x0000000197a0866c -[NSXPCDecoder __decodeXPCObject:allowingSimpleMessageSend:outInvocation:outArguments:outArgumentsMaxCount:outMethodSignature:outSelector:isReply:replySelector:] + 700
20 Foundation 0x0000000197a61078 -[NSXPCDecoder _decodeReplyFromXPCObject:forSelector:] + 76
21 Foundation 0x0000000197a5f690 -[NSXPCConnection _decodeAndInvokeReplyBlockWithEvent:sequence:replyInfo:] + 252
22 Foundation 0x0000000197a63664 __88-[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:]_block_invoke_5 + 188
23 Foundation 0x0000000197a08058 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] + 2244
24 CoreFoundation 0x0000000198b19d88 ___forwarding___ + 1016
25 CoreFoundation 0x0000000198b198d0 _CF_forwarding_prep_0 + 96
26 AppleNeuralEngine 0x00000001e912ab1c -[_ANEDaemonConnection loadModel:sandboxExtension:options:qos:withReply:] + 332
27 AppleNeuralEngine 0x00000001e912a674 __44-[_ANEClient doLoadModel:options:qos:error:]_block_invoke + 360
28 libdispatch.dylib 0x00000001a0a21dd4 _dispatch_client_callout + 20
29 libdispatch.dylib 0x00000001a0a312c4 _dispatch_lane_barrier_sync_invoke_and_complete + 56
30 AppleNeuralEngine 0x00000001e9129ef0 -[_ANEClient doLoadModel:options:qos:error:] + 500
31 Espresso 0x00000001a7e02034 Espresso::ANERuntimeEngine::compiler::build_segment(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, Espresso::net_compiler_segment_based::segment_t const&) + 3736
32 Espresso 0x00000001a7e010cc Espresso::net_compiler_segment_based::build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 384
33 Espresso 0x00000001a7df02a4 Espresso::ANERuntimeEngine::compiler::build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 120
34 Espresso 0x00000001a7e1b3a4 Espresso::net::__build(std::__1::shared_ptr<Espresso::abstract_batch> const&, int, int) + 360
35 Espresso 0x00000001a7e178e0 Espresso::abstract_context::compute_batch_sync(void (std::__1::shared_ptr<Espresso::abstract_batch> const&) block_pointer) + 112
36 Espresso 0x00000001a7e198b8 EspressoLight::espresso_plan::prepare_compiler_if_needed() + 3208
37 Espresso 0x00000001a7e183f4 EspressoLight::espresso_plan::prepare() + 1712
38 Espresso 0x00000001a7da8e78 espresso_plan_build_with_options + 300
39 Espresso 0x00000001a7da8d30 espresso_plan_build + 44
40 CoreML 0x00000001b346645c -[MLNeuralNetworkEngine rebuildPlan:error:] + 536
41 CoreML 0x00000001b3464294 -[MLNeuralNetworkEngine _setupContextAndPlanWithConfiguration:usingCPU:reshapeWithContainer:error:] + 3132
42 CoreML 0x00000001b34797a0 -[MLNeuralNetworkEngine initWithContainer:configuration:error:] + 196
43 CoreML 0x00000001b347962c +[MLNeuralNetworkEngine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 164
44 CoreML 0x00000001b34792a0 +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 144
45 CoreML 0x00000001b3478c64 +[MLLoader _loadModelFromArchive:configuration:modelVersion:compilerVersion:loaderEvent:useUpdatableModelLoaders:loadingClasses:error:] + 532
46 CoreML 0x00000001b34650c8 +[MLLoader _loadWithModelLoaderFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 424
47 CoreML 0x00000001b3474bc8 +[MLLoader _loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 460
48 CoreML 0x00000001b347a024 +[MLLoader _loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 244
49 CoreML 0x00000001b3479cbc +[MLLoader loadModelFromAssetAtURL:configuration:error:] + 104
50 CoreML 0x00000001b347ac2c -[MLModelAsset load:] + 564
51 CoreML 0x00000001b347a9c4 -[MLModelAsset modelWithError:] + 24
52 CoreML 0x00000001b347a7b4 +[MLModel modelWithContentsOfURL:configuration:error:] + 172
53 CoreML 0x00000001b37afbc4 +[MLModel modelWithContentsOfURL:error:] + 76

Core code:

MLModel* model = nil;
NSError *error = nil;
@try {
    model = [MLModel modelWithContentsOfURL:modelURL error:&error];
} @catch (NSException *exception) {
    model = nil;
    return Ret_OperationErr_InvalidInit;
}

Two questions:
What does this stack mean?
I added @try/@catch, so why is it still crashing?

Machine Learning & AI Core ML

1

0

393

Sep ’24

Issue with Using Pre-Allocated CVPixelBuffer for CoreML Model Prediction

Hello everyone, I have a PyTorch model that outputs an image. I converted this model to CoreML using coremltools, and the resulting CoreML model can be used in my iOS project to perform inference using the MLModel's prediction function, which returns a result of type CVPixelBuffer.

I want to avoid allocating memory every time I call the prediction function. Instead, I would like to use a pre-allocated buffer. I noticed that MLModel provides an overloaded prediction function that accepts an MLPredictionOptions object. This object has an outputBackings member, which allows me to pass a pre-allocated CVPixelBuffer. However, when I attempt to do this, I encounter the following error:

Copy from tensor to pixel buffer (pixel_format_type: BGRA, image_pixel_type: BGR8, component_dtype: INT, component_pack: FMT_32) is not supported.

Could someone point out what I might be doing wrong? How can I make MLModel use my pre-allocated CVPixelBuffer instead of creating a new one each time?

Here is the Python code I used to convert the PyTorch model to CoreML, where I specified the color_layout as coremltools.colorlayout.BGR:

def export_ml(model, resolution="640x360"):
    ml_path = f"model.mlpackage"
    print("exporting ml model")
    width, height = map(int, resolution.split('x'))
    img0 = torch.randn(1, 3, height, width)
    img1 = torch.randn(1, 3, height, width)
    traced_model = torch.jit.trace(model, (img0, img1))
    input_shape = ct.Shape(shape=(1, 3, height, width))
    output_type_img = ct.ImageType(name="out", scale=1.0, bias=[0, 0, 0], color_layout=ct.colorlayout.BGR)
    ml_model = ct.convert(
        traced_model,
        inputs=[input_type_img0, input_type_img1],
        outputs=[output_type_img]
    )
    ml_model.save(ml_path)

Here is the Swift code in my iOS project that calls the MLModel's prediction function:

func prediction(image1: CVPixelBuffer, image2: CVPixelBuffer, model: MLModel) -> CVPixelBuffer? {
    let options = MLPredictionOptions()
    guard let outputBuffer = outputBacking else {
        fatalError("Failed to create CVPixelBuffer.")
    }
    options.outputBackings = ["out": outputBuffer]

    // Perform the prediction
    guard let prediction = try? model.prediction(from: RifeInput(img0: image1, img1: image2), options: options) else {
        Log.i("Failed to perform prediction")
        return nil
    }

    // Extract the result
    guard let cvPixelBuffer = prediction.featureValue(for: "out")?.imageBufferValue else {
        Log.i("Failed to get results from the model")
        return nil
    }
    return cvPixelBuffer
}

Here is the code I used to create the outputBacking:

let attributes: [String: Any] = [
    kCVPixelBufferCGImageCompatibilityKey as String: true,
    kCVPixelBufferCGBitmapContextCompatibilityKey as String: true,
    kCVPixelBufferWidthKey as String: Int(640),
    kCVPixelBufferHeightKey as String: Int(360),
    kCVPixelBufferIOSurfacePropertiesKey as String: [:]
]
let status = CVPixelBufferCreate(kCFAllocatorDefault, 640, 360, kCVPixelFormatType_32BGRA, attributes as CFDictionary, &outputBacking)
guard let outputBuffer = outputBacking else {
    fatalError("Failed to create CVPixelBuffer.")
}

Any help or guidance would be greatly appreciated! Thank you!

Machine Learning & AI Core ML

1

0

400

Sep ’24

H1xANELoadBalancer is taking longer to load

We have an application that receives a message (through MQTT) from an external system to snap a photo, runs a CoreML vision request on the image, and then sends the results back. The customer has 100s of devices, and recently on a couple of those devices (13 Pros) the customer encountered an issue in which the devices were not responding in time. There was no crash; some individual inferences were just slowed down. Each device performs 1000s of requests per day. Looking at the device logs before and after the request, I noticed that Apple loads the following:

default 2024-09-04 13:18:31.310401 -0400 ProcessName Processing image for reference: ***
default 2024-09-04 13:18:31.403606 -0400 ProcessName Found matching service: H1xANELoadBalancer
default 2024-09-04 13:18:31.403646 -0400 ProcessName Found matching service: H11ANEIn
default 2024-09-04 13:18:31.403661 -0400 ProcessName Found ANE device :1
default 2024-09-04 13:18:31.403681 -0400 ProcessName Total num of devices 1
default 2024-09-04 13:18:31.403681 -0400 ProcessName (Single-ANE System) Opening H11ANE device at index 0
default 2024-09-04 13:18:31.403681 -0400 ProcessName H11ANEDevice::H11ANEDeviceOpen, usage type: 1

In a good scenario (above), these actions are performed very quickly (in a split second). The app doesn't do anything until the CoreML inference result is returned.

In the bad scenario (below), there is a delay of about 4 seconds between the app passing control to the vision request and getting the response back (leading to timeouts for the customer):

default 2024-09-04 13:19:08.777468 -0400 ProcessName Processing image for reference: ZZZ
default 2024-09-04 13:19:12.199758 -0400 ProcessName Found matching service: H1xANELoadBalancer
default 2024-09-04 13:19:12.199800 -0400 ProcessName Found matching service: H11ANEIn
default 2024-09-04 13:19:12.199812 -0400 ProcessName Found ANE device :1
default 2024-09-04 13:19:12.199832 -0400 ProcessName Total num of devices 1
default 2024-09-04 13:19:12.199834 -0400 ProcessName (Single-ANE System) Opening H11ANE device at index 0
default 2024-09-04 13:19:12.199834 -0400 ProcessName H11ANEDevice::H11ANEDeviceOpen, usage type: 1

The logs are in order; I haven't removed anything. The code is fairly simple: it just runs a vision request without doing much else. Has anyone encountered this before?
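The delay in the two log excerpts above can be read directly off the timestamps. The following is a minimal Python sketch, assuming only the timestamp format shown in the logs; the constant and function names are illustrative and not taken from the app. It measures the gap between the "Processing image for reference" line and the first "Found matching service" line in each case:

```python
from datetime import datetime

# Timestamps copied from the log excerpts in the post: the
# "Processing image for reference" line and the first ANE service line.
GOOD = ("2024-09-04 13:18:31.310401 -0400", "2024-09-04 13:18:31.403606 -0400")
BAD = ("2024-09-04 13:19:08.777468 -0400", "2024-09-04 13:19:12.199758 -0400")

# Assumed format of the timestamp field, matching the excerpts above.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"


def gap_seconds(start: str, end: str) -> float:
    """Return the elapsed time in seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


good_gap = gap_seconds(*GOOD)
bad_gap = gap_seconds(*BAD)
print(f"good: {good_gap:.6f} s, bad: {bad_gap:.6f} s")
# prints "good: 0.093205 s, bad: 3.422290 s"
```

On these excerpts the gap is roughly 0.09 s in the good case and roughly 3.42 s in the bad case, so most of the reported delay falls before the first ANE service line is logged.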

Machine Learning & AI Core ML

0

1

331

Sep ’24

How to Ensure Quantized Models Run on ANE on iPhone 15 (iOS 18 Beta 8)

When I use CoreML to infer a w8a8 model on iPhone 15 (iOS 18 beta 8), the model uses CPU inference instead of ANE, which results in slower inference speed. The model I am using is from the coremltools documentation, which indicates that on iOS 17, quantized models can run on ANE properly and achieve faster speeds. How can I make the quantized model run correctly on ANE to achieve the desired inference speed? To reproduce this issue, you can download the Weight & Activation quantized model from the following link: https://apple.github.io/coremltools/docs-guides/source/opt-quantization-perf.html.

Machine Learning & AI Core ML iPhone iOS

0

402

Sep ’24

UI interface for on device LLMs / Foundation models

I was watching the WWDC 2024 session "Deploy machine learning and AI models on-device with Core ML" (https://developer.apple.com/videos/play/wwdc2024/10161/), and the speaker showed a UI in which he was running on-device LLMs / foundation models. I was wondering whether this UI is open source, so that I can download it and play around with an app similar to the one shown.

Machine Learning & AI Core ML

2

1

461

Aug ’24

Vision framework not working on Apple Vision Pro

com.apple.Vision Code=9 "Could not build inference plan - ANECF error: failed to load ANE model file:///System/Library/Frameworks/Vision.framework/anodv4_drop6_fp16.H14G.espresso.hwx

The following code raises this error:

func imageToHeadBox(image: CVPixelBuffer) async throws -> [CGRect] {
    let request: DetectFaceRectanglesRequest = DetectFaceRectanglesRequest()
    let faceResult: [FaceObservation] = try await request.perform(on: image)
    let faceBoxs: [CGRect] = faceResult.map { face in
        let faceBoundingBox: CGRect = face.boundingBox.cgRect
        return faceBoundingBox
    }
    return faceBoxs
}

Machine Learning & AI Core ML Vision visionOS

1

0

535

Aug ’24

How to speed up the modelWithContentsOfURL function?

Recently, deep learning projects have been getting larger, and sometimes loading models has become a bottleneck. I download a Core ML model in .mlpackage format from the internet and need to use compileModelAtURL to convert the .mlpackage into an .mlmodelc, then call modelWithContentsOfURL to turn the .mlmodelc into a model handle. Generating the handle with modelWithContentsOfURL is generally very slow. I noticed from WWDC 2023 that it is possible to cache the compiled results (see https://developer.apple.com/videos/play/wwdc2023/10049/?time=677, which states "This compilation includes further optimizations for the specific compute device and outputs an artifact that the compute device can run. Once complete, Core ML caches these artifacts to be used for subsequent model loads."). However, I couldn't find anything in the documentation about how to use this caching.

Machine Learning & AI Core ML

1

0

439

Aug ’24

MLTensor computation took more time than expected.

func testMLTensor() {
    let t1 = MLTensor(shape: [2000, 1], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 2000), scalarType: Float.self)
    let t2 = MLTensor(shape: [1, 3000], scalars: [Float](repeating: Float.random(in: 0.0...10.0), count: 3000), scalarType: Float.self)
    for _ in 0...50 {
        let t = Date()
        let x = (t1 * t2)
        print("MLTensor", t.timeIntervalSinceNow * 1000, "ms")
    }
}

testMLTensor()

The above code took more time than expected, especially in the early iterations.
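One way to quantify the "early iterations" effect is to record each iteration's duration and compare the first few against the rest. The following is a minimal, language-neutral sketch in Python. It does not use MLTensor: `outer_product` is a stand-in workload on much smaller inputs, and the split after five iterations is an arbitrary assumption made for illustration.

```python
import time
from statistics import median


def outer_product(col, row):
    """Stand-in workload: broadcasted multiply of a column by a row."""
    return [[a * b for b in row] for a in col]


def time_iterations(fn, iterations):
    """Run fn repeatedly and return each iteration's duration in milliseconds."""
    durations_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


col = [1.5] * 200
row = [2.0] * 300

# 0...50 in the Swift loop is a closed range, so it runs 51 iterations.
durations = time_iterations(lambda: outer_product(col, row), 51)

# Assumed split: treat the first 5 iterations as warm-up.
warmup, steady = durations[:5], durations[5:]
print(f"first {len(warmup)} iterations, median: {median(warmup):.3f} ms")
print(f"remaining {len(steady)} iterations, median: {median(steady):.3f} ms")
```

Reporting the two medians separately makes it easier to tell a one-time start-up cost from a consistently slow operation. Note also that in the Swift snippet `t.timeIntervalSinceNow` is negative, because `t` lies in the past when it is read, so the printed values are negated elapsed times.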

Machine Learning & AI Core ML ML Compute Accelerate Performance Core ML

1

0

450

Aug ’24

CoreML Crash on iOS18 Beta5

Hello, my app works well on iOS 17 and on previous iOS 18 beta versions, but it crashes on the latest iOS 18 beta 5 when calling the model's predictionFromFeatures. The call stack of the crash is:

*** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: 'Unrecognized ANE execution priority MLANEExecutionPriority_Unspecified'
Last Exception Backtrace:
0 CoreFoundation 0x000000019bd6408c __exceptionPreprocess + 164
1 libobjc.A.dylib 0x000000019906b2e4 objc_exception_throw + 88
2 CoreFoundation 0x000000019be5f648 -[NSException initWithCoder:]
3 CoreML 0x00000001b7507340 -[MLE5ExecutionStream _setANEExecutionPriorityWithOptions:] + 248
4 CoreML 0x00000001b7508374 -[MLE5ExecutionStream _prepareForInputFeatures:options:error:] + 248
5 CoreML 0x00000001b7507ddc -[MLE5ExecutionStream executeForInputFeatures:options:error:] + 68
6 CoreML 0x00000001b74ce5c4 -[MLE5Engine _predictionFromFeatures:stream:options:error:] + 80
7 CoreML 0x00000001b74ce7fc -[MLE5Engine _predictionFromFeatures:options:error:] + 208
8 CoreML 0x00000001b74cf110 -[MLE5Engine _predictionFromFeatures:usingState:options:error:] + 400
9 CoreML 0x00000001b74cf270 -[MLE5Engine predictionFromFeatures:options:error:] + 96
10 CoreML 0x00000001b74ab264 -[MLDelegateModel _predictionFromFeatures:usingState:options:error:] + 684
11 CoreML 0x00000001b70991bc -[MLDelegateModel predictionFromFeatures:options:error:] + 124

My model file is an ML package file. The source code is as follows:

// model
MLModel *_model;
......
// model init
MLModelConfiguration* config = [[MLModelConfiguration alloc]init];
config.computeUnits = MLComputeUnitsCPUAndNeuralEngine;
_model = [MLModel modelWithContentsOfURL:compileUrl configuration:config error:&error];
.....
// model prediction
MLPredictionOptions *option = [[MLPredictionOptions alloc]init];
id<MLFeatureProvider> outFeatures = [_model predictionFromFeatures:_modelInput options:option error:&error];

Is there anything wrong? Any advice would be appreciated.

Machine Learning & AI Core ML Beta Debugging Machine Learning Core ML

3

1

557

Aug ’24

Loading CoreML model increases app size?

Hi, I have been noticing some strange issues when using Core ML models in my app. I am using the Whisper.cpp implementation, which has a Core ML option. This speeds up transcription compared with Metal. However, every time I use it, the app size shown in iPhone Settings -> General -> Storage increases, specifically the "Documents and Data" part; the bundle size stays consistent. The size of the app seems to increase by the size of the Core ML model, and after a few reloads it can grow to over 3-4 GB!

I thought that maybe the Core ML model (which is in the bundle) was being saved to a file, but I can't see where. I have tried Instruments and Xcode, plus lots of printing of the cache and temp directories, deleting the caches, and so on, but with no effect. I have downloaded the app container from the iPhone with Xcode and inspected it: there are some files stored in the cache, but only a few KB, and even though the value in Settings -> Storage shows a few GB, the container is only a few MB.

Can someone please help or give me some guidance on how to figure out why Documents and Data is increasing? Where could this data be stored that is not in the container downloaded with Xcode? This is the repo I am using: https://github.com/ggerganov/whisper.cpp. The SwiftUI app and the Objective-C app both show the same behaviour when using Core ML. Thanks in advance for any help; I am totally baffled by this behaviour.

Machine Learning & AI Core ML Files and Storage Xcode Machine Learning Core ML

6

3

1.1k

May ’24

iOS 18.1 beta - App crashes at runtime while using Translation.TranslationError in project

I'm trying to cast the error thrown by TranslationSession.translations(from:) as Translation.TranslationError. However, the app crashes at runtime whenever Translation.TranslationError is used in the project.

Environment:
iOS Version: 18.1 beta
Xcode Version: 16 beta

dyld[14615]: Symbol not found: _$s11Translation0A5ErrorVMa
Referenced from: <3426152D-A738-30C1-8F06-47D2C6A1B75B> /private/var/containers/Bundle/Application/043A25BC-E53E-4B28-B71A-C21F77C0D76D/TranslationAPI.app/TranslationAPI.debug.dylib
Expected in: /System/Library/Frameworks/Translation.framework/Translation

Machine Learning & AI Core ML ML Compute Natural Language Live Text Apple Intelligence

1

772

Aug ’24

How to deploy Vision Transformer with ANE to Achieve Faster Uncached Load Speed

I wanted to deploy some ViT models on an iPhone. I referred to https://machinelearning.apple.com/research/vision-transformers for deployment and wrote a simple demo based on the code from https://github.com/apple/ml-vision-transformers-ane. However, I found that the uncached load time on the phone is very long. According to the blog, the input is already aligned to 64 bytes, but the speed is still very slow. Is there any way to speed it up?

This is my test case:

import torch
import coremltools as ct
import math
from torch import nn


class SelfAttn(torch.nn.Module):
    def __init__(self, window_size, num_heads, dim, dim_out):
        super().__init__()
        self.window_size = window_size
        self.num_heads = num_heads
        self.dim = dim
        self.dim_out = dim_out
        self.q_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )
        self.k_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )
        self.v_proj = nn.Conv2d(
            in_channels=dim,
            out_channels=dim_out,
            kernel_size=1,
        )

    def forward(self, x):
        B, HW, C = x.shape
        image_shape = (B, C, self.window_size, self.window_size)
        x_2d = x.permute((0, 2, 1)).reshape(image_shape)  # BCHW
        x_flat = torch.unsqueeze(x.permute((0, 2, 1)), 2)  # BC1L
        q, k, v_2d = self.q_proj(x_flat), self.k_proj(x_flat), self.v_proj(x_2d)
        mh_q = torch.split(q, self.dim_out // self.num_heads, dim=1)  # BC1L
        mh_v = torch.split(
            v_2d.reshape(B, -1, x_flat.shape[2], x_flat.shape[3]),
            self.dim_out // self.num_heads,
            dim=1
        )
        mh_k = torch.split(
            torch.permute(k, (0, 3, 2, 1)),
            self.dim_out // self.num_heads,
            dim=3
        )
        scale_factor = 1 / math.sqrt(mh_q[0].size(1))
        attn_weights = [
            torch.einsum("bchq, bkhc->bkhq", qi, ki) * scale_factor
            for qi, ki in zip(mh_q, mh_k)
        ]
        attn_weights = [
            torch.softmax(aw, dim=1) for aw in attn_weights
        ]  # softmax applied on channel "C"
        mh_x = [torch.einsum("bkhq,bchk->bchq", wi, vi) for wi, vi in zip(attn_weights, mh_v)]
        x = torch.cat(mh_x, dim=1)
        return x


window_size = 8
path_batch = 1024
emb_dim = 96
emb_dim_out = 96
x = torch.rand(path_batch, window_size * window_size, emb_dim)
qkv_layer = SelfAttn(window_size, 1, emb_dim, emb_dim_out)
jit = torch.jit.trace(qkv_layer, (x))
mlmod_fixed_shape = ct.convert(
    jit,
    inputs=[
        ct.TensorType("x", x.shape),
    ],
    convert_to="mlprogram",
)
mlmodel_path = "test_ane.mlpackage"
mlmod_fixed_shape.save(mlmodel_path)

The uncached load took nearly 36 seconds, and it was just a single matrix multiplication.
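For a sense of scale, the tensor shapes and multiply-accumulate counts implied by the test case can be worked out by hand from the constants in the post. The following is a back-of-the-envelope sketch in plain Python, with no PyTorch or coremltools needed. The dictionary keys and the MAC bookkeeping are illustrative assumptions; the shapes follow the permutes, 1x1 convolutions and einsum strings in the code above, and bias terms are ignored.

```python
# Constants copied from the test case in the post.
window_size, path_batch, emb_dim, emb_dim_out, num_heads = 8, 1024, 96, 96, 1

B = path_batch                   # batch of windows
L = window_size * window_size    # tokens per window
C = emb_dim                      # input channels
head_dim = emb_dim_out // num_heads

# Shapes traced through forward(); names mirror the variables in the post.
shapes = {
    "x": (B, L, C),
    "x_flat": (B, C, 1, L),                # after permute + unsqueeze
    "q": (B, emb_dim_out, 1, L),           # 1x1 conv keeps the spatial dims
    "k_permuted": (B, L, 1, emb_dim_out),  # after torch.permute(k, (0, 3, 2, 1))
    "attn_weights": (B, L, 1, L),          # einsum "bchq, bkhc->bkhq"
    "output": (B, emb_dim_out, 1, L),      # einsum "bkhq,bchk->bchq", then cat
}

# Multiply-accumulates: three 1x1 projections plus the two einsums.
projection_macs = 3 * B * L * C * emb_dim_out
attention_macs = 2 * B * num_heads * L * L * head_dim

for name, shape in shapes.items():
    print(f"{name:>13}: {shape}")
print(f"projection MACs: {projection_macs:,}")
print(f"attention MACs:  {attention_macs:,}")
```

Under these assumptions the module contains three 1x1 convolutions and two batched einsum contractions per call, rather than one matrix multiplication, and each is applied across 1024 windows.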

Machine Learning & AI Core ML

0

1

369

Aug ’24

Bug Report: macOS 15 Beta - PyTorch gridsample Not Utilising Apple Neural Engine on MacBook Pro M2

In macOS 15 beta the gridsample function from PyTorch is not executing as expected on the Apple Neural Engine in MacBook Pro M2. Please find below a Python code snippet that demonstrates the problem:

import coremltools as ct
import torch
import torch.nn as nn
import torch.nn.functional as F


class PytorchGridSample(torch.nn.Module):
    def __init__(self, grids):
        super(PytorchGridSample, self).__init__()
        self.upsample1 = nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1)
        self.upsample2 = nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1)
        self.upsample3 = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)
        self.upsample4 = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
        self.upsample5 = nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1)
        self.grids = grids

    def forward(self, x):
        x = self.upsample1(x)
        x = F.grid_sample(x, self.grids[0], padding_mode='reflection', align_corners=False)
        x = self.upsample2(x)
        x = F.grid_sample(x, self.grids[1], padding_mode='reflection', align_corners=False)
        x = self.upsample3(x)
        x = F.grid_sample(x, self.grids[2], padding_mode='reflection', align_corners=False)
        x = self.upsample4(x)
        x = F.grid_sample(x, self.grids[3], padding_mode='reflection', align_corners=False)
        x = self.upsample5(x)
        x = F.grid_sample(x, self.grids[4], padding_mode='reflection', align_corners=False)
        return x


def convert_to_coreml(model, input_):
    traced_model = torch.jit.trace(model, example_inputs=input_, strict=False)
    coreml_model = ct.converters.convert(
        traced_model,
        inputs=[ct.TensorType(shape=input_.shape)],
        compute_precision=ct.precision.FLOAT16,
        minimum_deployment_target=ct.target.macOS14,
        compute_units=ct.ComputeUnit.ALL
    )
    return coreml_model


def main(pt_model, input_):
    coreml_model = convert_to_coreml(pt_model, input_)
    coreml_model.save("grid_sample.mlpackage")


if __name__ == "__main__":
    input_tensor = torch.randn(1, 512, 4, 4)
    grids = [torch.randn(1, 2*i, 2*i, 2) for i in [4, 8, 16, 32, 64, 128]]
    pt_model = PytorchGridSample(grids)
    main(pt_model, input_tensor)

Machine Learning & AI Core ML

0

327

Aug ’24

After upgrading to macOS 15, Core ML models are slower

After I upgraded to macOS 15 Beta 4 (M1, 16 GB), the sampling speed of Apple's ml-stable-diffusion was about 40% slower than on macOS 14. When I recompile and run with Xcode 16, the following error appears:

loc("EpicPhoto/Unet.mlmodelc/model.mil":2748:12): error: invalid axis: 4294967296, axis must be in range -|rank| <= axis < |rank|
Assertion failed: (0 && "failed to infer output types"), function _inferJITOutputTypes, file GPUBaseOps.mm, line 339.

I checked the macOS 15 release notes and saw that the problem of Core ML models running slowly was listed as fixed, but it does not seem to be fixed for me:

Fixed: Inference time for large Core ML models is slower than expected on a subset of M-series SOCs (e.g. M1, M1 max) on macOS. (129682801)

Machine Learning & AI Core ML

2

0

403

Aug ’24

Core ML

Post

Replies

Boosts

Views

Activity