Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Core Spotlight Semantic Search - still non-functional for 1+ year after WWDC24?
After more than a year since the announcement, I'm still unable to get this feature working properly and wondering if there are known issues or missing implementation details.
Current Setup:
Device: iPhone 16 Pro Max
iOS: 26 beta 3
Development: Tested on both Xcode 16 and Xcode 26
Implementation: Following the official documentation examples
The Problem: Semantic search simply doesn't work. Lexical search functions normally, but enabling semantic search produces identical results to having it disabled. It's as if the feature isn't actually processing.
Error Output (Xcode 26):
[QPNLU][qid=5] Error Domain=com.apple.SpotlightEmbedding.EmbeddingModelError Code=-8007 "Text embedding generation timeout (timeout=100ms)"
[CSUserQuery][qid=5] got a nil / empty embedding data dictionary
[CSUserQuery][qid=5] semanticQuery failed to generate, using "(false)"
In Xcode 16, there are no error messages at all - the semantic search just silently fails.
Missing Resources: The sample application mentioned during the WWDC24 presentation doesn't appear to have been released, which makes it difficult to verify if my implementation is correct.
Would really appreciate any guidance or clarification on the current status of this feature. Has anyone in the community successfully implemented this?
Replies: 5 · Boosts: 5 · Views: 2.0k · Activity: 1d
Is anyone working on jax-metal?
Hi, I think many of us would love to be able to use our GPUs for JAX on the new Apple Silicon devices, but currently the jax-metal plugin is, for all intents and purposes, broken. Is it still under active development? Is there a planned release for a new version? Thanks!
Replies: 5 · Boosts: 8 · Views: 1.4k · Activity: 1d
AI framework usage without user session
We are evaluating various AI frameworks to use within our code, and are hoping to use some of the built-in frameworks in macOS, including CoreML and Vision. However, we need to use these frameworks in a background process (system extension) that has no user session attached to it. (To be pedantic, we'll be using an XPC service that is spawned by the system extension, but neither would have an associated user session.) Saying the daemon-safe frameworks list has not been updated in a while is an understatement, but it's all we have to go on. CoreGraphics isn't even listed; back then it was part of ApplicationServices (I think?), and ApplicationServices is a no-go. Vision does use CoreGraphics symbols and data types, so I have doubts. We do have a POC that uses both frameworks and they seem to function fine, but obviously having something official is better. Any Apple engineers that can comment on this?
Replies: 3 · Boosts: 0 · Views: 571 · Activity: 2d
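For reference, a minimal Swift sketch of the kind of headless Vision call discussed in that thread, runnable from a command-line tool or XPC service. The image URL and the choice of VNClassifyImageRequest are illustrative assumptions; nothing here is confirmed daemon-safe by Apple, it is simply the shape of the POC being described.

import Foundation
import Vision

// Sketch: classify an image without any UI or user session involved.
func classifyImage(at url: URL) throws -> [(identifier: String, confidence: Float)] {
    let request = VNClassifyImageRequest()
    let handler = VNImageRequestHandler(url: url, options: [:])
    try handler.perform([request])                      // synchronous, throws on failure
    let observations = request.results ?? []            // [VNClassificationObservation]
    return observations.map { ($0.identifier, $0.confidence) }
}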
Does using Vision API offline to label a custom dataset for Core ML training violate DPLA?
Hello everyone, I am currently developing a smart camera app for iOS that recommends optimal zoom and exposure values on-device using a custom Core ML model. I am still waiting for an official response from Apple Support, but I wanted to ask the community if anyone has experience with a similar workflow regarding App Review and the DPLA. Here is my training methodology: I gathered my own proprietary dataset of original landscape photos. I generated multiple variants of these photos with different zoom and exposure settings offline on my Mac. I used the CalculateImageAestheticsScoresRequest (Vision framework) via a local macOS command-line tool to evaluate and score each variant. Based on those scores, I labeled the "best" zoom and exposure parameters for each original photo. I used this labeled dataset to train my own independent neural network using PyTorch, and then converted it to a Core ML model to ship inside my app. Since the app uses my own custom model on-device and does not send any user data to a server, the privacy aspect is clear. However, I am curious if using the output of Apple's Vision API strictly offline to label my own dataset could be interpreted as "reverse engineering" or a violation of the Developer Program License Agreement (DPLA). Has anyone successfully shipped an app using a similar knowledge distillation or automated dataset labeling approach with Apple's APIs? Did you face any pushback during App Review? Any insights or shared experiences would be greatly appreciated!
Replies: 1 · Boosts: 0 · Views: 158 · Activity: 2d
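For context, a hedged sketch of the offline labeling step described above, assuming the Swift Vision API names CalculateImageAestheticsScoresRequest and ImageAestheticsScoresObservation from the newer Vision framework; the Variant type and the "highest score wins" rule are illustrative assumptions about the poster's tooling, not a statement about DPLA compliance.

import Foundation
import Vision

// Hypothetical variant of one original photo, rendered with specific settings.
struct Variant {
    let url: URL
    let zoom: Double
    let exposure: Double
}

// Score each generated variant offline and keep the parameters of the best one.
func bestVariant(among variants: [Variant]) async throws -> Variant? {
    var best: (variant: Variant, score: Float)? = nil
    for variant in variants {
        let request = CalculateImageAestheticsScoresRequest()
        let observation = try await request.perform(on: variant.url)
        if best == nil || observation.overallScore > best!.score {
            best = (variant, observation.overallScore)
        }
    }
    return best?.variant
}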
How is AI useful?
I want to introduce how AI is useful.
Replies: 0 · Boosts: 0 · Views: 43 · Activity: 3d
After loading my custom model - unsupportedTokenizer error
In October 2025, using mlx_lm.lora, I created an adapter and a fused model and uploaded it to Hugging Face. I was able to incorporate this model into my SwiftUI app using the mlx package (MLX libraries 2.25.8). My base LLM was mlx-community/Mistral-7B-Instruct-v0.3-4bit. Looking at LLMModelFactory.swift in the current version, 2.29.1, the only changes are the addition of a few models. The earlier model was called pharmpk/pk-mistral-7b-v0.3-4bit. The new model is called pharmpk/pk-mistral-2026-03-29. The base model (mlx-community/Mistral-7B-Instruct-v0.3-4bit) must still be available. Could the error 'unsupportedTokenizer' be related to changes in the mlx package? I noticed mention of splitting the package into two parts but don't see anything on GitHub. Feeling rather lost. Does anyone have any thoughts and/or suggestions? Thanks, David
Replies: 3 · Boosts: 0 · Views: 296 · Activity: 3d
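For context, a minimal load path with recent mlx-swift-lm, assuming the ModelConfiguration / LLMModelFactory API surface from the mlx-swift-examples packages; the module names and exact signature are from memory and may differ between the split packages the post mentions, so treat this as a sketch rather than a confirmed fix.

import MLXLLM
import MLXLMCommon

// Sketch: load a fused Hugging Face model by repo id. If the tokenizer class named
// in the repo's tokenizer_config.json isn't one the Swift tokenizer layer recognizes,
// loading is expected to fail with an unsupportedTokenizer-style error.
func loadFusedModel() async throws -> ModelContainer {
    let configuration = ModelConfiguration(id: "pharmpk/pk-mistral-2026-03-29")
    return try await LLMModelFactory.shared.loadContainer(configuration: configuration)
}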
26.4 Foundation Model rejects most topics
I have an iOS app, "Spatial Agents", which ran great in 26.3. It creates dashboards around a topic. It can also decompose a topic into sub-topics and explore those, all based on web articles and web article headlines. In iOS 26.4 almost every topic - even "MIT Innovation" - is rejected with an apology of "I apologize, I can not fulfill this request". I've tried softening all my prompts, and I can get only really benign, very simple topics to respond, but not anything of any significance. It ran great on lots of topics in 26.3. My published app is now useless, and all my users are unhappy. HELP!
Replies: 3 · Boosts: 0 · Views: 326 · Activity: 3d
MPS SDPA Attention Kernel Regression on A14-class (M1) in macOS 26.3.1 — Works on A15+ (M2+)
Summary: Since macOS 26, our Core ML / MPS inference pipeline produces incorrect results on Mac mini M1 (Macmini9,1, A14-class SoC). The same model and code run correctly on M2 and newer (A15-class and up). The regression appears to be in the Scaled Dot-Product Attention (SDPA) kernel path in the MPS backend.
Environment:
Affected: Mac mini M1 (Macmini9,1, A14-class)
Not affected: M2 and newer (A15-class and up)
Last known good: macOS Sequoia
First broken: macOS 26 (Tahoe)?
Confirmed broken on: macOS 26.3.1
Framework: Core ML + MPS backend
Language: C++ (via the CoreML C++ API)
Description: We ship an audio processing application (VoiceAssist by NoiseWorks) that runs a deep learning model (based on the Demucs architecture) via Core ML with the MPS compute unit. On macOS Sequoia this works correctly on all Apple Silicon Macs, including M1. After updating to macOS 26 (Tahoe), inference on M1 Macs fails, either producing garbage output or crashing. The same binary, same .mlpackage, and same inputs work correctly on M2+. Our Apple contact has suggested the root cause is a regression in the A14-specific MPS SDPA attention kernel, which may have broken when the Metal/MPS stack was updated in macOS 26. The model makes heavy use of attention layers, and the failure correlates precisely with the SDPA path being exercised on A14 hardware.
Steps to Reproduce:
1. Load a Core ML model that uses Scaled Dot-Product Attention (e.g. a transformer or attention-based audio model)
2. Run inference with MLComputeUnits::cpuAndGPU (MPS active)
3. Run on Mac mini M1 (Macmini9,1) with macOS 26.3.1
4. Compare output to the same model running on M2 / macOS Sequoia
Expected: Correct inference output, consistent with M2+ and macOS Sequoia behavior
Actual: Incorrect / corrupted output (or crash), only on A14-class hardware running macOS 26+
Workaround: Forcing MLComputeUnits::cpuOnly bypasses MPS entirely and produces correct output on M1, confirming the issue is in the MPS compute path. This is not acceptable as a shipping workaround due to the performance impact.
Additional Notes:
The failure is hardware-specific (A14 only) and OS-specific (macOS 26+), pointing to a kernel-level regression rather than a model or app bug.
We first became aware of this through a customer report.
Happy to provide a symbolicated crash log if helpful.
This text was summarized by AI and human verified.
Replies: 2 · Boosts: 0 · Views: 263 · Activity: 3d
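The post describes the workaround in terms of the C++ API; for reference, a minimal Swift sketch of the same idea, gating compute units on affected hardware. isAffectedA14Machine is a placeholder for whatever hardware check (for example hw.model via sysctl) an app uses to detect M1/A14-class Macs, and the fallback to .cpuAndGPU elsewhere is an assumption.

import CoreML

// Sketch: force the CPU path only on machines where the MPS SDPA kernel misbehaves.
func makeConfiguration(isAffectedA14Machine: Bool) -> MLModelConfiguration {
    let config = MLModelConfiguration()
    config.computeUnits = isAffectedA14Machine ? .cpuOnly : .cpuAndGPU
    return config
}

// Usage (compiledModelURL is assumed):
// let model = try MLModel(contentsOf: compiledModelURL,
//                         configuration: makeConfiguration(isAffectedA14Machine: true))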
CoreML MLE5ProgramLibrary AOT recompilation hangs/crashes on iOS 26.4 — C++ exception in espresso IR compiler bypasses Swift error handling
Area: CoreML / Machine Learning
Describe the issue: On iOS 26.4, calling MLModel(contentsOf:configuration:) to load an .mlpackage model hangs indefinitely and eventually kills the app via watchdog. The same model loads and runs inference successfully in under 1 second on iOS 26.3.1. The hang occurs inside eort_eo_compiler_compile_from_ir_program (espresso) during on-device AOT recompilation triggered by MLE5ProgramLibraryOnDeviceAOTCompilationImpl createProgramLibraryHandleWithRespecialization:error:. A C++ exception (__cxa_throw) is thrown inside libBNNS.dylib during the exception unwind, which then hangs inside __cxxabiv1::dyn_cast_slow and __class_type_info::search_below_dst. Swift's try/catch does not catch this — the exception originates in C++ and the process hangs rather than terminating cleanly. Setting config.computeUnits = .cpuOnly does not resolve the issue; MLE5ProgramLibrary initialises as shared infrastructure regardless of compute units.
Steps to reproduce:
1. Create an app with an .mlpackage CoreML model using the MLE5/espresso backend
2. Call MLModel(contentsOf: modelURL, configuration: config) at runtime
3. Run on a device on iOS 26.3.1 — loads successfully in <1 second
4. Update the device to iOS 26.4 — hangs indefinitely, app killed by watchdog after 60–745 seconds
Expected behaviour: Model loads successfully, or throws a catchable Swift error on failure.
Actual behaviour: Process hangs in MLE5ProgramLibrary.lazyInitQueue. App killed by watchdog. No Swift error thrown.
Full stack trace at point of hang:
Thread 1 Queue: com.apple.coreml.MLE5ProgramLibrary.lazyInitQueue (serial)
frame 0: __cxxabiv1::__class_type_info::search_below_dst libc++abi.dylib
frame 1: __cxxabiv1::(anonymous namespace)::dyn_cast_slow libc++abi.dylib
frame 2: ___lldb_unnamed_symbol_23ab44dd4 libBNNS.dylib
frame 23: eort_eo_compiler_compile_from_ir_program espresso
frame 24: -[MLE5ProgramLibraryOnDeviceAOTCompilationImpl createProgramLibraryHandleWithRespecialization:error:] CoreML
frame 25: -[MLE5ProgramLibrary _programLibraryHandleWithForceRespecialization:error:] CoreML
frame 26: __44-[MLE5ProgramLibrary prepareAndReturnError:]_block_invoke CoreML
frame 27: _dispatch_client_callout libdispatch.dylib
frame 28: _dispatch_lane_barrier_sync_invoke_and_complete libdispatch.dylib
frame 29: -[MLE5ProgramLibrary prepareAndReturnError:] CoreML
frame 30: -[MLE5Engine initWithContainer:configuration:error:] CoreML
frame 31: +[MLE5Engine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] CoreML
frame 32: +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] CoreML
frame 45: +[MLModel modelWithContentsOfURL:configuration:error:] CoreML
frame 46: @nonobjc MLModel.__allocating_init(contentsOf:configuration:) GKPersonalV2
frame 47: MDNA_GaitEncoder_v1_3.__allocating_init(contentsOf:configuration:)
frame 48: MDNA_GaitEncoder_v1_3.__allocating_init(configuration:)
frame 50: GaitModelInference.loadModel()
frame 51: GaitModelInference.init()
iOS version: Reproduced on iOS 26.4. Works correctly on iOS 26.3.1.
Xcode version: 26.2
Device: iPhone (model used in testing)
Model format: .mlpackage
Replies: 4 · Boosts: 0 · Views: 535 · Activity: 4d
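Not a fix for the underlying hang, but for anyone reproducing it: the async loader keeps the MLE5 recompilation off the main thread so the UI stays responsive while the failure is observed. A sketch only; modelURL and the .all compute units are assumptions, and per the report the load on iOS 26.4 still never completes.

import CoreML

// Sketch: load asynchronously instead of blocking the main thread in init.
func loadGaitModel(at modelURL: URL) async throws -> MLModel {
    let config = MLModelConfiguration()
    config.computeUnits = .all   // .cpuOnly reportedly does not avoid the hang either
    return try await MLModel.load(contentsOf: modelURL, configuration: config)
}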
Sharing a Swift port of Gemma 4 for mlx-swift-lm — feedback welcome
Hi all, I've been working on a pure-Swift port of Google's Gemma 4 text decoder that plugs into mlx-swift-lm as a sidecar model registration. Sharing it here in case anyone else hit the same wall I did, and to get feedback from the MLX team and the community before I propose anything upstream.
Repo: https://github.com/yejingyang8963-byte/Swift-gemma4-core
Why: As of mlx-swift-lm 2.31.x, Gemma 4 isn't supported out of the box. The obvious workaround — reusing the Gemma 3 text implementation with a patched config — fails at weight load because Gemma 4 differs from Gemma 3 in several structural places. The chat-template path through swift-jinja 1.x also silently corrupts the prompt, so the model loads but generates incoherent text.
What's in the package:
A from-scratch Swift implementation of the Gemma 4 decoder (Configuration, Layers, Attention, MLP, RoPE, DecoderLayer)
Per-Layer Embedding (PLE) support — the shared embedding table that feeds every decoder layer through a gated MLP as a third residual
KV sharing across the back half of the decoder, threaded through the forward pass via a donor table with a single global rope offset
A custom Gemma4ProportionalRoPE class for the partial-rotation rope type that initializeRope doesn't currently recognize
A chat-template bypass that builds the prompt as a literal string with the correct turn markers and encodes via tokenizer.encode(text:), matching Python mlx-lm's apply_chat_template byte-for-byte
Measured on iPhone (A-series, 7.4 GB RAM):
Model: mlx-community/gemma-4-e2b-it-4bit
Warm load: ~6 s
Memory after load: 341–392 MB
Time to first token (end-to-end, 333-token system prompt): 2.82 s
Generation throughput: 12–14 tok/s
What I'd love feedback on:
Is the sidecar registration pattern the right way to extend mlx-swift-lm with new model families, or is there a more idiomatic path I missed?
The chat-template bypass works but feels like a workaround. Is the right long-term fix in swift-jinja, in the tokenizer, or somewhere else entirely?
Anyone running into the same PLE / KV-sharing issues on other Gemma-family checkpoints? I'd like to make sure the implementation generalizes beyond E2B before tagging a 0.2.0.
Happy to open a PR against mlx-swift-lm if the maintainers think any of this belongs upstream. Thanks for reading.
Replies: 1 · Boosts: 0 · Views: 195 · Activity: 4d
Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X
I am a professional user of the 2019 Mac Pro (7,1) with dual AMD Radeon Pro W6900X MPX modules (32GB VRAM each). This hardware is designed for high-performance compute, but it is currently crippled for modern local LLM/AI workloads under Linux due to Apple's EFI/PCIe routing restrictions.
Core Issue:
rocminfo reports "No HIP GPUs available" when attempting to use ROCm/amdgpu on Linux
Apple's custom EFI firmware blocks full initialization of professional GPU compute assets
The dual W6900X GPUs have 64GB combined VRAM and a high-bandwidth Infinity Fabric Link, but cannot be fully utilized for local AI inference/training
My Specific Request: Apple should provide an official, one-click deployable application that enables full utilization of dual W6900X GPUs for local large language model (LLM) inference and training under Linux. This application must:
Fully initialize both W6900X GPUs via HIP/ROCm, establishing valid compute contexts
Bypass artificial EFI/PCIe routing restrictions that block access to professional GPU resources
Provide a stable, user-friendly one-click deployment experience (similar to NVIDIA's AI Enterprise or AMD's ROCm Hub)
Why This Matters: The 2019 Mac Pro is Apple's flagship professional workstation, marketed for compute-intensive workloads. Its high-cost W6900X GPUs should not be locked down for modern AI/LLM use cases. An official one-click deployment solution would demonstrate Apple's commitment to professional AI and unlock significant value for professional users.
I look forward to Apple's response and a clear roadmap for enabling this critical capability.
#MacPro #Linux #ROCm #LocalLLM #W6900X #CoreML
Replies: 2 · Boosts: 0 · Views: 416 · Activity: 5d
CoreML GPU NaN bug with fused QKV attention on macOS Tahoe
Problem: CoreML produces NaN on GPU (works fine on CPU) when running transformer attention with fused QKV projection on macOS 26.2.
Root cause: The common::fuse_transpose_matmul optimization pass triggers a Metal kernel bug when sliced tensors feed into matmul(transpose_y=True).
Workaround:
pipeline = ct.PassPipeline.DEFAULT
pipeline.remove_passes(['common::fuse_transpose_matmul'])
mlmodel = ct.convert(model, ..., pass_pipeline=pipeline)
Minimal repro: https://github.com/imperatormk/coreml-birefnet/blob/main/apple_bug_repro.py
Affected: Any ViT/Swin/transformer with fused QKV attention (BiRefNet, etc.)
Has anyone else hit this? Filed FB report too.
Replies: 1 · Boosts: 0 · Views: 495 · Activity: 1w
Apple Swift Replacing Python
This YouTube video is very interesting, discussing Swift's power and its potential to replace Python. Here is the link. https://youtu.be/6ZGlseSqar0?si=pzZVq9FKsveca4kA
Replies: 0 · Boosts: 0 · Views: 164 · Activity: 1w
Siri not calling my INExtension
Things I did:
created an Intents Extension target
added "Supported Intents" to both my main app target and the intent extension, with "INAddTasksIntent" and "INCreateNoteIntent"
created the AppIntentVocabulary in my main app target
created the handlers in the code in the Intents Extension target:

class AddTaskIntentHandler: INExtension, INAddTasksIntentHandling {
    func resolveTaskTitles(for intent: INAddTasksIntent) async -> [INSpeakableStringResolutionResult] {
        if let taskTitles = intent.taskTitles {
            return taskTitles.map { INSpeakableStringResolutionResult.success(with: $0) }
        } else {
            return [INSpeakableStringResolutionResult.needsValue()]
        }
    }

    func handle(intent: INAddTasksIntent) async -> INAddTasksIntentResponse {
        // my code to handle this...
        let response = INAddTasksIntentResponse(code: .success, userActivity: nil)
        response.addedTasks = tasksCreated.map {
            INTask(
                title: INSpeakableString(spokenPhrase: $0.name),
                status: .notCompleted,
                taskType: .completable,
                spatialEventTrigger: nil,
                temporalEventTrigger: intent.temporalEventTrigger,
                createdDateComponents: DateHelper.localCalendar().dateComponents([.year, .month, .day, .minute, .hour], from: Date.now),
                modifiedDateComponents: nil,
                identifier: $0.id
            )
        }
        return response
    }
}

class AddItemIntentHandler: INExtension, INCreateNoteIntentHandling {
    func resolveTitle(for intent: INCreateNoteIntent) async -> INSpeakableStringResolutionResult {
        if let title = intent.title {
            return INSpeakableStringResolutionResult.success(with: title)
        } else {
            return INSpeakableStringResolutionResult.needsValue()
        }
    }

    func resolveGroupName(for intent: INCreateNoteIntent) async -> INSpeakableStringResolutionResult {
        if let groupName = intent.groupName {
            return INSpeakableStringResolutionResult.success(with: groupName)
        } else {
            return INSpeakableStringResolutionResult.needsValue()
        }
    }

    func handle(intent: INCreateNoteIntent) async -> INCreateNoteIntentResponse {
        do {
            // my code for handling this...
            let response = INCreateNoteIntentResponse(code: .success, userActivity: nil)
            response.createdNote = INNote(
                title: INSpeakableString(spokenPhrase: itemName),
                contents: itemNote.map { [INTextNoteContent(text: $0)] } ?? [],
                groupName: INSpeakableString(spokenPhrase: list.name),
                createdDateComponents: DateHelper.localCalendar().dateComponents([.day, .month, .year, .hour, .minute], from: Date.now),
                modifiedDateComponents: nil,
                identifier: newItem.id
            )
            return response
        } catch {
            return INCreateNoteIntentResponse(code: .failure, userActivity: nil)
        }
    }
}

uninstalled my app
restarted my physical device and simulator

Yet, when I say "Remind me to buy dog food in Index" (Index is the name of my app), as stated in the examples of INAddTasksIntent, Siri proceeds to say that a list named "Index" doesn't exist in the Apple Reminders app, instead of processing the request in my app. Am I missing something?
Replies: 3 · Boosts: 0 · Views: 600 · Activity: 1w
Shortcut - “Use Model” error handling?
I have a series of shortcuts that I’ve written that use the “Use Model” action to do various things. For example, I have a shortcut “Clipboard Markdown to Notes” that takes the content of the clipboard, creates a new note in Notes, converts the markdown content to rich text, adds it to the note etc. One key step is to analyze the markdown content with “Use Model” and generate a short descriptive title for the note. I use the on-device model for this, but sometimes the content and prompt exceed the context window size and the action fails with an error message to that effect. In that case, I’d like to either repeat the action using the Cloud model, or, if the error was a refusal, to prompt the user to enter a title to use. I‘ve tried using an IF based on whether the response had any text in it, but that didn’t work. No matter what I’ve tried, I can’t seem to find a way to catch the error from Use Model, determine what the error was, and take appropriate action. Is there a way to do this? (And by the way, a huge ”thank you” to whoever had the idea of making AppIntents visible in Shortcuts and adding the Use Model action — has made a huge difference already, and it lets us see what Siri will be able to use as well.)
Replies: 3 · Boosts: 0 · Views: 637 · Activity: 1w
Unable to use FoundationModels in older app?
Hi, I'm trying to add FoundationModels to an older project but always get the following error: "Unable to resolve 'dependency' 'FoundationModels' import FoundationModels". The error comes and goes while it's compiling, and then it doesn't run the app. I have my target set to 26.0 (and can't go any higher) and am using Xcode 26 (17E192). Is anyone else having this issue? Thanks, Dan Uff
Replies: 1 · Boosts: 0 · Views: 181 · Activity: 2w
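For reference: FoundationModels ships as a system framework in the iOS 26 SDK, so a plain import (not a package dependency) should be all the project needs once any stray dependency reference is removed. A minimal usage sketch to sanity-check the setup follows; the availability gate matters only if the same source also builds for earlier OS versions, and SystemLanguageModel.default.availability is assumed to be the framework's availability check.

#if canImport(FoundationModels)
import FoundationModels
#endif

// Sketch: use the on-device model only where the framework and model are available.
func summarize(_ text: String) async -> String? {
#if canImport(FoundationModels)
    if #available(iOS 26.0, macOS 26.0, *) {
        guard case .available = SystemLanguageModel.default.availability else { return nil }
        let session = LanguageModelSession()
        return try? await session.respond(to: "Summarize: \(text)").content
    }
#endif
    return nil
}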
Plenty of LanguageModelSession.GenerationError.refusal errors after 26.4 update
Hello! After the 26.4 update I get a huge number of LanguageModelSession.GenerationError.refusal errors when using guided generation Generables, for inexplicable reasons. Such errors also occur if I want to cast a response to boolean by using 'generating: Bool.self'. The explanation generated on the grounds of the error always looks like this:
Response(userPrompt: "", duration: 0.230917542, promptTokenCount: Optional(66), responseTokenCount: Optional(11), feedbackAttachment: nil, content: "I apologize, but I cannot fulfill this request.", rawContent: "I apologize, but I cannot fulfill this request.", transcriptEntries: ArraySlice([]))
All the prompts and Generables I use are definitely not profane. Before 26.4, such errors never occurred on the same prompts and Generables. The 26.4 update rendered those features unusable for me. Is this a known bug, or what am I doing wrong?
Replies: 3 · Boosts: 0 · Views: 499 · Activity: 2w
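In case it helps others hitting the same thing, a sketch of catching the refusal at the call site and degrading gracefully instead of failing the feature. The respond(to:generating:) call mirrors the usage quoted in the post; how GenerationError is best matched is an assumption, so the catch is kept deliberately broad.

import FoundationModels

// Sketch: guided Bool generation with a graceful fallback on refusal/guardrail errors.
func answerYesNo(_ prompt: String, session: LanguageModelSession) async -> Bool? {
    do {
        let response = try await session.respond(to: prompt, generating: Bool.self)
        return response.content
    } catch let error as LanguageModelSession.GenerationError {
        // On 26.4 this reportedly fires with .refusal even for benign prompts.
        print("Guided generation failed: \(error)")
        return nil
    } catch {
        print("Unexpected error: \(error)")
        return nil
    }
}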
iOS 26.4: Regressions in Foundation Models
After installing iOS 26.4, the Foundation Models instruction following and tool calling capabilities have degraded significantly. The model is not usable anymore. Examples:
This works: "Is the car plugged in?"
This does not work: "Tell me if the car is plugged in"
Anything with the word "frunk" (front trunk) triggers a Guardrail Violation. Phrases like "Lock Pride" also trigger a Guardrail Violation (Pride is the name of the car). Tool calling only works half the time for really obvious things.
Replies: 3 · Boosts: 1 · Views: 496 · Activity: 2w
Memory stride warning when loading CoreML models on ANE
When I am doing an uncached load of a CoreML model on the ANE, I receive this warning in the Xcode console:
Type of hiddenStates in function main's I/O contains unknown strides. Using unknown strides for MIL tensor buffers with unknown shapes is not recommended in E5ML. Please use row_alignment_in_bytes property instead. Refer to https://e5-ml.apple.com/more-info/memory-layouts.html for more information.
However, the web link does not seem to be working. Where can I find more information about this, and how can I fix it?
Replies: 2 · Boosts: 0 · Views: 667 · Activity: 2w
Request: Official One-Click Local LLM Deployment for 2019 Mac Pro (7,1) Dual W6900X
I am a professional user of the 2019 Mac Pro (7,1) with dual AMD Radeon Pro W6900X MPX modules (32GB VRAM each). This hardware is designed for high-performance compute, but it is currently crippled for modern local LLM/AI workloads under Linux due to Apple's EFI/PCIe routing restrictions.
Core Issue:
rocminfo reports "No HIP GPUs available" when attempting to use ROCm/amdgpu on Linux
Apple's custom EFI firmware blocks full initialization of professional GPU compute assets
The dual W6900X GPUs have 64GB combined VRAM and a high-bandwidth Infinity Fabric Link, but cannot be fully utilized for local AI inference/training
My Specific Request: Apple should provide an official, one-click deployable application that enables full utilization of dual W6900X GPUs for local large language model (LLM) inference and training under Linux. This application must:
Fully initialize both W6900X GPUs via HIP/ROCm, establishing valid compute contexts
Bypass artificial EFI/PCIe routing restrictions that block access to professional GPU resources
Provide a stable, user-friendly one-click deployment experience (similar to NVIDIA's AI Enterprise or AMD's ROCm Hub)
Why This Matters: The 2019 Mac Pro is Apple's flagship professional workstation, marketed for compute-intensive workloads. Its high-cost W6900X GPUs should not be locked down for modern AI/LLM use cases. An official one-click deployment solution would demonstrate Apple's commitment to professional AI and unlock significant value for professional users.
I look forward to Apple's response and a clear roadmap for enabling this critical capability.
#MacPro #Linux #ROCm #LocalLLM #W6900X #CoreML
Replies: 0 · Boosts: 0 · Views: 109 · Activity: 2w