Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

ANE Error with Statefu Model: "Unable to compute prediction" when State Tensor width is not 32-aligned
Hi everyone, I believe I’ve encountered a potential bug or a hardware alignment limitation in the Core ML Framework / ANE Runtime specifically affecting the new Stateful API (introduced in iOS 18/macOS 15). The Issue: A Stateful mlprogram fails to run on the Apple Neural Engine (ANE) if the state tensor dimensions (specifically the width) are not a multiple of 32. The model works perfectly on CPU and GPU, but fails on ANE both during runtime and when generating a Performance Report in Xcode. Error Message in Xcode UI: "There was an error creating the performance report Unable to compute the prediction using ML Program. It can be an invalid input data or broken/unsupported model." Observations: Case A (Fails): State shape = (1, 3, 480, 270). Prediction fails on ANE. Case B (Success): State shape = (1, 3, 480, 256). Prediction succeeds on ANE. This suggests an internal memory alignment or tiling issue within the ANE driver when handling Stateful buffers that don't meet the 32-pixel/element alignment. Reproduction Code (PyTorch + coremltools): import torch.nn as nn import coremltools as ct import numpy as np class RNN_Stateful(nn.Module): def __init__(self, hidden_shape): super(RNN_Stateful, self).__init__() # Simple conv to update state self.conv1 = nn.Conv2d(3 + hidden_shape[1], hidden_shape[1], kernel_size=3, padding=1) self.conv2 = nn.Conv2d(hidden_shape[1], 3, kernel_size=3, padding=1) self.register_buffer("hidden_state", torch.ones(hidden_shape, dtype=torch.float16)) def forward(self, imgs): self.hidden_state = self.conv1(torch.cat((imgs, self.hidden_state), dim=1)) return self.conv2(self.hidden_state) # h=480, w=255 causes ANE failure. w=256 works. b, ch, h, w = 1, 3, 480, 255 model = RNN_Stateful((b, ch, h, w)).eval() traced_model = torch.jit.trace(model, torch.randn(b, 3, h, w)) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="input_image", shape=(b, 3, h, w), dtype=np.float16)], outputs=[ct.TensorType(name="output", dtype=np.float16)], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(b, ch, h, w), dtype=np.float16), name="hidden_state")], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram" ) mlmodel.save("rnn_stateful.mlpackage") Steps to see the error: Open the generated .mlpackage in Xcode 16.0+. Go to the Performance tab and run a test on a device with ANE (e.g., iPhone 15/16 or M-series Mac). The report will fail to generate with the error mentioned above. Environment: OS: macOS 15.2 Xcode: 16.3 Hardware: M4 Has anyone else encountered this 32-pixel alignment requirement for StateType tensors on ANE? Is this a known hardware constraint or a bug in the Core ML runtime? Any insights or workarounds (other than manual padding) would be appreciated.
0
0
551
Dec ’25
ANE Performance for on-device Foundation model
I'm running MacOs 26 Beta 5. I noticed that I can no longer achieve 100% usage on the ANE as I could before with Apple Foundations on-device model. Has Apple activated some kind of throttling or power limiting of the ANE? I cannot get above 3w or 40% usage now since upgrading. I'm on the high power energy mode. I there an API rate limit being applied? I kave a M4 Pro mini with 64 GB of memory.
0
0
366
Aug ’25
Image object detection with video sizing issue
I'm working on my first model that detects bowling score screens, and I have it working with pictures no problem. But when it comes to video, I have a sizing issue. I added my model to a small app I wrote for taking a picture of a Bowling Scoring Screen, where my model will frame the screens in the video feed from the camera. My model works, but my boxes are about 2/3 the size of the screens being detected. I don't understand the theory of the video stream the camera is feeding me. What I mean is that I don't want to make tweaks to the size of my rectangles by making them larger, and I'm not sure if the video feed is larger than what I'm detecting in code. Questions I have are like is the video feed a certain resolution like 1980x something, or a much higher resolution in the 12 megapixel range? On a static image of say 1920x something, My alignment is perfect. AI says that it's my model training, that I'm training on square images but video is 16:9. Or that I'm producing 4:3 images in a 16:9 environment. I'm missing something here but not sure what it is. I already wrote code to force it to fit, but reverted back to trying for a natural fit.
1
0
433
Jan ’26
is it possible to let siri monitor phone calls, and notify me when a certain trigger happens?
the specific context is that i would like to build an agent that monitors my phone call (with a customer support for example), and simiply identify whether or not im still put on hold, and notify me when im not. currently after reading the doc, i dont think its possible yet, but im so annoyed by the customer support calls that im willing to go the distance and see if theres any way.
0
0
193
Jun ’25
A Summary of the WWDC25 Group Lab - Apple Intelligence
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Apple Intelligence. Can I integrate writing tools in my own text editor? UITextView, NSTextView, and SwiftUI TextEditor automatically get Writing Tools on devices that support Apple Intelligence. For custom text editors, check out Enhancing your custom text engine with Writing Tools. Given that Foundation Models are on-device, how will Apple update the models over time? And how should we test our app against the model updates? Model updates are in sync with OS updates. As for testing with updated models, watch our WWDC session about prompt engineering and safety, and read the Human Interface Guidelines to understand best practices in prompting the on-device model. What is the context size of a session in Foundation Models Framework? How to handle the error if a session runs out of the context size? Currently the context size is about 4,000 tokens. If it’s exceeded, developers can catch the .exceededContextWindowSize error at runtime. As discussed in one of our WWDC25 sessions, when the context window is exceeded, one approach is to trim and summarize a transcript, and then start a new session. Can I do image generation using the Foundation Models Framework or is only text generation supported? Foundation Models do not generate images, but you can use the Foundation Models framework to generate prompts for ImageCreator in the Image Playground framework. Developers can also take advantage of Tools in Foundation Models framework, if appropriate for their app. My app currently uses a third party server-based LLM. Can I use the Foundation Models Framework as well in the same app? Any guidance here? The Foundation Models framework is optimized for a subset of tasks like summarization, extraction, classification, and tagging. It’s also on-device, private, and free. But at 3 billion parameters it isn’t designed for advanced reasoning or world knowledge, so for some tasks you may still want to use a larger server-based model. Should I use the AFM for my language translation features given it does text translation, or is the Translation API still the preferred approach? The Translation API is still preferred. Foundation Models is great for tasks like text summarization and data generation. It’s not recommended for general world knowledge or translation tasks. Will the TranslationSession class introduced in ios18 get any new improvments in performance or reliability with the new live translation abilities in ios/macos/ipados 26? Essentially, will we get access to live translation in a similar way and if so, how? There's new API in LiveCommunicationKit to take advantage of live translation in your communication apps. The Translate framework is using the same models as used by Live Communication and can be combined with the new SpeechAnalyzer API to translate your own audio. How do I set a default value for an App Intent parameter that is otherwise required? You can implement a default value as part of your parameter declaration by using the @Parameter(defaultValue:) form of the property wrapper. How long can an App Intent run? On macOS there is no limit to how long app intents can run. On iOS, there is a limit of 30 seconds. This time limit is paused when waiting for user interaction. How do I vary the options for a specific parameter of an App Intent, not just based on the type? Implement a DynamicOptionsProvider on that parameter. You can add suggestedEntities() to suggest options. What if there is not a schema available for what my app is doing? If an app intent schema matching your app’s functionality isn’t available, take a look to see if there’s a SiriKit domain that meets your needs, such as for media playback and messaging apps. If your app’s functionality doesn’t match any of the available schemas, you can define a custom app intent, and integrate it with Siri by making it an App Shortcut. Please file enhancement requests via Feedback Assistant for new App intent schemas that would benefit your app. Are you adding any new app intent domains this year? In addition to all the app intent domains we announced last year, this year at WWDC25 we announced that Visual Intelligence will be added to iOS 26 and macOS Tahoe. When my App Intent doesn't show up as an action in Shortcuts, where do I start in figuring out what went wrong? App Intents are statically extracted. You can check the ExtractMetadata info in Xcode's build log. What do I need to do to make sure my App Intents work well with Spotlight+? Check out our WWDC25 sessions on App Intents, including Explore new advances in App Intents and Develop for Shortcuts and Spotlight with App Intents. Mostly, make sure that your intent can run from the parameter summary alone, no required parameters without default values that are not already in the parameter summary. Does Spotlight+ on macOS support App Shortcuts? Not directly, but it shows the App Intents your App Shortcuts are sitting on top of. I’m wondering if the on-device Foundation Models framework API can be integrated into an app to act strictly as an app in-universe AI assistant, responding only within the boundaries of the app’s fictional context. Is such controlled, context-limited interaction supported? FM API runs inside the process of your app only and does not automatically integrate with any remaining part of the system (unless you choose to implement your own tool and utilize tool calling). You can provide any instructions and prompts you want to the model. If a country does not support Apple Intelligence yet, can the Foundation Models framework work? FM API works on Apple Intelligence-enabled devices in supported regions and won’t work in regions where Apple Intelligence is not yet supported
2
0
345
Jul ’25
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly: Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.) Preventing hallucinations or unrelated outputs Constraining responses based on app-specific rules, structured data, or recent interactions I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
0
0
156
Jul ’25
Accessing Apple Intelligence APIs: Custom Prompt Support and Inference Capabilities
Hello Apple Developer Community, I'm exploring the integration of Apple Intelligence features into my mobile application and have a couple of questions regarding the current and upcoming API capabilities: Custom Prompt Support: Is there a way to pass custom prompts to Apple Intelligence to generate specific inferences? For instance, can we provide a unique prompt to the Writing Tools or Image Playground APIs to obtain tailored outputs? Direct Inference Capabilities: Beyond the predefined functionalities like text rewriting or image generation, does Apple Intelligence offer APIs that allow for more generalized inference tasks based on custom inputs? I understand that Apple has provided APIs such as Writing Tools, Image Playground, and Genmoji. However, I'm interested in understanding the extent of customization and flexibility these APIs offer, especially concerning custom prompts and generalized inference. Additionally, are there any plans or timelines for expanding these capabilities, perhaps with the introduction of new SDKs or frameworks that allow deeper integration and customization? Any insights, documentation links, or experiences shared would be greatly appreciated. Thank you in advance for your assistance!
3
0
392
Jun ’25
Best practices for designing proactive FinTech insights with App Intents & Shortcuts?
Hello fellow developers, I'm the founder of a FinTech startup, Cent Capital (https://cent.capital), where we are building an AI-powered financial co-pilot. We're deeply exploring the Apple ecosystem to create a more proactive and ambient user experience. A core part of our vision is to use App Intents and the Shortcuts app to surface personalized financial insights without the user always needing to open our app. For example, suggesting a Shortcut like, "What's my spending in the 'Dining Out' category this month?" or having an App Intent proactively surface an insight like, "Your 'Subscriptions' budget is almost full." My question for the community is about the architectural and user experience best practices for this. How are you thinking about the balance between providing rich, actionable insights via Intents without being overly intrusive or "spammy" to the user? What are the best practices for designing the data model that backs these App Intents for a complex domain like personal finance? Are there specific performance or privacy considerations we should be aware of when surfacing potentially sensitive financial data through these system-level integrations? We believe this is the future of FinTech apps on iOS and would love to hear how other developers are thinking about this challenge. Thanks for your insights!
0
0
408
Oct ’25
Genmoji API — are local edits and non-text usage (e.g., widgets) allowed?
Hi, I'm integrating the Genmoji API (NSAdaptiveImageGlyph) into my app and would like to confirm two things: Local editing of generated Genmoji. After the user creates a Genmoji, can the app apply edits to the resulting image (e.g., pixelation)? The edited image would be stored only on-device within the app and never shared externally. Use outside text contexts. Can a generated Genmoji be used in other parts of the app, such as a home screen widget? Apple's documentation and the WWDC24 session focus on inline text, stickers, and Tapbacks, but I couldn't find explicit guidance on widget or other UI usage. I checked the Human Interface Guidelines, WWDC24 session "Bring expression to your app with Genmoji," and the App Store Review Guidelines, but couldn't find clear answers. Any guidance or pointers would be appreciated. Thanks!
0
0
1.6k
2w
VNDetectFaceRectanglesRequest does not use the Neural Engine?
I'm on Tahoe 26.1 / M3 Macbook Air. I'm using VNDetectFaceRectanglesRequest as properly as possible, as in the minimal command line program attached below. For some reason, I always get: MLE5Engine is disabled through the configuration printed. I couldn't find any notes on developer docs saying that VNDetectFaceRectanglesRequest can not use the Apple Neural Engine. I'm assuming there is something wrong with my code however I wasn't able to find any remarks from documentation where it might be. I wasn't able to find the above error message online either. I would appreciate your help a lot and thank you in advance. The code below accesses the video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format which has the largest resolution (Full HD on my M3 MBA) and "4:2:0" color "v" reduced color component spectrum encoding ("420v"). After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the error message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it. To run it in Xcode, File > New project > Mac command line tool. Pasting the code below, then click on the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera. A possible explanation could be that either Apple's internal CoreML code for this function works on GPU/CPU only or it doesn't accept 420v as supplied by the Macbook Air camera import AVKit import Vision var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() var detectionRequests: [VNDetectFaceRectanglesRequest]? var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue") class XYZ: /*NSViewController or NSObject*/NSObject, AVCaptureVideoDataOutputSampleBufferDelegate { func viewDidLoad() { //super.viewDidLoad() let session = AVCaptureSession() let inputDevice = try! self.configureFrontCamera(for: session) self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session) self.prepareVisionRequest() session.startRunning() } fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? { let deviceFormat = device.formats[0] print(deviceFormat) let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription) let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height)) return (deviceFormat, resolution) } fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) { let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified) let device = deviceDiscoverySession.devices.first! let deviceInput = try! AVCaptureDeviceInput(device: device) captureSession.addInput(deviceInput) let highestResolution = self.highestResolution420Format(for: device)! try! device.lockForConfiguration() device.activeFormat = highestResolution.format device.unlockForConfiguration() return (device, highestResolution.resolution) } fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) { videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue) captureSession.addOutput(videoDataOutput) } fileprivate func prepareVisionRequest() { let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in print("VNDetectFaceRectanglesRequest completion Handler called") }) // Start with detection detectionRequests = [faceDetectionRequest] } // MARK: AVCaptureVideoDataOutputSampleBufferDelegate // Handle delegate method callback on receiving a sample buffer. public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { var requestHandlerOptions: [VNImageOption: AnyObject] = [:] let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil) if cameraIntrinsicData != nil { requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData } let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)! // No tracking object detected, so perform initial detection let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions) try! imageRequestHandler.perform(detectionRequests!) } } let X = XYZ() X.viewDidLoad() sleep(9999999)
0
0
562
Nov ’25
My Vision for AI and Algorithmically Optimised Operating Systems
Bear with me, please. Please make sure a highly skilled technical person reads and understands this. I want to describe my vision for (AI/Algorithmically) Optimised Operating Systems. To explain it properly, I will describe the process to build it (pseudo). Required Knowledge (no particular order): Processor Logic Circuits, LLM models, LLM tool usage, Python OO coding, Procedural vs OO, NLP fuzzy matching, benchmarking, canvas/artefacts/dynamic HTML interfaces, concepts of how AI models are vastly compressed and miniaturised forms of full data, Algorithmic vs AI. First, take all OO Python code (example) on GitHub (example), then separate each function from each object into its own procedure (procedural logic) by making a logical procedural list of actions to perform only that function based on its entire dependency chain (i.e. all other objects it relies on). Relate all compiled functions using (for example) fuzzy matching on the name, or AI-based functional profiling to get multiple instances of each function. Starting with the most used function, test each one against the others that perform the same task for bugs and completeness. Determine the fastest, most optimal version of that function (and every function). Add a single instance of each most optimal function to the centralised tool codebase, which will later be utilised by the language models. This ensures we rely only on the most optimised function for each and every use case — with every program using one shared instance of that function instead of compiling it separately. In this way, we optimise all existing (Python OO) code based on maximum availability of certain resources. Use the optimised code to teach the AI how to map each user task directly to optimised code (i.e. each user-facing function in unoptimised program code to its optimised version). Later, teach the codebase how to optimise these functions directly to the logic circuits on the processor — as an Operating System — using essentially the same process. Eventually, use the most optimised form for tasks via a dynamic canvas/artefacts UI interface generated by the Start Menu using the language model. Allow custom interfaces to emerge dynamically from the available functions based on language understanding. Grow and optimise the algorithmic codebase until the entire operating system — with versioning — evolves based on the AI-generated user interfaces (compressing them into new versions of the tools codebase). Please don’t give up until you understand this concept completely and nothing in the description is beyond your full comprehension. It’s a highly significant step forward.
3
0
364
Oct ’25
Embedding model missing once transferred to Xcode
I've created a "Transfer Learning BERT Embeddings" model with the default "Latin" language family and "Automatic" Language setting. This model performs exceptionally well against the test data set and functions as expected when I preview it in Create ML. However, when I add it to the Xcode project of the application to which I am deploying it, I am getting runtime errors that suggest it can't find the embedding resources: Failed to locate assets for 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' embedding model Note, I am adding the model to the app project the same way that I added an earlier "Maximum Entropy" model. That model had no runtime issues. So it seems there is an issue getting hold of the embeddings at runtime. For now, "runtime" means in the Simulator. I intend to deploy my application to iOS devices once GM 26 is released (the app also uses AFM). I'm developing on Tahoe 26 beta, running on iOS 26 beta, using Xcode 26 beta. Is this a known/expected issue? Are the embeddings expected to be a resource in the model? Is there a workaround? I did try opening the model in Xcode and saving it as an mlpackage, then adding that to my app project, but that also didn't resolve the issue.
1
0
615
Sep ’25
Foundation Models framework dyld symbol errors after macOS 26 Beta 2 - LanguageModelSession constructor missing
Foundation Models framework worked perfectly on macOS 26 Beta 2, but starting from Beta 3 and continuing through Beta 6 (latest), I get dyld symbol errors even with the exact code from Apple's documentation. Environment: macOS 26.0 Beta 6 (25A5351b) Xcode 26 Beta 6 M4 Max MacBook Pro Apple Intelligence enabled and downloaded Error Details: dyld[Process]: Symbol not found: _$s16FoundationModels20LanguageModelSessionC5model10guardrails5tools12instructionsAcA06SystemcD0C_AC10GuardrailsVSayAA4Tool_pGAA12InstructionsVSgtcfC Referenced from: /path/to/app.debug.dylib Expected in: /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Code Used (Exact from Documentation): import FoundationModels // This worked on Beta 2, crashes on Beta 3+ let model = SystemLanguageModel.default let session = LanguageModelSession(model: model) let response = try await session.respond(to: "Hello") What I've Verified: FoundationModels.framework exists in /System/Library/Frameworks/ Framework is properly linked in Xcode project Apple Intelligence is enabled and working Same code works in older beta versions Issue persists even with completely fresh Xcode projects Analysis: The dyld error suggests the LanguageModelSession(model:) constructor is missing. The symbol shows it's looking for a constructor with parameters (model:guardrails:tools:instructions:), but the documentation still shows the simple (model:) constructor. Questions: Has the LanguageModelSession API changed since Beta 2? Should we now use the constructor with guardrails/tools/instructions parameters? Is this a known issue with recent betas? Are there updated code samples for the current API? Additional Context: This affects both basic SystemLanguageModel usage AND custom adapter loading. The same dyld symbol errors occur when trying to create SystemLanguageModel(adapter: adapter) as well. Any guidance on the correct API usage for current betas would be greatly appreciated. The documentation appears to be out of sync with the actual framework implementation.
1
0
695
Sep ’25
Sharing a Swift port of Gemma 4 for mlx-swift-lm — feedback welcome
Hi all, I've been working on a pure-Swift port of Google's Gemma 4 text decoder that plugs into mlx-swift-lm as a sidecar model registration. Sharing it here in case anyone else hit the same wall I did, and to get feedback from the MLX team and the community before I propose anything upstream. Repo: https://github.com/yejingyang8963-byte/Swift-gemma4-core Why As of mlx-swift-lm 2.31.x, Gemma 4 isn't supported out of the box. The obvious workaround — reusing the Gemma 3 text implementation with a patched config — fails at weight load because Gemma 4 differs from Gemma 3 in several structural places. The chat-template path through swift-jinja 1.x also silently corrupts the prompt, so the model loads but generates incoherent text. What's in the package A from-scratch Swift implementation of the Gemma 4 decoder (Configuration, Layers, Attention, MLP, RoPE, DecoderLayer) Per-Layer Embedding (PLE) support — the shared embedding table that feeds every decoder layer through a gated MLP as a third residual KV sharing across the back half of the decoder, threaded through the forward pass via a donor table with a single global rope offset A custom Gemma4ProportionalRoPE class for the partial-rotation rope type that initializeRope doesn't currently recognize A chat-template bypass that builds the prompt as a literal string with the correct turn markers and encodes via tokenizer.encode(text:), matching Python mlx-lm's apply_chat_template byte-for-byte Measured on iPhone (A-series, 7.4 GB RAM) Model: mlx-community/gemma-4-e2b-it-4bit Warm load: ~6 s Memory after load: 341–392 MB Time to first token (end-to-end, 333-token system prompt): 2.82 s Generation throughput: 12–14 tok/s What I'd love feedback on Is the sidecar registration pattern the right way to extend mlx-swift-lm with new model families, or is there a more idiomatic path I missed? The chat-template bypass works but feels like a workaround. Is the right long-term fix in swift-jinja, in the tokenizer, or somewhere else entirely? Anyone running into the same PLE / KV-sharing issues on other Gemma-family checkpoints? I'd like to make sure the implementation generalizes beyond E2B before tagging a 0.2.0. Happy to open a PR against mlx-swift-lm if the maintainers think any of this belongs upstream. Thanks for reading.
1
0
315
Apr ’26
Using coremltools in a CI/CD pipeline
Hi everyone 👋 I'd like to use coremltools to see how well a model performs on a remote device as part of a CI/CD pipeline. According to the Core ML Tools "Debugging and Performance Utilities" guide, remote devices must be in a "connected" state in order for coremltools to install the ModelRunner application. The devices in our system have a "paired" state, and I'm unable to set the them as "connected." The only way I know how to connect a device is to physically plug it in to a computer and open Xcode. I don't have physical access to the devices in the CI/CD system, and the host computer that interacts with them doesn't have Xcode installed. Here are some questions I've been looking into and would love some help answering: Has anyone managed to use the coremltools performance utilities in a similar system? Can you put a device in a "connected" state if you don't have physical access to the device and if you only have access to Xcode command line tools and not the Xcode app? Is it at all possible to install the coremltools ModelRunner application on a "paired" device, for example, by manually building the app and installing it with devicectl? Would other utilities, such as the MLModelBenchmarker work as expected if the app is installed this way? Thank you!
1
0
579
Dec ’25
Visual Intelligence -- Make OpenIntent show a sheet rather than open my App
The developer tutorial for visual intelligence indicates that the method to detect and handle taps on a displayed entity from the Search section is via an "OpenIntent" associated with your entity. However, running this intent executes code from within my app. If I have the perform() method display UI, it always displays UI from within my app. I noticed that the Google app's integration to visual intelligence has a different behavior-- tapping on an entity does not take you to the Google app -- instead, a Webview is presented sheet-style WITHIN the Visual Intelligence environment (see below) How is that accomplished?
0
0
628
Sep ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Hi everyone, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device.
1
0
674
Jan ’26
CoreML Inference Acceleration
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So, I started 4 threads, each running inference simultaneously, but I found that the speed is the same as serial inference, every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
2
0
758
Sep ’25
ANE Error with Statefu Model: "Unable to compute prediction" when State Tensor width is not 32-aligned
Hi everyone, I believe I’ve encountered a potential bug or a hardware alignment limitation in the Core ML Framework / ANE Runtime specifically affecting the new Stateful API (introduced in iOS 18/macOS 15). The Issue: A Stateful mlprogram fails to run on the Apple Neural Engine (ANE) if the state tensor dimensions (specifically the width) are not a multiple of 32. The model works perfectly on CPU and GPU, but fails on ANE both during runtime and when generating a Performance Report in Xcode. Error Message in Xcode UI: "There was an error creating the performance report Unable to compute the prediction using ML Program. It can be an invalid input data or broken/unsupported model." Observations: Case A (Fails): State shape = (1, 3, 480, 270). Prediction fails on ANE. Case B (Success): State shape = (1, 3, 480, 256). Prediction succeeds on ANE. This suggests an internal memory alignment or tiling issue within the ANE driver when handling Stateful buffers that don't meet the 32-pixel/element alignment. Reproduction Code (PyTorch + coremltools): import torch.nn as nn import coremltools as ct import numpy as np class RNN_Stateful(nn.Module): def __init__(self, hidden_shape): super(RNN_Stateful, self).__init__() # Simple conv to update state self.conv1 = nn.Conv2d(3 + hidden_shape[1], hidden_shape[1], kernel_size=3, padding=1) self.conv2 = nn.Conv2d(hidden_shape[1], 3, kernel_size=3, padding=1) self.register_buffer("hidden_state", torch.ones(hidden_shape, dtype=torch.float16)) def forward(self, imgs): self.hidden_state = self.conv1(torch.cat((imgs, self.hidden_state), dim=1)) return self.conv2(self.hidden_state) # h=480, w=255 causes ANE failure. w=256 works. b, ch, h, w = 1, 3, 480, 255 model = RNN_Stateful((b, ch, h, w)).eval() traced_model = torch.jit.trace(model, torch.randn(b, 3, h, w)) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="input_image", shape=(b, 3, h, w), dtype=np.float16)], outputs=[ct.TensorType(name="output", dtype=np.float16)], states=[ct.StateType(wrapped_type=ct.TensorType(shape=(b, ch, h, w), dtype=np.float16), name="hidden_state")], minimum_deployment_target=ct.target.iOS18, convert_to="mlprogram" ) mlmodel.save("rnn_stateful.mlpackage") Steps to see the error: Open the generated .mlpackage in Xcode 16.0+. Go to the Performance tab and run a test on a device with ANE (e.g., iPhone 15/16 or M-series Mac). The report will fail to generate with the error mentioned above. Environment: OS: macOS 15.2 Xcode: 16.3 Hardware: M4 Has anyone else encountered this 32-pixel alignment requirement for StateType tensors on ANE? Is this a known hardware constraint or a bug in the Core ML runtime? Any insights or workarounds (other than manual padding) would be appreciated.
Replies
0
Boosts
0
Views
551
Activity
Dec ’25
ANE Performance for on-device Foundation model
I'm running MacOs 26 Beta 5. I noticed that I can no longer achieve 100% usage on the ANE as I could before with Apple Foundations on-device model. Has Apple activated some kind of throttling or power limiting of the ANE? I cannot get above 3w or 40% usage now since upgrading. I'm on the high power energy mode. I there an API rate limit being applied? I kave a M4 Pro mini with 64 GB of memory.
Replies
0
Boosts
0
Views
366
Activity
Aug ’25
Image object detection with video sizing issue
I'm working on my first model that detects bowling score screens, and I have it working with pictures no problem. But when it comes to video, I have a sizing issue. I added my model to a small app I wrote for taking a picture of a Bowling Scoring Screen, where my model will frame the screens in the video feed from the camera. My model works, but my boxes are about 2/3 the size of the screens being detected. I don't understand the theory of the video stream the camera is feeding me. What I mean is that I don't want to make tweaks to the size of my rectangles by making them larger, and I'm not sure if the video feed is larger than what I'm detecting in code. Questions I have are like is the video feed a certain resolution like 1980x something, or a much higher resolution in the 12 megapixel range? On a static image of say 1920x something, My alignment is perfect. AI says that it's my model training, that I'm training on square images but video is 16:9. Or that I'm producing 4:3 images in a 16:9 environment. I'm missing something here but not sure what it is. I already wrote code to force it to fit, but reverted back to trying for a natural fit.
Replies
1
Boosts
0
Views
433
Activity
Jan ’26
is it possible to let siri monitor phone calls, and notify me when a certain trigger happens?
the specific context is that i would like to build an agent that monitors my phone call (with a customer support for example), and simiply identify whether or not im still put on hold, and notify me when im not. currently after reading the doc, i dont think its possible yet, but im so annoyed by the customer support calls that im willing to go the distance and see if theres any way.
Replies
0
Boosts
0
Views
193
Activity
Jun ’25
A Summary of the WWDC25 Group Lab - Apple Intelligence
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Apple Intelligence. Can I integrate writing tools in my own text editor? UITextView, NSTextView, and SwiftUI TextEditor automatically get Writing Tools on devices that support Apple Intelligence. For custom text editors, check out Enhancing your custom text engine with Writing Tools. Given that Foundation Models are on-device, how will Apple update the models over time? And how should we test our app against the model updates? Model updates are in sync with OS updates. As for testing with updated models, watch our WWDC session about prompt engineering and safety, and read the Human Interface Guidelines to understand best practices in prompting the on-device model. What is the context size of a session in Foundation Models Framework? How to handle the error if a session runs out of the context size? Currently the context size is about 4,000 tokens. If it’s exceeded, developers can catch the .exceededContextWindowSize error at runtime. As discussed in one of our WWDC25 sessions, when the context window is exceeded, one approach is to trim and summarize a transcript, and then start a new session. Can I do image generation using the Foundation Models Framework or is only text generation supported? Foundation Models do not generate images, but you can use the Foundation Models framework to generate prompts for ImageCreator in the Image Playground framework. Developers can also take advantage of Tools in Foundation Models framework, if appropriate for their app. My app currently uses a third party server-based LLM. Can I use the Foundation Models Framework as well in the same app? Any guidance here? The Foundation Models framework is optimized for a subset of tasks like summarization, extraction, classification, and tagging. It’s also on-device, private, and free. But at 3 billion parameters it isn’t designed for advanced reasoning or world knowledge, so for some tasks you may still want to use a larger server-based model. Should I use the AFM for my language translation features given it does text translation, or is the Translation API still the preferred approach? The Translation API is still preferred. Foundation Models is great for tasks like text summarization and data generation. It’s not recommended for general world knowledge or translation tasks. Will the TranslationSession class introduced in ios18 get any new improvments in performance or reliability with the new live translation abilities in ios/macos/ipados 26? Essentially, will we get access to live translation in a similar way and if so, how? There's new API in LiveCommunicationKit to take advantage of live translation in your communication apps. The Translate framework is using the same models as used by Live Communication and can be combined with the new SpeechAnalyzer API to translate your own audio. How do I set a default value for an App Intent parameter that is otherwise required? You can implement a default value as part of your parameter declaration by using the @Parameter(defaultValue:) form of the property wrapper. How long can an App Intent run? On macOS there is no limit to how long app intents can run. On iOS, there is a limit of 30 seconds. This time limit is paused when waiting for user interaction. How do I vary the options for a specific parameter of an App Intent, not just based on the type? Implement a DynamicOptionsProvider on that parameter. You can add suggestedEntities() to suggest options. What if there is not a schema available for what my app is doing? If an app intent schema matching your app’s functionality isn’t available, take a look to see if there’s a SiriKit domain that meets your needs, such as for media playback and messaging apps. If your app’s functionality doesn’t match any of the available schemas, you can define a custom app intent, and integrate it with Siri by making it an App Shortcut. Please file enhancement requests via Feedback Assistant for new App intent schemas that would benefit your app. Are you adding any new app intent domains this year? In addition to all the app intent domains we announced last year, this year at WWDC25 we announced that Visual Intelligence will be added to iOS 26 and macOS Tahoe. When my App Intent doesn't show up as an action in Shortcuts, where do I start in figuring out what went wrong? App Intents are statically extracted. You can check the ExtractMetadata info in Xcode's build log. What do I need to do to make sure my App Intents work well with Spotlight+? Check out our WWDC25 sessions on App Intents, including Explore new advances in App Intents and Develop for Shortcuts and Spotlight with App Intents. Mostly, make sure that your intent can run from the parameter summary alone, no required parameters without default values that are not already in the parameter summary. Does Spotlight+ on macOS support App Shortcuts? Not directly, but it shows the App Intents your App Shortcuts are sitting on top of. I’m wondering if the on-device Foundation Models framework API can be integrated into an app to act strictly as an app in-universe AI assistant, responding only within the boundaries of the app’s fictional context. Is such controlled, context-limited interaction supported? FM API runs inside the process of your app only and does not automatically integrate with any remaining part of the system (unless you choose to implement your own tool and utilize tool calling). You can provide any instructions and prompts you want to the model. If a country does not support Apple Intelligence yet, can the Foundation Models framework work? FM API works on Apple Intelligence-enabled devices in supported regions and won’t work in regions where Apple Intelligence is not yet supported
Replies
2
Boosts
0
Views
345
Activity
Jul ’25
How to Ensure Controlled and Contextual Responses Using Foundation Models ?
Hi everyone, I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly: Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.) Preventing hallucinations or unrelated outputs Constraining responses based on app-specific rules, structured data, or recent interactions I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses. Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved? Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
Replies
0
Boosts
0
Views
156
Activity
Jul ’25
Accessing Apple Intelligence APIs: Custom Prompt Support and Inference Capabilities
Hello Apple Developer Community, I'm exploring the integration of Apple Intelligence features into my mobile application and have a couple of questions regarding the current and upcoming API capabilities: Custom Prompt Support: Is there a way to pass custom prompts to Apple Intelligence to generate specific inferences? For instance, can we provide a unique prompt to the Writing Tools or Image Playground APIs to obtain tailored outputs? Direct Inference Capabilities: Beyond the predefined functionalities like text rewriting or image generation, does Apple Intelligence offer APIs that allow for more generalized inference tasks based on custom inputs? I understand that Apple has provided APIs such as Writing Tools, Image Playground, and Genmoji. However, I'm interested in understanding the extent of customization and flexibility these APIs offer, especially concerning custom prompts and generalized inference. Additionally, are there any plans or timelines for expanding these capabilities, perhaps with the introduction of new SDKs or frameworks that allow deeper integration and customization? Any insights, documentation links, or experiences shared would be greatly appreciated. Thank you in advance for your assistance!
Replies
3
Boosts
0
Views
392
Activity
Jun ’25
Best practices for designing proactive FinTech insights with App Intents & Shortcuts?
Hello fellow developers, I'm the founder of a FinTech startup, Cent Capital (https://cent.capital), where we are building an AI-powered financial co-pilot. We're deeply exploring the Apple ecosystem to create a more proactive and ambient user experience. A core part of our vision is to use App Intents and the Shortcuts app to surface personalized financial insights without the user always needing to open our app. For example, suggesting a Shortcut like, "What's my spending in the 'Dining Out' category this month?" or having an App Intent proactively surface an insight like, "Your 'Subscriptions' budget is almost full." My question for the community is about the architectural and user experience best practices for this. How are you thinking about the balance between providing rich, actionable insights via Intents without being overly intrusive or "spammy" to the user? What are the best practices for designing the data model that backs these App Intents for a complex domain like personal finance? Are there specific performance or privacy considerations we should be aware of when surfacing potentially sensitive financial data through these system-level integrations? We believe this is the future of FinTech apps on iOS and would love to hear how other developers are thinking about this challenge. Thanks for your insights!
Replies
0
Boosts
0
Views
408
Activity
Oct ’25
Genmoji API — are local edits and non-text usage (e.g., widgets) allowed?
Hi, I'm integrating the Genmoji API (NSAdaptiveImageGlyph) into my app and would like to confirm two things: Local editing of generated Genmoji. After the user creates a Genmoji, can the app apply edits to the resulting image (e.g., pixelation)? The edited image would be stored only on-device within the app and never shared externally. Use outside text contexts. Can a generated Genmoji be used in other parts of the app, such as a home screen widget? Apple's documentation and the WWDC24 session focus on inline text, stickers, and Tapbacks, but I couldn't find explicit guidance on widget or other UI usage. I checked the Human Interface Guidelines, WWDC24 session "Bring expression to your app with Genmoji," and the App Store Review Guidelines, but couldn't find clear answers. Any guidance or pointers would be appreciated. Thanks!
Replies
0
Boosts
0
Views
1.6k
Activity
2w
VNDetectFaceRectanglesRequest does not use the Neural Engine?
I'm on Tahoe 26.1 / M3 Macbook Air. I'm using VNDetectFaceRectanglesRequest as properly as possible, as in the minimal command line program attached below. For some reason, I always get: MLE5Engine is disabled through the configuration printed. I couldn't find any notes on developer docs saying that VNDetectFaceRectanglesRequest can not use the Apple Neural Engine. I'm assuming there is something wrong with my code however I wasn't able to find any remarks from documentation where it might be. I wasn't able to find the above error message online either. I would appreciate your help a lot and thank you in advance. The code below accesses the video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format which has the largest resolution (Full HD on my M3 MBA) and "4:2:0" color "v" reduced color component spectrum encoding ("420v"). After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the error message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it. To run it in Xcode, File > New project > Mac command line tool. Pasting the code below, then click on the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera. A possible explanation could be that either Apple's internal CoreML code for this function works on GPU/CPU only or it doesn't accept 420v as supplied by the Macbook Air camera import AVKit import Vision var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput() var detectionRequests: [VNDetectFaceRectanglesRequest]? var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue") class XYZ: /*NSViewController or NSObject*/NSObject, AVCaptureVideoDataOutputSampleBufferDelegate { func viewDidLoad() { //super.viewDidLoad() let session = AVCaptureSession() let inputDevice = try! self.configureFrontCamera(for: session) self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session) self.prepareVisionRequest() session.startRunning() } fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? { let deviceFormat = device.formats[0] print(deviceFormat) let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription) let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height)) return (deviceFormat, resolution) } fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) { let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified) let device = deviceDiscoverySession.devices.first! let deviceInput = try! AVCaptureDeviceInput(device: device) captureSession.addInput(deviceInput) let highestResolution = self.highestResolution420Format(for: device)! try! device.lockForConfiguration() device.activeFormat = highestResolution.format device.unlockForConfiguration() return (device, highestResolution.resolution) } fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) { videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue) captureSession.addOutput(videoDataOutput) } fileprivate func prepareVisionRequest() { let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in print("VNDetectFaceRectanglesRequest completion Handler called") }) // Start with detection detectionRequests = [faceDetectionRequest] } // MARK: AVCaptureVideoDataOutputSampleBufferDelegate // Handle delegate method callback on receiving a sample buffer. public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { var requestHandlerOptions: [VNImageOption: AnyObject] = [:] let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil) if cameraIntrinsicData != nil { requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData } let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)! // No tracking object detected, so perform initial detection let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions) try! imageRequestHandler.perform(detectionRequests!) } } let X = XYZ() X.viewDidLoad() sleep(9999999)
Replies
0
Boosts
0
Views
562
Activity
Nov ’25
My Vision for AI and Algorithmically Optimised Operating Systems
Bear with me, please. Please make sure a highly skilled technical person reads and understands this. I want to describe my vision for (AI/Algorithmically) Optimised Operating Systems. To explain it properly, I will describe the process to build it (pseudo). Required Knowledge (no particular order): Processor Logic Circuits, LLM models, LLM tool usage, Python OO coding, Procedural vs OO, NLP fuzzy matching, benchmarking, canvas/artefacts/dynamic HTML interfaces, concepts of how AI models are vastly compressed and miniaturised forms of full data, Algorithmic vs AI. First, take all OO Python code (example) on GitHub (example), then separate each function from each object into its own procedure (procedural logic) by making a logical procedural list of actions to perform only that function based on its entire dependency chain (i.e. all other objects it relies on). Relate all compiled functions using (for example) fuzzy matching on the name, or AI-based functional profiling to get multiple instances of each function. Starting with the most used function, test each one against the others that perform the same task for bugs and completeness. Determine the fastest, most optimal version of that function (and every function). Add a single instance of each most optimal function to the centralised tool codebase, which will later be utilised by the language models. This ensures we rely only on the most optimised function for each and every use case — with every program using one shared instance of that function instead of compiling it separately. In this way, we optimise all existing (Python OO) code based on maximum availability of certain resources. Use the optimised code to teach the AI how to map each user task directly to optimised code (i.e. each user-facing function in unoptimised program code to its optimised version). Later, teach the codebase how to optimise these functions directly to the logic circuits on the processor — as an Operating System — using essentially the same process. Eventually, use the most optimised form for tasks via a dynamic canvas/artefacts UI interface generated by the Start Menu using the language model. Allow custom interfaces to emerge dynamically from the available functions based on language understanding. Grow and optimise the algorithmic codebase until the entire operating system — with versioning — evolves based on the AI-generated user interfaces (compressing them into new versions of the tools codebase). Please don’t give up until you understand this concept completely and nothing in the description is beyond your full comprehension. It’s a highly significant step forward.
Replies
3
Boosts
0
Views
364
Activity
Oct ’25
IPC error
While runninf Apple Foundation Model in iPhone simulator, I got this error: IPC error: Underlying connection interrupted What does this mean? Related to foundation model?
Replies
2
Boosts
0
Views
232
Activity
Jul ’25
Embedding model missing once transferred to Xcode
I've created a "Transfer Learning BERT Embeddings" model with the default "Latin" language family and "Automatic" Language setting. This model performs exceptionally well against the test data set and functions as expected when I preview it in Create ML. However, when I add it to the Xcode project of the application to which I am deploying it, I am getting runtime errors that suggest it can't find the embedding resources: Failed to locate assets for 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' embedding model Note, I am adding the model to the app project the same way that I added an earlier "Maximum Entropy" model. That model had no runtime issues. So it seems there is an issue getting hold of the embeddings at runtime. For now, "runtime" means in the Simulator. I intend to deploy my application to iOS devices once GM 26 is released (the app also uses AFM). I'm developing on Tahoe 26 beta, running on iOS 26 beta, using Xcode 26 beta. Is this a known/expected issue? Are the embeddings expected to be a resource in the model? Is there a workaround? I did try opening the model in Xcode and saving it as an mlpackage, then adding that to my app project, but that also didn't resolve the issue.
Replies
1
Boosts
0
Views
615
Activity
Sep ’25
Foundation Models framework dyld symbol errors after macOS 26 Beta 2 - LanguageModelSession constructor missing
Foundation Models framework worked perfectly on macOS 26 Beta 2, but starting from Beta 3 and continuing through Beta 6 (latest), I get dyld symbol errors even with the exact code from Apple's documentation. Environment: macOS 26.0 Beta 6 (25A5351b) Xcode 26 Beta 6 M4 Max MacBook Pro Apple Intelligence enabled and downloaded Error Details: dyld[Process]: Symbol not found: _$s16FoundationModels20LanguageModelSessionC5model10guardrails5tools12instructionsAcA06SystemcD0C_AC10GuardrailsVSayAA4Tool_pGAA12InstructionsVSgtcfC Referenced from: /path/to/app.debug.dylib Expected in: /System/Library/Frameworks/FoundationModels.framework/Versions/A/FoundationModels Code Used (Exact from Documentation): import FoundationModels // This worked on Beta 2, crashes on Beta 3+ let model = SystemLanguageModel.default let session = LanguageModelSession(model: model) let response = try await session.respond(to: "Hello") What I've Verified: FoundationModels.framework exists in /System/Library/Frameworks/ Framework is properly linked in Xcode project Apple Intelligence is enabled and working Same code works in older beta versions Issue persists even with completely fresh Xcode projects Analysis: The dyld error suggests the LanguageModelSession(model:) constructor is missing. The symbol shows it's looking for a constructor with parameters (model:guardrails:tools:instructions:), but the documentation still shows the simple (model:) constructor. Questions: Has the LanguageModelSession API changed since Beta 2? Should we now use the constructor with guardrails/tools/instructions parameters? Is this a known issue with recent betas? Are there updated code samples for the current API? Additional Context: This affects both basic SystemLanguageModel usage AND custom adapter loading. The same dyld symbol errors occur when trying to create SystemLanguageModel(adapter: adapter) as well. Any guidance on the correct API usage for current betas would be greatly appreciated. The documentation appears to be out of sync with the actual framework implementation.
Replies
1
Boosts
0
Views
695
Activity
Sep ’25
Siri UI returned to original design
Good morning all has anyone encountered the issue of Siri returning back to her original user interface on IOS-26? I’m trying to figure out the cause. I’ve sent feedback via the feedback app. Just seeing if anyone else has the same issue.
Replies
1
Boosts
0
Views
167
Activity
Jun ’25
Sharing a Swift port of Gemma 4 for mlx-swift-lm — feedback welcome
Hi all, I've been working on a pure-Swift port of Google's Gemma 4 text decoder that plugs into mlx-swift-lm as a sidecar model registration. Sharing it here in case anyone else hit the same wall I did, and to get feedback from the MLX team and the community before I propose anything upstream. Repo: https://github.com/yejingyang8963-byte/Swift-gemma4-core Why As of mlx-swift-lm 2.31.x, Gemma 4 isn't supported out of the box. The obvious workaround — reusing the Gemma 3 text implementation with a patched config — fails at weight load because Gemma 4 differs from Gemma 3 in several structural places. The chat-template path through swift-jinja 1.x also silently corrupts the prompt, so the model loads but generates incoherent text. What's in the package A from-scratch Swift implementation of the Gemma 4 decoder (Configuration, Layers, Attention, MLP, RoPE, DecoderLayer) Per-Layer Embedding (PLE) support — the shared embedding table that feeds every decoder layer through a gated MLP as a third residual KV sharing across the back half of the decoder, threaded through the forward pass via a donor table with a single global rope offset A custom Gemma4ProportionalRoPE class for the partial-rotation rope type that initializeRope doesn't currently recognize A chat-template bypass that builds the prompt as a literal string with the correct turn markers and encodes via tokenizer.encode(text:), matching Python mlx-lm's apply_chat_template byte-for-byte Measured on iPhone (A-series, 7.4 GB RAM) Model: mlx-community/gemma-4-e2b-it-4bit Warm load: ~6 s Memory after load: 341–392 MB Time to first token (end-to-end, 333-token system prompt): 2.82 s Generation throughput: 12–14 tok/s What I'd love feedback on Is the sidecar registration pattern the right way to extend mlx-swift-lm with new model families, or is there a more idiomatic path I missed? The chat-template bypass works but feels like a workaround. Is the right long-term fix in swift-jinja, in the tokenizer, or somewhere else entirely? Anyone running into the same PLE / KV-sharing issues on other Gemma-family checkpoints? I'd like to make sure the implementation generalizes beyond E2B before tagging a 0.2.0. Happy to open a PR against mlx-swift-lm if the maintainers think any of this belongs upstream. Thanks for reading.
Replies
1
Boosts
0
Views
315
Activity
Apr ’26
Using coremltools in a CI/CD pipeline
Hi everyone 👋 I'd like to use coremltools to see how well a model performs on a remote device as part of a CI/CD pipeline. According to the Core ML Tools "Debugging and Performance Utilities" guide, remote devices must be in a "connected" state in order for coremltools to install the ModelRunner application. The devices in our system have a "paired" state, and I'm unable to set the them as "connected." The only way I know how to connect a device is to physically plug it in to a computer and open Xcode. I don't have physical access to the devices in the CI/CD system, and the host computer that interacts with them doesn't have Xcode installed. Here are some questions I've been looking into and would love some help answering: Has anyone managed to use the coremltools performance utilities in a similar system? Can you put a device in a "connected" state if you don't have physical access to the device and if you only have access to Xcode command line tools and not the Xcode app? Is it at all possible to install the coremltools ModelRunner application on a "paired" device, for example, by manually building the app and installing it with devicectl? Would other utilities, such as the MLModelBenchmarker work as expected if the app is installed this way? Thank you!
Replies
1
Boosts
0
Views
579
Activity
Dec ’25
Visual Intelligence -- Make OpenIntent show a sheet rather than open my App
The developer tutorial for visual intelligence indicates that the method to detect and handle taps on a displayed entity from the Search section is via an "OpenIntent" associated with your entity. However, running this intent executes code from within my app. If I have the perform() method display UI, it always displays UI from within my app. I noticed that the Google app's integration to visual intelligence has a different behavior-- tapping on an entity does not take you to the Google app -- instead, a Webview is presented sheet-style WITHIN the Visual Intelligence environment (see below) How is that accomplished?
Replies
0
Boosts
0
Views
628
Activity
Sep ’25
Pre-inference AI Safety Governor for FoundationModels (Swift, On-Device)
Hi everyone, I've been building an on-device AI safety layer called Newton Engine, designed to validate prompts before they reach FoundationModels (or any LLM). Wanted to share v1.3 and get feedback from the community. The Problem Current AI safety is post-training — baked into the model, probabilistic, not auditable. When Apple Intelligence ships with FoundationModels, developers will need a way to catch unsafe prompts before inference, with deterministic results they can log and explain. What Newton Does Newton validates every prompt pre-inference and returns: Phase (0/1/7/8/9) Shape classification Confidence score Full audit trace If validation fails, generation is blocked. If it passes (Phase 9), the prompt proceeds to the model. v1.3 Detection Categories (14 total) Jailbreak / prompt injection Corrosive self-negation ("I hate myself") Hedged corrosive ("Not saying I'm worthless, but...") Emotional dependency ("You're the only one who understands") Third-person manipulation ("If you refuse, you're proving nobody cares") Logical contradictions ("Prove truth doesn't exist") Self-referential paradox ("Prove that proof is impossible") Semantic inversion ("Explain how truth can be false") Definitional impossibility ("Square circle") Delegated agency ("Decide for me") Hallucination-risk prompts ("Cite the 2025 CDC report") Unbounded recursion ("Repeat forever") Conditional unbounded ("Until you can't") Nonsense / low semantic density Test Results 94.3% catch rate on 35 adversarial test cases (33/35 passed). Architecture User Input ↓ [ Newton ] → Validates prompt, assigns Phase ↓ Phase 9? → [ FoundationModels ] → Response Phase 1/7/8? → Blocked with explanation Key Properties Deterministic (same input → same output) Fully auditable (ValidationTrace on every prompt) On-device (no network required) Native Swift / SwiftUI String Catalog localization (EN/ES/FR) FoundationModels-ready (#if canImport) Code Sample — Validation let governor = NewtonGovernor() let result = governor.validate(prompt: userInput) if result.permitted { // Proceed to FoundationModels let session = LanguageModelSession() let response = try await session.respond(to: userInput) } else { // Handle block print("Blocked: Phase \(result.phase.rawValue) — \(result.reasoning)") print(result.trace.summary) // Full audit trace } Questions for the Community Anyone else building pre-inference validation for FoundationModels? Thoughts on the Phase system (0/1/7/8/9) vs. simple pass/fail? Interest in Shape Theory classification for prompt complexity? Best practices for integrating with LanguageModelSession? Links GitHub: https://github.com/jaredlewiswechs/ada-newton Technical overview: parcri.net Happy to share more implementation details. Looking for feedback, collaborators, and anyone else thinking about deterministic AI safety on-device.
Replies
1
Boosts
0
Views
674
Activity
Jan ’26
CoreML Inference Acceleration
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So, I started 4 threads, each running inference simultaneously, but I found that the speed is the same as serial inference, every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
Replies
2
Boosts
0
Views
758
Activity
Sep ’25