Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

All subtopics
Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Activity

A Summary of the WWDC25 Group Lab - Machine Learning and AI Frameworks
At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Machine Learning and AI Frameworks. What are you most excited about in the Foundation Models framework? The Foundation Models framework provides access to an on-device Large Language Model (LLM), enabling entirely on-device processing for intelligent features. This allows you to build features such as personalized search suggestions and dynamic NPC generation in games. The combination of guided generation and streaming capabilities is particularly exciting for creating delightful animations and features with reliable output. The seamless integration with SwiftUI and the new design material Liquid Glass is also a major advantage. When should I still bring my own LLM via CoreML? It's generally recommended to first explore Apple's built-in system models and APIs, including the Foundation Models framework, as they are highly optimized for Apple devices and cover a wide range of use cases. However, Core ML is still valuable if you need more control or choice over the specific model being deployed, such as customizing existing system models or augmenting prompts. Core ML provides the tools to get these models on-device, but you are responsible for model distribution and updates. Should I migrate PyTorch code to MLX? MLX is an open-source, general-purpose machine learning framework designed for Apple Silicon from the ground up. It offers a familiar API, similar to PyTorch, and supports C, C++, Python, and Swift. MLX emphasizes unified memory, a key feature of Apple Silicon hardware, which can improve performance. It's recommended to try MLX and see if its programming model and features better suit your application's needs. MLX shines when working with state-of-the-art, larger models. Can I test Foundation Models in Xcode simulator or device? Yes, you can use the Xcode simulator to test Foundation Models use cases. However, your Mac must be running macOS Tahoe. You can test on a physical iPhone running iOS 18 by connecting it to your Mac and running Playgrounds or live previews directly on the device. Which on-device models will be supported? any open source models? The Foundation Models framework currently supports Apple's first-party models only. This allows for platform-wide optimizations, improving battery life and reducing latency. While Core ML can be used to integrate open-source models, it's generally recommended to first explore the built-in system models and APIs provided by Apple, including those in the Vision, Natural Language, and Speech frameworks, as they are highly optimized for Apple devices. For frontier models, MLX can run very large models. How often will the Foundational Model be updated? How do we test for stability when the model is updated? The Foundation Model will be updated in sync with operating system updates. You can test your app against new model versions during the beta period by downloading the beta OS and running your app. It is highly recommended to create an "eval set" of golden prompts and responses to evaluate the performance of your features as the model changes or as you tweak your prompts. Report any unsatisfactory or satisfactory cases using Feedback Assistant. Which on-device model/API can I use to extract text data from images such as: nutrition labels, ingredient lists, cashier receipts, etc? Thank you. The Vision framework offers the RecognizeDocumentRequest which is specifically designed for these use cases. It not only recognizes text in images but also provides the structure of the document, such as rows in a receipt or the layout of a nutrition label. It can also identify data like phone numbers, addresses, and prices. What is the context window for the model? What are max tokens in and max tokens out? The context window for the Foundation Model is 4,096 tokens. The split between input and output tokens is flexible. For example, if you input 4,000 tokens, you'll have 96 tokens remaining for the output. The API takes in text, converting it to tokens under the hood. When estimating token count, a good rule of thumb is 3-4 characters per token for languages like English, and 1 character per token for languages like Japanese or Chinese. Handle potential errors gracefully by asking for shorter prompts or starting a new session if the token limit is exceeded. Is there a rate limit for Foundation Models API that is limited by power or temperature condition on the iPhone? Yes, there are rate limits, particularly when your app is in the background. A budget is allocated for background app usage, but exceeding it will result in rate-limiting errors. In the foreground, there is no rate limit unless the device is under heavy load (e.g., camera open, game mode). The system dynamically balances performance, battery life, and thermal conditions, which can affect the token throughput. Use appropriate quality of service settings for your tasks (e.g., background priority for background work) to help the system manage resources effectively. Do the foundation models support languages other than English? Yes, the on-device Foundation Model is multilingual and supports all languages supported by Apple Intelligence. To get the model to output in a specific language, prompt it with instructions indicating the user's preferred language using the locale API (e.g., "The user's preferred language is en-US"). Putting the instructions in English, but then putting the user prompt in the desired output language is a recommended practice. Are larger server-based models available through Foundation Models? No, the Foundation Models API currently only provides access to the on-device Large Language Model at the core of Apple Intelligence. It does not support server-side models. On-device models are preferred for privacy and for performance reasons. Is it possible to run Retrieval-Augmented Generation (RAG) using the Foundation Models framework? Yes, it is possible to run RAG on-device, but the Foundation Models framework does not include a built-in embedding model. You'll need to use a separate database to store vectors and implement nearest neighbor or cosine distance searches. The Natural Language framework offers simple word and sentence embeddings that can be used. Consider using a combination of Foundation Models and Core ML, using Core ML for your embedding model.
1
0
915
Jun ’25
Metal recommendedMaxWorkingSetSize vs actual RAM on iPhone (LLM load fails)
Context I’m deploying large language models on iPhone using llama.cpp. A new iPhone Air (12 GB RAM) reports a Metal MTLDevice.recommendedMaxWorkingSetSize of 8,192 MB, and my attempt to load Llama-2-13B Q4_K (~7.32 GB weights) fails during model initialization. Environment Device: iPhone Air (12 GB RAM) iOS: 26 Xcode: 26.0.1 Build: Metal backend enabled llama.cpp App runs on device (not Simulator) What I’m seeing MTLCreateSystemDefaultDevice().recommendedMaxWorkingSetSize == 8192 MiB Loading Llama-2-13B Q4_K (7.32 GB) fails to complete. Logs indicate memory pressure / allocation issues consistent with the 8 GB working-set guidance. Smaller models (e.g., 7B/8B with similar quantization) load and run (8B Q4_K provide around 9 tokens/second decoding speed). Questions Is 8,192 MB an expected recommendedMaxWorkingSetSize on a 12 GB iPhone? What values should I expect on other 2025 devices including iPhone 17 (8 GB RAM) and iPhone 17 Pro (12 GB RAM) Is it strictly enforced by Metal allocations (heaps/buffers), or advisory for best performance/eviction behavior? Can a process practically exceed this for long-lived buffers without immediate Jetsam risk? Any guidance for LLM scenarios near the limit?
0
0
97
13h
Model Guardrails Too Restrictive?
I'm experimenting with using the Foundation Models framework to do news summarization in an RSS app but I'm finding that a lot of articles are getting kicked back with a vague message about guardrails. This seems really common with political news but we're talking mainstream stuff, i.e. Politico, etc. If the models are this restrictive, this will be tough to use. Is this intended? FB17904424
8
4
343
13h
tensorflow-metal fails with tensorflow > 2.18.1
Also submitted as feedback (ID: FB20612561). Tensorflow-metal fails on tensorflow versions above 2.18.1, but works fine on tensorflow 2.18.1 In a new python 3.12 virtual environment: pip install tensorflow pip install tensor flow-metal python -c "import tensorflow as tf" Prints error: Traceback (most recent call last): File "", line 1, in File "/Users//pt/venv/lib/python3.12/site-packages/tensorflow/init.py", line 438, in _ll.load_library(_plugin_dir) File "/Users//pt/venv/lib/python3.12/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library py_tf.TF_LoadLibrary(lib) tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users//pt/venv/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Library not loaded: @rpath/_pywrap_tensorflow_internal.so Referenced from: <8B62586B-B082-3113-93AB-FD766A9960AE> /Users//pt/venv/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib Reason: tried: '/Users//pt/venv/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/_pywrap_tensorflow_internal.so' (no such file), '/Users//pt/venv/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/_pywrap_tensorflow_internal.so' (no such file), '/opt/homebrew/lib/_pywrap_tensorflow_internal.so' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/lib/_pywrap_tensorflow_internal.so' (no such file)
3
2
1.1k
1d
CoreML regression between macOS 26.0.1 and macOS 26.1 Beta causing scrambled tensor outputs
We’ve encountered what appears to be a CoreML regression between macOS 26.0.1 and macOS 26.1 Beta. In macOS 26.0.1, CoreML models run and produce correct results. However, in macOS 26.1 Beta, the same models produce scrambled or corrupted outputs, suggesting that tensor memory is being read or written incorrectly. The behavior is consistent with a low-level stride or pointer arithmetic issue — for example, using 16-bit strides on 32-bit data or other mismatches in tensor layout handling. Reproduction Install ON1 Photo RAW 2026 or ON1 Resize 2026 on macOS 26.0.1. Use the newest Highest Quality resize model, which is Stable Diffusion–based and runs through CoreML. Observe correct, high-quality results. Upgrade to macOS 26.1 Beta and run the same operation again. The output becomes visually scrambled or corrupted. We are also seeing similar issues with another Stable Diffusion UNet model that previously worked correctly on macOS 26.0.1. This suggests the regression may affect multiple diffusion-style architectures, likely due to a change in CoreML’s tensor stride, layout computation, or memory alignment between these versions. Notes The affected models are exported using standard CoreML conversion pipelines. No custom operators or third-party CoreML runtime layers are used. The issue reproduces consistently across multiple machines. It would be helpful to know if there were changes to CoreML’s tensor layout, precision handling, or MLCompute backend between macOS 26.0.1 and 26.1 Beta, or if this is a known regression in the current beta.
0
0
400
4d
Missing module 'coremltools.libmilstoragepython'
Hello! I'm following the Foundation Models adapter training guide (https://developer.apple.com/apple-intelligence/foundation-models-adapter/) on my NVIDIA DGX Spark box. I'm able to train on my own data but the example notebook fails when I try to export the artifact as an fmadapter. I get the following error for the code block I'm trying to run. I haven't touched any of the code in the export folder. I tried exporting it on my Mac too and got the same error as well (given below). Would appreciate some more clarity around this. Thank you. Code Block: from export.export_fmadapter import Metadata, export_fmadapter metadata = Metadata( author="3P developer", description="An adapter that writes play scripts.", ) export_fmadapter( output_dir="./", adapter_name="myPlaywritingAdapter", metadata=metadata, checkpoint="adapter-final.pt", draft_checkpoint="draft-model-final.pt", ) Error: --------------------------------------------------------------------------- ModuleNotFoundError Traceback (most recent call last) Cell In[10], line 1 ----> 1 from export.export_fmadapter import Metadata, export_fmadapter 3 metadata = Metadata( 4 author="3P developer", 5 description="An adapter that writes play scripts.", 6 ) 8 export_fmadapter( 9 output_dir="./", 10 adapter_name="myPlaywritingAdapter", (...) 13 draft_checkpoint="draft-model-final.pt", 14 ) File /workspace/export/export_fmadapter.py:11 8 from typing import Any 10 from .constants import BASE_SIGNATURE, MIL_PATH ---> 11 from .export_utils import AdapterConverter, AdapterSpec, DraftModelConverter, camelize 13 logger = logging.getLogger(__name__) 16 class MetadataKeys(enum.StrEnum): File /workspace/export/export_utils.py:15 13 import torch 14 import yaml ---> 15 from coremltools.libmilstoragepython import _BlobStorageWriter as BlobWriter 16 from coremltools.models.neural_network.quantization_utils import _get_kmeans_lookup_table_and_weight 17 from coremltools.optimize._utils import LutParams ModuleNotFoundError: No module named 'coremltools.libmilstoragepython'
4
0
434
5d
Core ML Model Performance report shows prediction speed much faster than actual app runs
Hi all, I'm tuning my app prediction speed with Core ML model. I watched and tried the methods in video: Improve Core ML integration with async prediction and Optimize your Core ML usage. I also use instruments to look what's the bottleneck that my prediction speed cannot be faster. Below is the instruments result with my app. its prediction duration is 10.29ms And below is performance report shows the average speed of prediction is 5.55ms, that is about half time of my app prediction! Below is part of my instruments records. I think the prediction should be considered quite frequent. Could it be faster? How to be the same prediction speed as performance report? The prediction speed on macbook Pro M2 is nearly the same as macbook Air M1!
5
0
1.1k
1w
Using #Preview with a PartialyGenerated model
I have an app that streams in data from the Foundation Model and I have a card that shows one of the outputs. I want my card to accept a partially generated model but I keep getting a nonsensical error. The error I get on line 59 is: Cannot convert value of type 'FrostDate.VegetableSuggestion.PartiallyGenerated' (aka 'FrostDate.VegetableSuggestion') to expected argument type 'FrostDate.VegetableSuggestion.PartiallyGenerated' Here is my card with preview: import SwiftUI import FoundationModels struct VegetableSuggestionCard: View { let vegetableSuggestion: VegetableSuggestion.PartiallyGenerated init(vegetableSuggestion: VegetableSuggestion.PartiallyGenerated) { self.vegetableSuggestion = vegetableSuggestion } var body: some View { VStack(alignment: .leading, spacing: 8) { if let name = vegetableSuggestion.vegetableName { Text(name) .font(.headline) .frame(maxWidth: .infinity, alignment: .leading) } if let startIndoors = vegetableSuggestion.startSeedsIndoors { Text("Start indoors: \(startIndoors)") .frame(maxWidth: .infinity, alignment: .leading) } if let startOutdoors = vegetableSuggestion.startSeedsOutdoors { Text("Start outdoors: \(startOutdoors)") .frame(maxWidth: .infinity, alignment: .leading) } if let transplant = vegetableSuggestion.transplantSeedlingsOutdoors { Text("Transplant: \(transplant)") .frame(maxWidth: .infinity, alignment: .leading) } if let tips = vegetableSuggestion.tips { Text("Tips: \(tips)") .foregroundStyle(.secondary) .frame(maxWidth: .infinity, alignment: .leading) } } .padding(16) .frame(maxWidth: .infinity, alignment: .leading) .background( RoundedRectangle(cornerRadius: 16, style: .continuous) .fill(.background) .overlay( RoundedRectangle(cornerRadius: 16, style: .continuous) .strokeBorder(.quaternary, lineWidth: 1) ) .shadow(color: Color.black.opacity(0.05), radius: 6, x: 0, y: 2) ) } } #Preview("Vegetable Suggestion Card") { let sample = VegetableSuggestion.PartiallyGenerated( vegetableName: "Tomato", startSeedsIndoors: "6–8 weeks before last frost", startSeedsOutdoors: "After last frost when soil is warm", transplantSeedlingsOutdoors: "1–2 weeks after last frost", tips: "Harden off seedlings; provide full sun and consistent moisture." ) VegetableSuggestionCard(vegetableSuggestion: sample) .padding() .previewLayout(.sizeThatFits) }
1
0
33
1w
Data used for MLX fine-tuning
The WWDC25: Explore large language models on Apple silicon with MLX video talks about using your own data to fine-tune a large language model. But the video doesn't explain what kind of data can be used. The video just shows the command to use and how to point to the data folder. Can I use PDFs, Word documents, Markdown files to train the model? Are there any code examples on GitHub that demonstrate how to do this?
2
0
139
1w
Avoid hallucinations and information from trainning data
Hi For certain tasks, such as qualitative analysis or tagging, it is advisable to provide the AI with the option to respond with a joker / wild card answer when it encounters difficulties in tagging or scoring. For instance, you can include this slot in the prompt as follows: output must be "not data to score" when there isn't information to score. In the absence of these types of slots, AI trends to provide a solution even when there is insufficient information. Foundations Models are told to be prompted with simple prompts. I wonder: Is recommended keep this slot though adds verbose complexity? Is the best place the comment of a guided attribute? other tips? Another use case is when you want the AI to be tied to the information provided in the prompt and not take information from its data set. What is the best approach to this purpose? Thanks in advance for any suggestion.
4
0
736
1w
Siri 2.0 (suggests and future updates)
Hey dear developers! This post should be available for the future Siri updates and improvements but also for wishes in this forum so that everyone can share their opinion and idea please stay friendly. have fun! I had already thought about developing a demo app to demonstrate my idea for a better Siri. My change of many: Wish Update: Siri's language recognition capabilities have been significantly enhanced. Instead of manually setting the language, Siri can now automatically recognize the language you intend to use, making language switching much more efficient. Simply speak the language you want to communicate in, and Siri will automatically recognize it and respond accordingly. Whether you speak English, German, or Japanese, Siri will respond in the language you choose.
1
1
735
2w
Visual Intelligence API SemanticContentDescriptor labels are empty
I'm trying to use Apple's new Visual Intelligence API for recommending content through screenshot image search. The problem I encountered is that the SemanticContentDescriptor labels are either completely empty or super misleading, making it impossible to query for similar content on my app. Even the closest matching example was inaccurate, returning a single label ["cardigan"] for a Supreme T-Shirt. I see other apps using this API like Etsy for example, and I'm wondering if they're using the input pixel buffer to query for similar content rather than using the labels? If anyone has a similar experience or something that wasn't called out in the documentation please lmk! Thanks.
1
0
258
2w
Foundation Models inside of DeviceActivityReport?
Pretty much as per the title and I suspect I know the answer. Given that Foundation Models run on device, is it possible to use Foundation Models framework inside of a DeviceActivityReport? I've been tinkering with it, and all I get is errors and "Sandbox restrictions". Am I missing something? Seems like a missed trick to utilise on device AI/ML with other frameworks.
1
0
350
2w
Updated DetectHandPoseRequest revision from WWDC25 doesn't exist
I watched this year WWDC25 "Read Documents using the Vision framework". At the end of video there is mention of new DetectHandPoseRequest model for hand pose detection in Vision API. I looked Apple documentation and I don't see new revision. Moreover probably typo in video because there is only DetectHumanPoseRequst (swift based) and VNDetectHumanHandPoseRequest (obj-c based) (notice lack of Human prefix in WWDC video) First one have revision only added in iOS 18+: https://developer.apple.com/documentation/vision/detecthumanhandposerequest/revision-swift.enum/revision1 Second one have revision only added in iOS14+: https://developer.apple.com/documentation/vision/vndetecthumanhandposerequestrevision1 I don't see any new revision targeting iOS26+
0
0
74
2w