Foundation Models

Discuss the Foundation Models framework, which provides access to Apple’s on-device large language model that powers Apple Intelligence to help you perform intelligent tasks specific to your app.

Foundation Models Documentation

Posts under Foundation Models subtopic

Post

Replies

Boosts

Views

Activity

Structured intents vs free-form queries
For voice assistants with many capabilities, is it better to ship one generic ‘ask assistant’ intent with a natural-language parameter, or many typed intents (GetForecast, CompareLocations, etc.)? What are Siri’s limits on disambiguation and follow-up turns?
Replies
1
Boosts
0
Views
45
Activity
1w
Siri without opening the app
Can App Intents perform authenticated backend calls (Bearer token in Keychain / App Group) and return structured results to Siri, or must execution always launch the host app first?
Replies
1
Boosts
0
Views
46
Activity
1w
Creating an in-universe AI computer in my app
Last year, after Apple's Foundation Models framework was introduced, I began working on a separate test Playground project to see how to use the framework to create an AI computer in my app that only has knowledge of in-universe content from within my app. Now that the OS 27 updates are released, I’m going back to work on that. I believe I can use the on-device system foundation model comfortably because I don’t think there’s a lot of content in my app that the AI has to know about. Do you have any advice for using instructions to tell the model to stay within the knowledge boundaries of my app universe? Or might there be new tools this year in the Foundation Models framework that would help me achieve the limited knowledge scope that I want the AI to recognize and respond to for my app users?
Replies
1
Boosts
0
Views
42
Activity
1w
React Native + native AI bridge
What’s the supported integration path for Foundation Models and Apple Intelligence from a React Native app — thin Swift native module, App Intents only, or are these features effectively Swift-first?
Replies
2
Boosts
0
Views
31
Activity
1w
Privacy, personalization, and App Store expectations
We offer cloud-based AI (subscription) and are exploring on-device Apple Intelligence features. What user profile data is appropriate to inject into on-device model sessions under Apple’s privacy guidelines, and how should apps disclose hybrid cloud + on-device AI in privacy nutrition labels and review?
Replies
1
Boosts
0
Views
41
Activity
1w
On Performance & Backgrounding
While we now know about the continued-processing.gpu entitlement for background tasks, is there a similar NPU-specific entitlement or priority flag to ensure that an on-device foundation model isn't preempted by system-level Apple Intelligence features while the app is in the background?
Replies
1
Boosts
0
Views
28
Activity
1w
Time Series Models
The Foundation Models framework is clearly designed around language, but there's a large class of on-device AI tasks that are not language tasks at all. Time series forecasting is one example: think energy consumption modeling or sensor anomaly detection. These models take sequences of numeric data and output probabilistic forecasts. No text is involved at any layer. Is there any intention to extend Foundation Models or a sibling framework to non-language modalities, specifically structured numeric and time series inference?
Replies
1
Boosts
1
Views
53
Activity
1w
Speech recognition with large, dynamic vocabularies
Our users speak proper nouns and domain terms (place names, product jargon) that change frequently. What’s the best practice for improving recognition accuracy: dynamic contextual strings, on-device custom language resources, periodic vocabulary sync, or something else in the current Speech APIs?
Replies
1
Boosts
0
Views
30
Activity
1w
Hobbyist Eligibility for App Store Small Business Program
As a hobbyist developer, I develop apps mostly for my own use, and want to try using Private Cloud Compute. As per https://developer.apple.com/private-cloud-compute I have applied for the App Store Small Business Program. Am I eligible for this and for PCC access if I have no apps in the App Store? I do have one in TestFlight.
Replies
0
Boosts
1
Views
50
Activity
1w
The standalone Siri app and cross-surface continuity
The new standalone Siri app keeps conversation history synced via iCloud across iPhone, iPad, and Mac. Can third-party content, results, or an app's agent surface appear inside the Siri app (e.g., as referenced sources or follow-up actions), and can the user deep-link from a Siri-app result back into the originating app with state intact? Is any conversation context from the Siri app exposed to a developer's intent when an action is invoked, so the app can act with the relevant context, and what are the privacy boundaries on that? When the same action is invoked from different surfaces (in-app, system Siri, the Siri app) and across synced devices, how should developers reason about execution location and idempotency to avoid duplicate side effects?
Replies
0
Boosts
0
Views
12
Activity
1w
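On the idempotency part of the post above, one common backend-side approach is to deduplicate on an idempotency key, so that the same logical request arriving from several surfaces or devices runs its side effect once. The sketch below is a minimal in-memory illustration, not an Apple-documented pattern. The names `IdempotentExecutor` and `run` are hypothetical, and it assumes the caller can supply a stable key per logical request; whether the Siri surfaces expose such an identifier is exactly what the post asks.

```python
import threading
from typing import Any, Callable, Dict


class IdempotentExecutor:
    """Runs each keyed action at most once and replays the stored result.

    Hypothetical helper for illustration; state is in memory only.
    """

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def run(self, key: str, action: Callable[[], Any]) -> Any:
        # The lock is held while the action runs, which serializes all
        # actions. That keeps the sketch simple but would not scale.
        with self._lock:
            if key in self._results:
                # Duplicate request: return the earlier result, no side effect.
                return self._results[key]
            # If the action raises, nothing is stored, so a retry can run it.
            result = action()
            self._results[key] = result
            return result
```

A production version would keep the keys in durable storage shared by every execution location and expire them after a retention window.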
LLM search using Core Spotlight
If your app creates an App Entity that conforms to an Apple Intelligence schema, Siri AI can only reason over the schema-defined properties (see this thread). But as a developer, I can add more optional properties on my App Entity with additional metadata about the entity. If my app contributes these App Entities to Spotlight as indexed entities, is SpotlightSearchTool also limited to reasoning over just the schema-defined properties, or are these unrelated concepts? Will these additional optional properties on my App Entity enable a deeper SpotlightSearchTool-powered search experience around these entities?
Replies
2
Boosts
0
Views
179
Activity
1w
Tool calling: App Intents vs server-side orchestration
For assistants that need multi-step tool use (search → fetch → compare → respond), should third-party apps expose capabilities as App Intents for on-device model selection, or keep tool orchestration on the server and use on-device models only for speech and summarization? What breaks when the same action exists in both places?
Replies
1
Boosts
0
Views
67
Activity
1w
RAG support
What kind of out-of-the-box on-device RAG support exists in the Foundation Models framework (vector DBs, embedding methods, agentic RAG hooks, etc.)?
Replies
0
Boosts
0
Views
30
Activity
1w
Hybrid assistant architecture (on-device model + server tools)
We run a conversational assistant where answers depend on live API data, not just static knowledge. What is Apple’s recommended split between on-device Foundation Models (intent, routing, summarization, privacy-sensitive context) and server-side tool execution? Is there an official pattern for a local planner with a remote executor?
Replies
0
Boosts
0
Views
15
Activity
1w
Visual Intelligence and screen/camera understanding for third-party apps
Visual Intelligence lets users ask Siri about what the camera or screen shows, and the screenshot tool can extract structured data into system apps. Can a third-party app contribute results or actions when the user invokes Visual Intelligence over the app's own content or a screenshot of it (analogous to how a schedule becomes calendar events), and what API surfaces that? For the Image Playground API, what are the content, rate, and style constraints, and can generated assets be used in commercial app contexts? Is there a supported way for an app to provide its own visual understanding to the system rather than relying solely on Apple's model — for domain-specific imagery the on-device model may not recognize?
Replies
1
Boosts
0
Views
67
Activity
1w
Improved Guardrails Error Handling
I work on an app called one sec, which helps people reduce the amount of time they spend on social media by interrupting app openings with Shortcuts automations. With the release of iOS 26, we added a new Conversational Reflection interruption, a feature backed by Foundation Models. The user talks through their reasoning for wanting to use social media. A significant fraction of our users have ADHD or ADD, or struggle with mental health. As a result, we try to show crisis support banners in our conversation UI. We do so with structured outputs, asking the language model if the user appears to be in deep distress. However, the guardrails are often triggered and we don’t receive the response from the language model, meaning we can’t support our users and show them related resources if needed. We’d love to see more specific guardrails errors introduced in the framework to better support our users. Here’s a radar with further details: FB20828230. Thank you!
Replies
1
Boosts
0
Views
78
Activity
1w
Is AFM 3 Core a CoreAI model?
Are the on-device Apple foundation models like AFM 3 Core shipped as CoreAI models or do they use some different technology? Is it possible to open them in the Core AI Debugger to understand them in detail?
Replies
1
Boosts
0
Views
122
Activity
1w
Private Cloud Compute trust model across multiple cloud vendors
Reports indicate PCC now extends to NVIDIA hardware in Google Cloud datacenters, and the flagship cloud model is refined using Gemini outputs. Now that PCC spans infrastructure outside Apple's own datacenters, what attestation or verifiable transparency is available to developers and users about where a given request was processed, and do the original "data unreachable even by Apple" guarantees hold unchanged across all hardware vendors? For apps with enterprise or regulated users, is there documented data residency behavior for PCC and for third-party model routing, and any contractual/compliance posture (e.g., regional pinning) developers can rely on? Given the EU and China availability gaps at launch, what is the recommended graceful-degradation path for apps that must function in those regions — fall back to on-device only, to a developer-supplied provider, or disable AI features? Does routing to a third-party cloud provider through the framework carry the same PCC privacy guarantees, or are those guarantees specific to Apple's own cloud models?
Replies
0
Boosts
0
Views
10
Activity
1w
Confirmation, permissions, and reversibility for agentic actions
Apple demonstrated agentic behavior (e.g., the Passwords app changing credentials on the user's behalf), and Siri AI can now take systemwide actions in apps. Is there a first-class confirmation API for App Intents — a way to mark an action as requiring explicit user approval before execution, with a standard confirmation surface — or must developers build their own confirmation UI inside the intent? For irreversible or high-impact actions, what is Apple's recommended pattern to prevent the model from executing them autonomously, and can an intent declare a risk/sensitivity level the system respects? When Siri AI invokes an action, what authentication/authorization context is available to the intent (biometric gate, user-presence assertion), and how should an app require step-up auth for sensitive operations? Is there a supported audit trail for actions taken via Siri AI on the user's behalf, so an app can show the user what was done and when? How does the system handle an action that fails or partially completes during an agentic, multi-step flow?
Replies
1
Boosts
1
Views
91
Activity
1w
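Until the questions in the post above are answered, an app can enforce its own gate on the backend side so that an irreversible action never runs without an explicit approval flag. The sketch below is only an application-level illustration under that assumption. `Action`, `ActionGate`, `ConfirmationRequired`, and the `irreversible` and `user_confirmed` names are hypothetical and are not part of App Intents or any Apple API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


class ConfirmationRequired(Exception):
    """Raised when an irreversible action is requested without approval."""


@dataclass(frozen=True)
class Action:
    name: str
    handler: Callable[[], str]
    irreversible: bool = False  # hypothetical app-defined sensitivity flag


class ActionGate:
    """Registry that refuses irreversible actions lacking user approval."""

    def __init__(self) -> None:
        self._actions: Dict[str, Action] = {}

    def register(self, action: Action) -> None:
        self._actions[action.name] = action

    def execute(self, name: str, user_confirmed: bool = False) -> str:
        action = self._actions[name]
        if action.irreversible and not user_confirmed:
            # The caller is expected to show its own confirmation UI
            # and call again with user_confirmed=True.
            raise ConfirmationRequired(name)
        return action.handler()
```

The gate only records that approval was claimed; proving user presence (for example with a biometric check) would have to happen before the flag is set.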
Can the SpotlightSearchTool work with a custom model executor?
When SpotlightSearchTool is used with a custom LanguageModel backend (for example Apple’s ChatCompletionsLanguageModel from apple/foundation-models-utilities, pointed at an OpenAI-compatible server), the tool can never be successfully invoked. The model produces tool-call arguments that exactly match the format documented in the tool’s own description, but those arguments fail validation against the tool’s generated parameters JSON Schema, throwing LanguageModelSession.ToolCallError with underlying error “Failed to parse generated content.” The root cause is a mismatch between two things the framework sends to the model in the same tool definition: the human-readable description (“Call format”), which presents the top-level arguments as { root, modelComposition, … }, and the parameters JSON Schema (FullArguments), which requires { "query": { "type": "search", "value": { root, modelComposition, … } } }. A model that follows the description is guaranteed to fail the schema. Secondary observation (may be a separate issue or intended): CoreSpotlightSource.fetchAttributes appears to have no effect on which attributes are returned to the model on this agentic-search path. Even with fetchAttributes: [.title, .contentDescription] set on the source, results contain only default metadata (kMDItemTitle, kMDItemDisplayName, dates, identifiers) and omit kMDItemDescription. The description is returned only when the in-query SearchArguments.fetchAttributes explicitly lists it. The searchableIndexDelegate was never invoked in any configuration tried (including .dynamic). If the source-level fetchAttributes is meant to drive returned attributes, that also seems incorrect; otherwise, clarifying the docs would help. So my question is: is this just not supported, or does the schema need an update? Or is there a different way this should be done?
Replies
1
Boosts
0
Views
80
Activity
1w
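To make the mismatch described in the post above concrete, the sketch below wraps flat tool-call arguments in the nested envelope that the post reports the parameters schema requires. It is a possible workaround to try on the OpenAI-compatible server side, not a confirmed fix, and it has not been checked against the framework. The function name `wrap_search_arguments` is hypothetical, the envelope shape is taken from the post, and the `root` and `modelComposition` values in the usage example are placeholders.

```python
import json


def wrap_search_arguments(raw_arguments: str) -> str:
    """Wrap flat arguments as {"query": {"type": "search", "value": ...}}.

    Arguments that already carry the envelope are returned unchanged,
    so applying the function twice is harmless.
    """
    args = json.loads(raw_arguments)
    query = args.get("query") if isinstance(args, dict) else None
    if isinstance(query, dict) and query.get("type") == "search":
        # Already in the nested form reported by the post.
        return json.dumps(args)
    return json.dumps({"query": {"type": "search", "value": args}})
```

For example, with placeholder values:

```python
flat = '{"root": "placeholder-root", "modelComposition": "placeholder"}'
print(wrap_search_arguments(flat))
```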