Questions About Apple Foundation Models, Context Window Limits and the New Core AI Framework

After reviewing the WWDC sessions on Foundation Models and Core AI, I had a few questions around the practical limits and architectural direction of the platform.

From my understanding, on-device Foundation Models remain optimized for privacy, latency, and efficiency, which naturally introduces constraints around context length and agent complexity. Has anything changed regarding the effective context window available to developers, or should we still design around similar context-management constraints as before?

Core AI appears to introduce a more structured approach to building AI-powered applications. For developers building sophisticated assistants, how should we think about the boundary between application-level orchestration and framework-level orchestration? For example, are advanced patterns such as sub-agents, hierarchical planning, dynamic tool availability, and workflow decomposition expected to remain developer-managed, or are these areas Core AI aims to support more directly over time?

I am also curious about Apple's vision for model interoperability. While Foundation Models provide an excellent on-device experience, many production-grade agent systems combine multiple specialized models for planning, reasoning, retrieval, and execution. Does Apple envision future pathways for integrating external models into Core AI driven workflows while maintaining the privacy and performance principles of the platform?

Finally, for teams pushing the limits of on-device AI assistants, what architectural patterns do you recommend for handling long-horizon tasks, large context requirements, evolving toolsets, and multi-step reasoning within the current Foundation Models ecosystem?

Questions About Apple Foundation Models, Context Window Limits and the New Core AI Framework
 
 
Q