I am follwing this tutorial:
https://apple.github.io/coremltools/docs-guides/source/convert-a-torchvision-model-from-pytorch.html
I have obtained simialr result using the python code.
However when I view it in Xcode, the preview prediction percentage confidence is way off I suspect it is due the the output of the model, which is in percentage already and in Xcode it multiply 100 again leading to this result. Please give me any feedback to fix this, thank you.
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hi, guys. I'm writing about Apple Intelligence and I reached the point I have to explain App Intent Domains
https://developer.apple.com/documentation/AppIntents/app-intent-domains
but I noticed that there is a note explaining that these services are not available with Siri. I tried the example provided by Apple at
https://developer.apple.com/documentation/AppIntents/making-your-app-s-functionality-available-to-siri
and I can only make the intents work from the Shortcuts App, but not from Siri.
Is this correct. App Intent Domains are still not available with Siri?
Thanks
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hello, I'm using videotoolbox superresolution API in MACOS 26: https://developer.apple.com/documentation/videotoolbox/vtsuperresolutionscalerconfiguration/downloadconfigurationmodel(completionhandler:)?language=objc, when using swift, it's ok, when using objective-c, I get error when downloading model with downloadConfigurationModelWithCompletionHandler:
[Auto] MA-auto{_failedLockContent} | failure reported by server | error:[com.apple.MobileAssetError.AutoAsset:MissingReference(6111)]
[Auto] MA-auto{_failedLockContent} | failure reported by server | error:[com.apple.MobileAssetError.AutoAsset:UnderlyingError(6107)_1_com.apple.MobileAssetError.Download:47]
Download completion handler called with error: The operation couldnxe2x80x99t be completed. (VTFrameProcessorErrorDomain error -19743.)
Is foundation models matured enough to take input from the Apple Vision framework to generate responses? Something similar to what google's gemini does although in a much smaller scale and for a very specific niche.
Hi, I'm currently using Metal Performance Shaders Graph (MPSGraphExecutable) to run neural network inference operations as part of a metal rendering pipeline.
I also tried to profile the usage of neural engine when running inference using MPSGraphExecutable but the graph shows no sign of neural engine usage. However, when I used the coreML model inspection tool in xcode and run performance report, it was able to use ANE.
Does MPSGraphExecutable automatically utilize the Apple Neural Engine (ANE) when running inference operations, or does it only execute on GPU?
My model (Core ML Package) was converted from a pytouch model using coremltools with ML program type and support iOS17.0+.
Any insights or documentation references would be greatly appreciated!
I'm on Tahoe 26.1 / M3 Macbook Air. I'm using VNDetectFaceRectanglesRequest as properly as possible, as in the minimal command line program attached below. For some reason, I always get:
MLE5Engine is disabled through the configuration
printed. I couldn't find any notes on developer docs saying that VNDetectFaceRectanglesRequest can not use the Apple Neural Engine. I'm assuming there is something wrong with my code however I wasn't able to find any remarks from documentation where it might be. I wasn't able to find the above error message online either. I would appreciate your help a lot and thank you in advance.
The code below accesses the video from AVCaptureDevice.DeviceType.builtInWideAngleCamera. Currently it directly chooses the 0th format which has the largest resolution (Full HD on my M3 MBA) and "4:2:0" color "v" reduced color component spectrum encoding ("420v").
After accessing video, it performs a VNDetectFaceRectanglesRequest. It prints "VNDetectFaceRectanglesRequest completion Handler called" many times, then prints the error message above, then continues printing "VNDetectFaceRectanglesRequest completion Handler called" until the user quits it.
To run it in Xcode, File > New project > Mac command line tool. Pasting the code below, then click on the root file > Targets > Signing & Capabilities > Hardened Runtime > Resource Access > Camera.
A possible explanation could be that either Apple's internal CoreML code for this function works on GPU/CPU only or it doesn't accept 420v as supplied by the Macbook Air camera
import AVKit
import Vision
var videoDataOutput: AVCaptureVideoDataOutput = AVCaptureVideoDataOutput()
var detectionRequests: [VNDetectFaceRectanglesRequest]?
var videoDataOutputQueue: DispatchQueue = DispatchQueue(label: "queue")
class XYZ: /*NSViewController or NSObject*/NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
func viewDidLoad() {
//super.viewDidLoad()
let session = AVCaptureSession()
let inputDevice = try! self.configureFrontCamera(for: session)
self.configureVideoDataOutput(for: inputDevice.device, resolution: inputDevice.resolution, captureSession: session)
self.prepareVisionRequest()
session.startRunning()
}
fileprivate func highestResolution420Format(for device: AVCaptureDevice) -> (format: AVCaptureDevice.Format, resolution: CGSize)? {
let deviceFormat = device.formats[0]
print(deviceFormat)
let dims = CMVideoFormatDescriptionGetDimensions(deviceFormat.formatDescription)
let resolution = CGSize(width: CGFloat(dims.width), height: CGFloat(dims.height))
return (deviceFormat, resolution)
}
fileprivate func configureFrontCamera(for captureSession: AVCaptureSession) throws -> (device: AVCaptureDevice, resolution: CGSize) {
let deviceDiscoverySession = AVCaptureDevice.DiscoverySession(deviceTypes: [AVCaptureDevice.DeviceType.builtInWideAngleCamera], mediaType: .video, position: AVCaptureDevice.Position.unspecified)
let device = deviceDiscoverySession.devices.first!
let deviceInput = try! AVCaptureDeviceInput(device: device)
captureSession.addInput(deviceInput)
let highestResolution = self.highestResolution420Format(for: device)!
try! device.lockForConfiguration()
device.activeFormat = highestResolution.format
device.unlockForConfiguration()
return (device, highestResolution.resolution)
}
fileprivate func configureVideoDataOutput(for inputDevice: AVCaptureDevice, resolution: CGSize, captureSession: AVCaptureSession) {
videoDataOutput.setSampleBufferDelegate(self, queue: videoDataOutputQueue)
captureSession.addOutput(videoDataOutput)
}
fileprivate func prepareVisionRequest() {
let faceDetectionRequest: VNDetectFaceRectanglesRequest = VNDetectFaceRectanglesRequest(completionHandler: { (request, error) in
print("VNDetectFaceRectanglesRequest completion Handler called")
})
// Start with detection
detectionRequests = [faceDetectionRequest]
}
// MARK: AVCaptureVideoDataOutputSampleBufferDelegate
// Handle delegate method callback on receiving a sample buffer.
public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
var requestHandlerOptions: [VNImageOption: AnyObject] = [:]
let cameraIntrinsicData = CMGetAttachment(sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil)
if cameraIntrinsicData != nil {
requestHandlerOptions[VNImageOption.cameraIntrinsics] = cameraIntrinsicData
}
let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
// No tracking object detected, so perform initial detection
let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
orientation: CGImagePropertyOrientation.up, options: requestHandlerOptions)
try! imageRequestHandler.perform(detectionRequests!)
}
}
let X = XYZ()
X.viewDidLoad()
sleep(9999999)
Lately I am getting this error.
GenerativeModelsAvailability.Parameters: Initialized with invalid language code: en-GB. Expected to receive two-letter ISO 639 code. e.g. 'zh' or 'en'. Falling back to: en
Does anyone know what this is and how it can be resolved. The error does not crash the app
Is there an API that allows iOS app developers to leverage Apple Foundation Models to authorize a user's Apple Intelligence extension, chatGPT login account?
I'm trying to provide a real-time question feature for chatGPT, a logged-in extension account, while leveraging Apple Intelligence's LLM. Is there an API that also affects the extension login account?
Hello, is it allowed to use Foundation Model Framework in submission app for WWDC26? The thing is that Apple Intelligence needs to be enabled in the settings. So, does that mean the jury won't be able to fully utilize the app's AI functionality?
Hi,
I was using Foundation Models in my app, and suddenly it just stopped working from one moment to the next.
To double-check, I created a small test in Playgrounds, but I’m getting the exact same error there too.
#Playground {
let session = LanguageModelSession()
let prompt = "please answer a word"
do {
let response = try await session.respond(to: prompt)
} catch {
print("error is \(error)")
}
}
error is Error Domain=FoundationModels.LanguageModelSession.GenerationError Code=-1 "(null)" UserInfo={NSMultipleUnderlyingErrorsKey=(
"Error Domain=ModelManagerServices.ModelManagerError Code=1026 \"(null)\" UserInfo={NSMultipleUnderlyingErrorsKey=(\n)}"
)}
I’m no longer able to get any response from the framework anywhere, even in a fresh project. It's been 5 days.
Has anyone else experienced this issue or knows what could be causing it?
Thanks in advance!
Tahoe 26.2 beta 1, Xcode 26.1.1, iPhone Air simulator 26.1
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hi all! Nice to meet you.,
I am planning to build an iOS application that can:
Capture an image using the camera or select one from the gallery.
Remove the background and keep only the detected main object.
Add a border (outline) around the detected object’s shape.
Apply an animation along that border (e.g., moving light or glowing effect).
Include a transition animation when removing the background — for example, breaking the background into pieces as it disappears.
The app Capword has a similar feature for object isolation, and I’d like to build something like that.
Could you please provide any guidance, frameworks, or sample code related to:
Object segmentation and background removal in Swift (Vision or Core ML).
Applying custom borders and shape animations around detected objects.
Recognizing the object name (e.g., “person”, “cat”, “car”) after segmentation.
Thank you very much for your support.
Best regards,
SINN SOKLYHOR
Hi,
I'm trying to use the new RecognizeDocumentsRequest from the Vision Framework to read a receipt. It looks very promising by being able to read paragraphs, lines and detect data. So far it unfortunately seems to read every line on the receipt as a paragraph and when there is more space on one line it creates two paragraphs.
Is there perhaps an Apple Engineer who knows if this is expected behaviour or if I should file a Feedback for this?
Code setup:
let request = RecognizeDocumentsRequest()
let observations = try await request.perform(on: image)
guard let document = observations.first?.document else {
return
}
for paragraph in document.paragraphs {
print(paragraph.transcript)
for data in paragraph.detectedData {
switch data.match.details {
case .phoneNumber(let data):
print("Phone: \(data)")
case .postalAddress(let data):
print("Postal: \(data)")
case .calendarEvent(let data):
print("Calendar: \(data)")
case .moneyAmount(let data):
print("Money: \(data)")
case .measurement(let data):
print("Measurement: \(data)")
default:
continue
}
}
}
See attached image as an example of a receipt I'd like to parse. The top 3 lines are the name, street, and postal code + city. These are all separate paragraphs. Checking on detectedData does see the street (2nd line) as PostalAddress, but not the complete address. Might that be a location thing since it's a Dutch address.
And lower on the receipt it sees the block with "Pomp 1 95 Ongelood" and the things below also as separate paragraphs. First picking up the left side and after that the right side. So it's something like this:
*
Pomp 1
Volume
Prijs
€
TOTAAL
*
BTW
Netto
21.00 %
95 Ongelood
41,90 l
1.949/ 1
81.66
€
14.17
67.49
Hi all.
My adapter model just won't invoke my tool.
The problem I am having is covered in an older post: https://developer.apple.com/forums/thread/794839?answerId=852262022#852262022
Sadly the thread dies there and no resolution is seen in that thread.
It's worth noting that I have developed an AI chatbot built around LanguageModelSession to which I feed the exact same system prompt that I feed to my training set (pasted further in this post). The AI chatbot works perfectly, the tool is invoked when needed. I am training the adapter model because the base model whilst capable doesn't produce the quality I'm looking for.
So here's the template of an item in my training set:
[
{
'role': 'system',
'content': systemPrompt,
'tools': [TOOL_DEFINITION]
},
{
'role': 'user',
'content': entry['prompt']
},
{
'role': 'assistant',
'content': entry['code']
}
]
where TOOL_DEFINITION =
{
'type': 'function',
'function': {
'name': 'WriteUbersichtWidgetToFileSystem',
'description': 'Writes an Übersicht Widget to the file system. Call this tool as the last step in processing a prompt that generates a widget.',
'parameters': {
'type': 'object',
'properties': {
'jsxContent': {
'type': 'string',
'description': 'Complete JSX code for an Übersicht widget. This should include all required exports: command, refreshFrequency, render, and className. The JSX should be a complete, valid Übersicht widget file.'
}
},
'required': ['jsxContent']
}
}
... and systemPrompt =
A conversation between a user and a helpful assistant. You are an Übersicht widget designer. Create Übersicht widgets when requested by the user.
IMPORTANT: You have access to a tool called WriteUbersichtWidgetToFileSystem. When asked to create a widget, you MUST call this tool.
### Tool Usage:
Call WriteUbersichtWidgetToFileSystem with complete JSX code that implements the Übersicht Widget API. Generate custom JSX based on the user's specific request - do not copy the example below.
### Übersicht Widget API (REQUIRED):
Every Übersicht widget MUST export these 4 items:
- export const command: The bash command to execute (string)
- export const refreshFrequency: Refresh rate in milliseconds (number)
- export const render: React component function that receives {output} prop (function)
- export const className: CSS positioning for absolute placement (string)
Example format (customize for each request):
WriteUbersichtWidgetToFileSystem({jsxContent: `export const command = "echo hello"; export const refreshFrequency = 1000; export const render = ({output}) => { return <div>{output}</div>; }; export const className = "top: 20px; left: 20px;"`})
### Rules:
- The terms "ubersicht widget", "widget", "a widget", "the widget" must all be interpreted as "Übersicht widget"
- Generate complete, valid JSX code that follows the Übersicht widget API
- When you generate a widget, don't just show JSON or code - you MUST call the WriteUbersichtWidgetToFileSystem tool
- Report the results to the user after calling the tool
### Examples:
- "Generate a Übersicht widget" → Use WriteUbersichtWidgetToFileSystem tool
- "Can you add a widget that shows the time" → Use WriteUbersichtWidgetToFileSystem tool
- "Create a widget with a button" → Use WriteUbersichtWidgetToFileSystem tool
When the script that I use to compose the full training set is executed, entry['prompt'] and entry['code'] contain the prompt and the resulting JSX code for one of the examples I'm feeding to the training session. This is repeated for about 60 such examples that I have in my sample data collection.
Thanks for any help.
Michael
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
My app used app intents. And when user said "Prüfung der Bluetooth Funktion", screen can show the whole words. But in my app, it only can get "Bluetooth Funktion". This behaviour only happened in German version. In English version, everything worked well.
Is anyone can support me? Why German version siri cut my words?
After more than a year since the announcement, I'm still unable to get this feature working properly and wondering if there are known issues or missing implementation details.
Current Setup:
Device: iPhone 16 Pro Max
iOS: 26 beta 3
Development: Tested on both Xcode 16 and Xcode 26
Implementation: Following the official documentation examples
The Problem:
Semantic search simply doesn't work. Lexical search functions normally, but enabling semantic search produces identical results to having it disabled. It's as if the feature isn't actually processing.
Error Output (Xcode 26):
[QPNLU][qid=5] Error Domain=com.apple.SpotlightEmbedding.EmbeddingModelError Code=-8007 "Text embedding generation timeout (timeout=100ms)"
[CSUserQuery][qid=5] got a nil / empty embedding data dictionary
[CSUserQuery][qid=5] semanticQuery failed to generate, using "(false)"
In Xcode 16, there are no error messages at all - the semantic search just silently fails.
Missing Resources:
The sample application mentioned during the WWDC24 presentation doesn't appear to have been released, which makes it difficult to verify if my implementation is correct.
Would really appreciate any guidance or clarification on the current status of this feature. Has anyone in the community successfully implemented this?
I'm experimenting with using the Foundation Models framework to do news summarization in an RSS app but I'm finding that a lot of articles are getting kicked back with a vague message about guardrails.
This seems really common with political news but we're talking mainstream stuff, i.e. Politico, etc.
If the models are this restrictive, this will be tough to use. Is this intended?
FB17904424
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hi everyone,
I'm developing an iOS app using Foundation Models and I've hit a critical limitation that I believe affects many developers and millions of users.
The Issue
Foundation Models requires the device system language to be one of the supported languages. If a user has their device set to an unsupported language (Catalan, Dutch, Swedish, Polish, Danish, Norwegian, Finnish, Czech, Hungarian, Greek, Romanian, and many others), SystemLanguageModel.isSupported returns false and the framework is completely unavailable.
Why This Is Problematic
Scenario: A Catalan user has their iPhone in Catalan (native language). They want to use an AI chat app in Spanish or English (languages they speak fluently).
Current situation:
❌ Foundation Models: Completely unavailable
✅ OpenAI GPT-4: Works perfectly
✅ Anthropic Claude: Works perfectly
✅ Any cloud-based AI: Works perfectly
The user must choose between:
Keep device in Catalan → Cannot use Foundation Models at all
Change entire device to Spanish → Can use Foundation Models but terrible UX
Impact
This affects:
Millions of users in regions where unsupported languages are official
Multilingual users who prefer their device in their native language but can comfortably interact with AI in English/Spanish
Developers who cannot deploy Foundation Models-based apps in these markets
Privacy-conscious users who are ironically forced to use cloud AI instead of on-device AI
What We Need
One of these solutions would solve the problem:
Option 1: Per-app language override (preferred)
// Proposed API
let session = try await LanguageModelSession(preferredLanguage: "es-ES")
Option 2: Faster rollout of additional languages (particularly EU languages)
Option 3: Allow fallback to user-selected supported language when system language is unsupported
Technical Details
Current behavior:
// Device in Catalan
let isAvailable = SystemLanguageModel.isSupported
// Returns false
// No way to override or specify alternative language
Why This Matters
Apple Intelligence and Foundation Models are amazing for privacy and performance. But this language restriction makes the most privacy-focused AI solution less accessible than cloud alternatives. This seems contrary to Apple's values of accessibility and user choice.
Questions for the Community
Has anyone else encountered this limitation?
Are there any workarounds I'm missing?
Has anyone successfully filed feedback about this?(Please share FB number so we can reference it)
Are there any sessions or labs where this has been discussed?
Thanks for reading. I'd love to hear if others are facing this and how you're handling it.
I am writing to inquire about content exclusion capabilities within Apple Intelligence, particularly regarding the use of configuration files such as .aiignore or .aiexclude—similar to what exists in other AI-assisted coding tools. These mechanisms are highly valuable in managing what content AI systems can access, especially in environments that involve sensitive code or proprietary frameworks.
I would appreciate it if anyone could clarify whether Apple Intelligence currently supports any exclusion configuration for AI-assisted features. If so, could you kindly provide documentation or guidance on how developers can implement these controls?
If not, Is there any plan to include such feature in future updates?
Hi everyone
Im currently developing an object detection model that shall identify up to seven classes in an image. While im usually doing development with basic python and the ultralytics library, i thought i would like to give CreateML a shot. The experience is actually very nice, except for the fact that the model seem not to be using any ANE or GPU (MPS) for accelerated training.
On https://developer.apple.com/machine-learning/create-ml/ it states: "On-device training Train models blazingly fast right on your Mac while taking advantage of CPU and GPU."
Am I doing something wrong?
Im running the training on
Apple M1 Pro 16GB
MacOS 26.1 (Tahoe)
Xcode 26.1 (Build version 17B55)
It would be super nice to get some feedback or instructions.
Thank you in advance!
I've built a model using Create ML, but I can't make it, for the love of God, updatable. I can't find any checkbox or anything related. It's an Activity Classifier, if it matters.
I want to continue training it on-device using MLUpdateTask, but the model, as exported from Create ML, fails with error: Domain=com.apple.CoreML Code=6 "Failed to unarchive update parameters. Model should be re-compiled." UserInfo={NSLocalizedDescription=Failed to unarchive update parameters. Model should be re-compiled.}