While testing the “Bringing advanced speech-to-text capabilities to your app” sample app, which demonstrates the iOS 26 SpeechAnalyzer, I noticed that the language model for the English locale was apparently already downloaded. The AssetInventory documentation confirms that a language model can indeed be preinstalled on the system.
Can someone from the dev team share more about which assets the system preinstalls? For example, can we safely assume that the English language model is almost certainly preinstalled by the OS if the phone uses the English locale?
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
I'm working on localizing my prompts to support multiple languages, and in some cases my prompts contain string-interpolated Generable objects. For example:
"Given the following workout routine: \(routine), suggest one additional exercise to complement it."
In the Strings dictionary, I'm only able to select String, Int or Double parameters using %@ and %lld.
Has anyone found a way to accomplish this?
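One possible workaround, sketched under assumptions: render the Generable value to a plain string first, then pass that string to the localized format as an ordinary %@ argument. `WorkoutRoutine`, its `promptDescription` property, and the `promptFormats` table are hypothetical stand-ins; in a real app the format string would come from the strings file rather than a dictionary literal.

```swift
import Foundation

// Hypothetical stand-in for a @Generable type; the real type would come from the app.
struct WorkoutRoutine {
    var name: String
    var exercises: [String]

    // Plain-text rendering used inside the prompt (an assumption, not a framework API).
    var promptDescription: String {
        "\(name): " + exercises.joined(separator: ", ")
    }
}

// Stand-in for localized format strings; a real app would look these up
// through its localization tables instead of a dictionary.
let promptFormats: [String: String] = [
    "en": "Given the following workout routine: %@, suggest one additional exercise to complement it.",
    "es": "Dada la siguiente rutina de ejercicios: %@, sugiere un ejercicio adicional que la complemente."
]

// Builds the prompt by substituting the rendered routine into the %@ placeholder.
func localizedPrompt(for routine: WorkoutRoutine, languageCode: String) -> String {
    let format = promptFormats[languageCode] ?? promptFormats["en"]!
    return String(format: format, routine.promptDescription)
}

let routine = WorkoutRoutine(name: "Leg day", exercises: ["squats", "lunges"])
print(localizedPrompt(for: routine, languageCode: "en"))
```

The trade-off is that the model sees whatever text `promptDescription` produces rather than the framework's own rendering of the object, so the two may differ.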
When I use ChatGPT in Xcode, the following error is displayed:
It was working fine before, but suddenly it became like this, without changing any configuration. Why?
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hi,
I’m developing an app targeting iOS 26, using the new FoundationModels framework to perform on-device LLM inference. I’m currently testing memory usage.
Does the memory used by FoundationModels—including model weights, KV cache, and any inference-related buffers—count toward my app’s Jetsam memory limit, or is any of it managed separately by the system?
I may need to run two concurrent inferences, each with a 4096-token context window. Is this explicitly supported or allowed by FoundationModels on iOS 26? Would this significantly increase the risk of memory-based termination?
Thanks in advance for any clarification.
Thanks.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I have a Generable type with many elements. I am using a stream() to incrementally process the output (Generable.PartiallyGenerated?) content.
At the end, I want to pass the final version (not partially generated) to another function.
I cannot seem to find a good way to convert from a MyGenerable.PartiallyGenerated to a MyGenerable.
Am I missing some functionality in the APIs?
Hey Devs,
I'm trying to create my own real-time text detection like this Apple project: https://developer.apple.com/documentation/vision/extracting-phone-numbers-from-text-in-images
I want to use the new iOS 18 RecognizeTextRequest instead of the old VNRecognizeTextRequest in my SwiftUI project.
This is my delegate code with the camera setup. I removed the region of interest (ROI) for debugging, but I'm trying to scan English words in books. The idea is to eventually get one word inside the ROI. Since I can't even get proper words yet, I'm testing without the ROI in case my math is wrong.
@Observable
class CameraManager: NSObject, AVCapturePhotoCaptureDelegate {
    ...
    override init() {
        super.init()
        setUpVisionRequest()
    }

    private func setUpVisionRequest() {
        textRequest = RecognizeTextRequest(.revision3)
    }
    ...
    func setup() -> Bool {
        captureSession.beginConfiguration()
        guard
            let captureDevice = AVCaptureDevice.default(
                .builtInWideAngleCamera, for: .video, position: .back)
        else {
            return false
        }
        self.captureDevice = captureDevice
        guard let deviceInput = try? AVCaptureDeviceInput(device: captureDevice)
        else {
            return false
        }
        /// Check whether the session can add input.
        guard captureSession.canAddInput(deviceInput) else {
            print("Unable to add device input to the capture session.")
            return false
        }
        /// Add the input and output to session
        captureSession.addInput(deviceInput)
        /// Configure the video data output
        videoDataOutput.setSampleBufferDelegate(
            self, queue: videoDataOutputQueue)
        if captureSession.canAddOutput(videoDataOutput) {
            captureSession.addOutput(videoDataOutput)
            videoDataOutput.connection(with: .video)?
                .preferredVideoStabilizationMode = .off
        } else {
            return false
        }
        // Set zoom and autofocus to help focus on very small text
        do {
            try captureDevice.lockForConfiguration()
            captureDevice.videoZoomFactor = 2
            captureDevice.autoFocusRangeRestriction = .near
            captureDevice.unlockForConfiguration()
        } catch {
            print("Could not set zoom level due to error: \(error)")
            return false
        }
        captureSession.commitConfiguration()
        // potential issue with background vs dispatchqueue ??
        Task(priority: .background) {
            captureSession.startRunning()
        }
        return true
    }
}

// Issue here ???
extension CameraManager: AVCaptureVideoDataOutputSampleBufferDelegate {
    func captureOutput(
        _ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer,
        from connection: AVCaptureConnection
    ) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        Task {
            textRequest.recognitionLevel = .fast
            textRequest.recognitionLanguages = [Locale.Language(identifier: "en-US")]
            do {
                let observations = try await textRequest.perform(on: pixelBuffer)
                for observation in observations {
                    let recognizedText = observation.topCandidates(1).first
                    print("recognized text \(recognizedText)")
                }
            } catch {
                print("Recognition error: \(error.localizedDescription)")
            }
        }
    }
}
The results I get look like this (a full page of English from any book):
recognized text Optional(RecognizedText(string: e bnUI W4, confidence: 0.5))
recognized text Optional(RecognizedText(string: ?'U, confidence: 0.3))
recognized text Optional(RecognizedText(string: traQt4, confidence: 0.3))
recognized text Optional(RecognizedText(string: li, confidence: 0.3))
recognized text Optional(RecognizedText(string: 15,1,#, confidence: 0.3))
recognized text Optional(RecognizedText(string: jllÈ, confidence: 0.3))
recognized text Optional(RecognizedText(string: vtrll, confidence: 0.3))
recognized text Optional(RecognizedText(string: 5,1,: 11, confidence: 0.5))
recognized text Optional(RecognizedText(string: 1141, confidence: 0.3))
recognized text Optional(RecognizedText(string: jllll ljiiilij41, confidence: 0.3))
recognized text Optional(RecognizedText(string: 2f4, confidence: 0.3))
recognized text Optional(RecognizedText(string: ktril, confidence: 0.3))
recognized text Optional(RecognizedText(string: ¥LLI, confidence: 0.3))
recognized text Optional(RecognizedText(string: 11[Itl,, confidence: 0.3))
recognized text Optional(RecognizedText(string: 'rtlÈ131, confidence: 0.3))
Even with the ROI set to a specific rectangle normalized for Vision, I get the same results, with single characters returning gibberish.
Any help would be amazing, thank you.
Am I using the buffer right?
Am I using the new perform(on: CVPixelBuffer) right?
Maybe I didn't set up my camera properly? I can provide more code.
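Separately from the garbled output, the delegate above starts a new Task for every camera frame. A minimal sketch of one way to drop frames while a request is still running follows. `FrameGate` is a hypothetical helper, not part of Vision or AVFoundation, and this is not offered as the cause of the bad recognition results.

```swift
// Hypothetical helper: lets at most one recognition request run at a time,
// so frames that arrive while a request is in flight can be dropped.
actor FrameGate {
    private var isBusy = false

    /// Returns true if the caller may process the current frame.
    func tryEnter() -> Bool {
        guard !isBusy else { return false }
        isBusy = true
        return true
    }

    /// Marks the current request as finished so a later frame can be processed.
    func leave() {
        isBusy = false
    }
}

// Intended use inside the Task in captureOutput (sketch only):
//   guard await gate.tryEnter() else { return }   // drop this frame
//   let observations = try? await textRequest.perform(on: pixelBuffer)
//   await gate.leave()
```

Using an actor keeps the busy flag safe to touch from the concurrently running tasks without a manual lock.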
import FoundationModels

@Generable
enum Breakfast {
    case waffles
    case pancakes
    case bagels
    case eggs
}

do {
    let session = LanguageModelSession()
    let userInput = "I want something sweet."
    // Interpolate the user's request into the prompt.
    let prompt = "Pick the ideal breakfast for request: \(userInput)"
    let response = try await session.respond(to: prompt, generating: Breakfast.self)
    print(response.content)
} catch let error {
    print(error)
}
I want to test the @Generable demo, but I get the error below:
decodingFailure(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Failed to convert text into into GeneratedContent\nText: waffles", underlyingErrors: [Swift.DecodingError.dataCorrupted(Swift.DecodingError.Context(codingPath: [], debugDescription: "The given data was not valid JSON.", underlyingError: Optional(Error Domain=NSCocoaErrorDomain Code=3840 "Unexpected character 'w' around line 1, column 1." UserInfo={NSJSONSerializationErrorIndex=0, NSDebugDescription=Unexpected character 'w' around line 1, column 1.})))]))
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hey,
It would be great to have an equivalent of toolCallId for both toolCall and toolResult in the transcript. Otherwise, it is hard to connect tool calls with their respective responses when there are multiple parallel calls to the same tool.
Thanks!
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hello,
We have been encountering a persistent crash in our application, which is deployed exclusively on iPad devices. The crash occurs in the following code block:
let requestHandler = ImageRequestHandler(paddedImage)
var request = CoreMLRequest(model: model)
request.cropAndScaleAction = .scaleToFit
let results = try await requestHandler.perform(request)
The client using this code is wrapped inside an actor, following Swift concurrency principles.
The issue has been consistently reproduced across multiple iPadOS versions, including:
iPadOS 18.4.0
iPadOS 18.4.1
iPadOS 18.5.0
This is the crash log:
Crashed: com.apple.VN.detectorSyncTasksQueue.VNCoreMLTransformer
0 libobjc.A.dylib 0x7b98 objc_retain + 16
1 libobjc.A.dylib 0x7b98 objc_retain_x0 + 16
2 libobjc.A.dylib 0xbf18 objc_getProperty + 100
3 Vision 0x326300 -[VNCoreMLModel predictWithCVPixelBuffer:options:error:] + 148
4 Vision 0x3273b0 -[VNCoreMLTransformer processRegionOfInterest:croppedPixelBuffer:options:qosClass:warningRecorder:error:progressHandler:] + 748
5 Vision 0x2ccdcc __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_5 + 132
6 Vision 0x14600 VNExecuteBlock + 80
7 Vision 0x14580 __76+[VNDetector runSuccessReportingBlockSynchronously:detector:qosClass:error:]_block_invoke + 56
8 libdispatch.dylib 0x6c98 _dispatch_block_sync_invoke + 240
9 libdispatch.dylib 0x1b584 _dispatch_client_callout + 16
10 libdispatch.dylib 0x11728 _dispatch_lane_barrier_sync_invoke_and_complete + 56
11 libdispatch.dylib 0x7fac _dispatch_sync_block_with_privdata + 452
12 Vision 0x14110 -[VNControlledCapacityTasksQueue dispatchSyncByPreservingQueueCapacity:] + 60
13 Vision 0x13ffc +[VNDetector runSuccessReportingBlockSynchronously:detector:qosClass:error:] + 324
14 Vision 0x2ccc80 __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_4 + 336
15 Vision 0x14600 VNExecuteBlock + 80
16 Vision 0x2cc98c __119-[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke_3 + 256
17 libdispatch.dylib 0x1b584 _dispatch_client_callout + 16
18 libdispatch.dylib 0x6ab0 _dispatch_block_invoke_direct + 284
19 Vision 0x2cc454 -[VNDetector internalProcessUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:] + 632
20 Vision 0x2cd14c __111-[VNDetector processUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:]_block_invoke + 124
21 Vision 0x14600 VNExecuteBlock + 80
22 Vision 0x2ccfbc -[VNDetector processUsingQualityOfServiceClass:options:regionOfInterest:warningRecorder:error:progressHandler:] + 340
23 Vision 0x125410 __swift_memcpy112_8 + 4852
24 libswift_Concurrency.dylib 0x5c134 swift::runJobInEstablishedExecutorContext(swift::Job*) + 292
25 libswift_Concurrency.dylib 0x5d5c8 swift_job_runImpl(swift::Job*, swift::SerialExecutorRef) + 156
26 libdispatch.dylib 0x13db0 _dispatch_root_queue_drain + 364
27 libdispatch.dylib 0x1454c _dispatch_worker_thread2 + 156
28 libsystem_pthread.dylib 0x9d0 _pthread_wqthread + 232
29 libsystem_pthread.dylib 0xaac start_wqthread + 8
We found a similar issue: https://developer.apple.com/forums/thread/770771.
However, the crash logs are quite different, so we believe this warrants further investigation to better understand the root cause and potential mitigation strategies.
Please let us know if any additional information would help diagnose this issue.
When I do an uncached load of a Core ML model on the ANE, I receive this warning in the Xcode console:
Type of hiddenStates in function main's I/O contains unknown strides. Using unknown strides for MIL tensor buffers with unknown shapes is not recommended in E5ML. Please use row_alignment_in_bytes property instead. Refer to https://e5-ml.apple.com/more-info/memory-layouts.html for more information.
However, the web link does not seem to work. Where can I find more information about this, and how can I fix it?
Topic:
Machine Learning & AI
SubTopic:
Core ML
I'm using a custom Create ML model to classify the movement of a user's hand in a game.
The classifier has 3 different spell movements, but my code constantly predicts all of them at an equal 1/3 probability regardless of movement. That leads me to believe my code is at fault rather than the model, which at least gives me a heavily weighted prediction in Create ML.
My code is below.
After adding debug prints everywhere, all the data looks good to me and resembles my test CSV data.
So I'm thinking my issue must be in the setup of my model code?
/// Feeds samples into the model and keeps a sliding window of the last N frames.
final class WandGestureStreamer {
    static let shared = WandGestureStreamer()

    private let model: SpellActivityClassifier
    private var samples: [Transform] = []
    private let windowSize = 100 // number of frames the model expects

    /// RNN hidden state passed between inferences
    private var stateIn: MLMultiArray
    /// Last transform dropped from the window for continuity
    private var lastDropped: Transform?

    private init() {
        let config = MLModelConfiguration()
        self.model = try! SpellActivityClassifier(configuration: config)
        // Initialize stateIn to the model’s required shape
        let constraint = self.model.model.modelDescription
            .inputDescriptionsByName["stateIn"]!
            .multiArrayConstraint!
        self.stateIn = try! MLMultiArray(shape: constraint.shape, dataType: .double)
    }

    /// Call once per frame with the latest wand position (or any feature vector).
    func appendSample(_ sample: Transform) {
        samples.append(sample)
        // drop oldest frame if over capacity, retaining it for delta at window start
        if samples.count > windowSize {
            lastDropped = samples.removeFirst()
        }
    }

    func classifyIfReady(threshold: Double = 0.6) -> (label: String, confidence: Double)? {
        guard samples.count == windowSize else { return nil }
        do {
            let input = try makeInput(initialState: stateIn)
            let output = try model.prediction(input: input)
            // Save state for continuity
            stateIn = output.stateOut
            let best = output.label
            let conf = output.labelProbability[best] ?? 0
            // If you’ve recognized a gesture with high confidence:
            if conf > threshold {
                return (best, conf)
            } else {
                return nil
            }
        } catch {
            print("Error", error.localizedDescription, error)
            return nil
        }
    }

    /// Constructs a SpellActivityClassifierInput from recorded wand transforms.
    func makeInput(initialState: MLMultiArray) throws -> SpellActivityClassifierInput {
        let count = samples.count as NSNumber
        let shape = [count]
        let timeArr = try MLMultiArray(shape: shape, dataType: .double)
        let dxArr = try MLMultiArray(shape: shape, dataType: .double)
        let dyArr = try MLMultiArray(shape: shape, dataType: .double)
        let dzArr = try MLMultiArray(shape: shape, dataType: .double)
        let rwArr = try MLMultiArray(shape: shape, dataType: .double)
        let rxArr = try MLMultiArray(shape: shape, dataType: .double)
        let ryArr = try MLMultiArray(shape: shape, dataType: .double)
        let rzArr = try MLMultiArray(shape: shape, dataType: .double)
        for (i, sample) in samples.enumerated() {
            let previousSample = i > 0 ? samples[i - 1] : lastDropped
            let model = WandMovementRecording.DataModel(transform: sample, previous: previousSample)
            // print("model", model)
            timeArr[i] = NSNumber(value: model.timestamp)
            dxArr[i] = NSNumber(value: model.dx)
            dyArr[i] = NSNumber(value: model.dy)
            dzArr[i] = NSNumber(value: model.dz)
            let rot = model.rotation
            rwArr[i] = NSNumber(value: rot.w)
            rxArr[i] = NSNumber(value: rot.x)
            ryArr[i] = NSNumber(value: rot.y)
            rzArr[i] = NSNumber(value: rot.z)
        }
        return SpellActivityClassifierInput(
            dx: dxArr, dy: dyArr, dz: dzArr,
            rotation_w: rwArr, rotation_x: rxArr, rotation_y: ryArr, rotation_z: rzArr,
            timestamp: timeArr,
            stateIn: initialState
        )
    }
}
I have a question. In China, long-pressing a picture in the photo album can segment the subject. Does this feature run on a local model? Is there any information about it, and can developers use it?
Environment: macOS Tahoe 26.0, iOS 26.0 (23A5287g) on a physical device (not the simulator), Xcode 26.0 beta 3 (17A5276g).
Following the tutorial “Testing your asset packs locally”, I start the test server with this command:
xcrun ba-serve --host 192.168.0.109 test.aar
The terminal shows:
Loading asset packs…
Loading the asset pack at “test.aar”…
Listening on port 63125…
Choose an identity in the panel to continue.
Listening on port 63125…
When I run the project, Xcode reports an error: Download failed: Could not connect to the server. When I visit https://192.168.0.109:63125 in Safari on the iPhone, the page displays "Hello, world!".
The error messages give too little detail, and I have no idea what the specific cause is. I hope someone can offer some guidance. Best regards.
{
    "assetPackID": "testVideoAssetPack",
    "downloadPolicy": {
        "prefetch": {
            "installationEventTypes": ["firstInstallation", "subsequentUpdate"]
        }
    },
    "fileSelectors": [
        {
            "file": "video/test.mp4"
        }
    ],
    "platforms": [
        "iOS"
    ]
}
This is my Manifest.json.
Hi all, I'm working on an app that utilizes the FoundationModels found in iOS 26. I updated my phone to iOS 26 beta 3 and am now receiving the following error when trying to run code that worked in beta 2:
AI Error: The operation couldn't be completed. (FoundationModels.LanguageModelSession.GenerationError error 2.)
I admit I'm a bit of a new developer, but any idea if this is an issue with beta 3 or work that I'll need to do to adapt my code to some changes in the AI API?
Thank you!
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I've been successfully integrating the Foundation Models framework into my healthcare app using structured generation with @Generable schemas. While my initial testing (20-30 iterations) shows promising results, I need to validate consistency and reliability at scale before production deployment.
Question
Is there a recommended approach for automated, large-scale testing of Foundation Models responses?
Specifically, I'm looking to:
Automate 1000+ test iterations with consistent prompts and structured schemas
Measure response consistency across identical inputs
Validate structured output reliability (proper schema adherence, no generation failures)
Collect performance metrics (TTFT, TPS) for optimization
Specific Questions
Framework Limitations: Are there any undocumented rate limits or thermal throttling considerations for rapid session creation/destruction?
Performance Tools: Can Xcode's Foundation Models Instrument be used programmatically, or only through Instruments UI?
Automation Integration: Any recommendations for integrating with testing frameworks?
Session Reuse: Is it better to reuse a single LanguageModelSession or create fresh sessions for each test iteration?
Use Case Context
My wellness app provides medically safe activity recommendations based on user health profiles. The Foundation Models framework processes health context and generates structured recommendations for exercises, nutrition, and lifestyle activities. Given the safety implications of providing health-related guidance, I need rigorous validation to ensure the model consistently produces appropriate, well-formed recommendations across diverse user scenarios and health conditions.
Has anyone in the community built similar large-scale testing infrastructure for Foundation Models? Any insights on best practices or potential pitfalls would be greatly appreciated.
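As a starting point for the kind of batch run described above, here is a minimal sketch of a harness that repeats one prompt and tallies failures, distinct outputs, and wall-clock latency. Everything in it (`BatchReport`, `runBatch`, the `generate` closure) is hypothetical; the closure stands in for whatever code calls the model and serializes the structured result to a string. It measures total time per call only, not TTFT or TPS.

```swift
import Foundation

// Summary of one batch run. All names here are hypothetical.
struct BatchReport {
    var successes = 0
    var failures = 0
    var outputCounts: [String: Int] = [:]   // distinct outputs and how often each occurred
    var latencies: [TimeInterval] = []      // wall-clock seconds per successful call

    var meanLatency: TimeInterval {
        latencies.isEmpty ? 0 : latencies.reduce(0, +) / Double(latencies.count)
    }

    // Share of successful runs that produced the most common output.
    var consistency: Double {
        guard successes > 0, let top = outputCounts.values.max() else { return 0 }
        return Double(top) / Double(successes)
    }
}

// Runs `generate` `iterations` times with the same prompt and tallies the results.
// `generate` is a placeholder for the model call; a thrown error counts as a failure.
func runBatch(
    prompt: String,
    iterations: Int,
    generate: (String) async throws -> String
) async -> BatchReport {
    var report = BatchReport()
    for _ in 0..<iterations {
        let start = Date()
        do {
            let output = try await generate(prompt)
            report.latencies.append(Date().timeIntervalSince(start))
            report.successes += 1
            report.outputCounts[output, default: 0] += 1
        } catch {
            report.failures += 1
        }
    }
    return report
}
```

Keeping the model call behind a closure means the same harness can compare a reused session against a fresh session per iteration just by swapping the closure, and it can be driven from an ordinary unit test.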
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Here's the result:
Very weird.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hey,
I receive GenerableContent as follows:
let response = try await session.respond(to: "", schema: generationSchema)
And it wraps GeneratedJSON which seems to be private.
What is the best way to get a string / raw value out of it? I noticed it could theoretically be accessed via transcriptEntries but it's not ideal.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I am trying to create a slightly different version of the content tagging code in the documentation:
https://developer.apple.com/documentation/foundationmodels/systemlanguagemodel/usecase/contenttagging
In the playground I am getting an "Inference Provider crashed with 2:5" error.
I have no idea what that means or how to address the error. Any assistance would be appreciated.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Does anyone know if ExecuTorch is officially supported or has been successfully used on visionOS? If so, are there any specific build instructions, example projects, or potential issues (like sandboxing or memory limitations) to be aware of when integrating it into an Xcode project for the Vision Pro?
While ExecuTorch has support for iOS, I can't find any official documentation or community examples specifically mentioning visionOS.
Thanks.
I'm experimenting with using the Foundation Models framework to do news summarization in an RSS app but I'm finding that a lot of articles are getting kicked back with a vague message about guardrails.
This seems really common with political news, but we're talking about mainstream sources, e.g. Politico.
If the models are this restrictive, this will be tough to use. Is this intended?
FB17904424
Topic:
Machine Learning & AI
SubTopic:
Foundation Models