Hi there,
I have a custom keypoint detection model and want to use it via vision's CoremlRequest API. Here's some complication for input and output:
For input My model expect 512x512 a image. Which would be resized and padded from a 1920x1080 frame. I use the .scaleToFit option, but can I also specify the color used for padding?
For output:
My model output a CoreMLFeatureValueObservation, can I have it output in a format vision recognizes? such as joints/keypoints
If my model is able to output in a format vision recognizes, would it take care to restoring the coordinates back to the original frame? (undo the padding) If not, how do I restore it from .scaletofit option?
Best,
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hello fellow developers,
I'm the founder of a FinTech startup, Cent Capital (https://cent.capital), where we are building an AI-powered financial co-pilot.
We're deeply exploring the Apple ecosystem to create a more proactive and ambient user experience. A core part of our vision is to use App Intents and the Shortcuts app to surface personalized financial insights without the user always needing to open our app. For example, suggesting a Shortcut like, "What's my spending in the 'Dining Out' category this month?" or having an App Intent proactively surface an insight like, "Your 'Subscriptions' budget is almost full."
My question for the community is about the architectural and user experience best practices for this.
How are you thinking about the balance between providing rich, actionable insights via Intents without being overly intrusive or "spammy" to the user?
What are the best practices for designing the data model that backs these App Intents for a complex domain like personal finance?
Are there specific performance or privacy considerations we should be aware of when surfacing potentially sensitive financial data through these system-level integrations?
We believe this is the future of FinTech apps on iOS and would love to hear how other developers are thinking about this challenge.
Thanks for your insights!
I would like to write a macOS application that uses on-device AI (FoundationModels).
I don’t understand how to, practically, give it access to my documents, photos, or contacts and be able to ask it a question like: “Find the document that talks about this topic.”
Do I need to manually retrieve the data and provide it in the form of a prompt? Or is FoundationModels capable of accessing it on its own?
Thanks
After more than a year since the announcement, I'm still unable to get this feature working properly and wondering if there are known issues or missing implementation details.
Current Setup:
Device: iPhone 16 Pro Max
iOS: 26 beta 3
Development: Tested on both Xcode 16 and Xcode 26
Implementation: Following the official documentation examples
The Problem:
Semantic search simply doesn't work. Lexical search functions normally, but enabling semantic search produces identical results to having it disabled. It's as if the feature isn't actually processing.
Error Output (Xcode 26):
[QPNLU][qid=5] Error Domain=com.apple.SpotlightEmbedding.EmbeddingModelError Code=-8007 "Text embedding generation timeout (timeout=100ms)"
[CSUserQuery][qid=5] got a nil / empty embedding data dictionary
[CSUserQuery][qid=5] semanticQuery failed to generate, using "(false)"
In Xcode 16, there are no error messages at all - the semantic search just silently fails.
Missing Resources:
The sample application mentioned during the WWDC24 presentation doesn't appear to have been released, which makes it difficult to verify if my implementation is correct.
Would really appreciate any guidance or clarification on the current status of this feature. Has anyone in the community successfully implemented this?
Hi,
testing latest tensorflow-metal plugin with tensorflow 2.20 doesn't work..
using python
Python 3.12.11 (main, Jun 3 2025, 15:41:47) [Clang 17.0.0 (clang-1700.0.13.3)] on darwin
simple testing shows error:
import tensorflow as tf
Traceback (most recent call last):
File "", line 1, in
File "/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow/init.py", line 438, in
_ll.load_library(_plugin_dir)
File "/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Library not loaded: @rpath/_pywrap_tensorflow_internal.so
Referenced from: <8B62586B-B082-3113-93AB-FD766A9960AE> /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib
Reason: tried: '/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/_pywrap_tensorflow_internal.so' (no such file), '/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/_pywrap_tensorflow_internal.so' (no such file), '/opt/homebrew/lib/_pywrap_tensorflow_internal.so' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/lib/_pywrap_tensorflow_internal.so' (no such file)
tf.config.experimental.list_physical_devices('GPU')
Traceback (most recent call last):
File "", line 1, in
NameError: name 'tf' is not defined
I fixed this error by copying _pywrap_tensorflow_internal.so where it's searched..
1)mkdir /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64
2)mkdir /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/
3)cp /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/../_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/
then fails symbol not found:
Symbol not found: __ZN10tensorflow28_AttrValue_default_instance_E
in libmetal_plugin.dylib
full log:
with import tensorflow as tf
Traceback (most recent call last):
File "", line 1, in
File "/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow/init.py", line 438, in
_ll.load_library(_plugin_dir)
File "/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow/python/framework/load_library.py", line 151, in load_library
py_tf.TF_LoadLibrary(lib)
tensorflow.python.framework.errors_impl.NotFoundError: dlopen(/Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib, 0x0006): Symbol not found: __ZN10tensorflow28_AttrValue_default_instance_E
Referenced from: <8B62586B-B082-3113-93AB-FD766A9960AE> /Users/obg/npu/venv-tf/lib/python3.12/site-packages/tensorflow-plugins/libmetal_plugin.dylib
Expected in: <2FF91C8B-0CB6-3E66-96B7-092FDF36772E> /Users/obg/npu/venv-tf/lib/python3.12/site-packages/_solib_darwin_arm64/_U@local_Uconfig_Utf_S_S_C_Upywrap_Utensorflow_Uinternal___Uexternal_Slocal_Uconfig_Utf/_pywrap_tensorflow_internal.so
Hello,
Are there any plans to compile a python 3.13 version of tensorflow-metal?
Just got my new Mac mini and the automatically installed version of python installed by brew is python 3.13 and while if I was in a hurry, I could manage to get python 3.12 installed and use the corresponding tensorflow-metal version but I'm not in a hurry.
Many thanks,
Alan
I'm playing with the new Vision API for iOS18, specifically with the new CalculateImageAestheticsScoresRequest API.
When I try to perform the image observation request I get this error:
internalError("Error Domain=NSOSStatusErrorDomain Code=-1 \"Failed to create espresso context.\" UserInfo={NSLocalizedDescription=Failed to create espresso context.}")
The code is pretty straightforward:
if let image = image {
let request = CalculateImageAestheticsScoresRequest()
Task {
do {
let cgImg = image.cgImage!
let observations = try await request.perform(on: cgImg)
let description = observations.description
let score = observations.overallScore
print(description)
print(score)
} catch {
print(error)
}
}
}
I'm running it on a M2 using the simulator.
Is it a bug? What's wrong?
I found what might be a bug with enabling Apple Intelligence when switching languages. When my iPhone's language is set to Catalan, the Apple Intelligence is disabled because it is not available for that language. Switching to Spanish doesn't activate it, and it still shows the same message of being unavailable, this time saying not available in Spanish (which is not true). However, it is enabled when the phone is rebooted.
Once at this point, the bug becomes even weirder. Having the iPhone language set to Spanish and with Apple Intelligence on, I switch the language to Catalan, and the feature remains enabled. After I ask a query in Catalan, it surprisingly understands it and works, but then it gets disabled.
Apart from that, as user feedback, I would love to activate Apple Intelligence in an available language other than my device's language. That's how I always used Siri (iPhone in Catalan, Siri in Spanish).
Thanks!
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Tags:
Siri and Voice
Internationalization
Localization
Apple Intelligence
Lookin for J - is this a safe place for discussing full apps ive built but not submitted or shared , I have maybe over 100 but had been unaware any assistance was provided..
is there a formal process to take to submit an app fro review to improve OS, other than during App Store review.
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Tags:
Design
Developer Tools
iCloud Drive
Xcode
In this online session, you can code along with us as we build generative AI features into a sample app live in Xcode. We'll guide you through implementing core features like basic text generation, as well as advanced topics like guided generation for structured data output, streaming responses for dynamic UI updates, and tool calling to retrieve data or take an action.
Check out these resources to get started:
Download the project files: https://developer.apple.com/events/re...
Explore the code along guide: https://developer.apple.com/events/re...
Join the live Q&A: https://developer.apple.com/videos/pl...
Agenda – All times PDT
10 a.m.: Welcome and Xcode setup
10:15 a.m.: Framework basics, guided generation, and building prompts
11 a.m.: Break
11:10 a.m.: UI streaming, tool calling, and performance optimization
11:50 a.m.: Wrap up
All are welcome to attend the session. To actively code along, you'll need a Mac with Apple silicon that supports Apple Intelligence running the latest release of macOS Tahoe 26 and Xcode 26.
If you have questions after the code along concludes please share a post here in the forums and engage with the community.
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I am using the iPhone 17 Pro simulator that was included with Xcode 26.0.1. My Mac is running macOS 26. When I started the simulator for the first time I got the "Ready for Apple Intelligence" notification but when I access Image Playground in my app it says it is not available on this iPhone. Any solution to get it working on the simulator?
Hello everyone, I have a visual convolutional model and a video that has been decoded into many frames. When I perform inference on each frame in a loop, the speed is a bit slow. So, I started 4 threads, each running inference simultaneously, but I found that the speed is the same as serial inference, every single forward inference is slower. I used the mactop tool to check the GPU utilization, and it was only around 20%. Is this normal? How can I accelerate it?
I am excited to try Foundation Models during WWDC, but it doesn't work at all for me. When running on my iPad Pro M4 with iPadOS 26 seed 1, I get the following error even when running the simplest query:
let prompt = "How are you?"
let stream = session.streamResponse(to: prompt)
for try await partial in stream {
self.answer = partial
self.resultString = partial
}
In the Xcode console, I see the following error:
assetsUnavailable(FoundationModels.LanguageModelSession.GenerationError.Context(debugDescription: "Model is unavailable", underlyingErrors: []))
I have verified that Apple Intelligence is enabled on my iPad. Any tips on how can I get it working? I have also submitted this feedback: FB17896752
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
Hello, I'm using videotoolbox superresolution API in MACOS 26: https://developer.apple.com/documentation/videotoolbox/vtsuperresolutionscalerconfiguration/downloadconfigurationmodel(completionhandler:)?language=objc, when using swift, it's ok, when using objective-c, I get error when downloading model with downloadConfigurationModelWithCompletionHandler:
[Auto] MA-auto{_failedLockContent} | failure reported by server | error:[com.apple.MobileAssetError.AutoAsset:MissingReference(6111)]
[Auto] MA-auto{_failedLockContent} | failure reported by server | error:[com.apple.MobileAssetError.AutoAsset:UnderlyingError(6107)_1_com.apple.MobileAssetError.Download:47]
Download completion handler called with error: The operation couldnxe2x80x99t be completed. (VTFrameProcessorErrorDomain error -19743.)
Seeing this error from time to time:
Context(debugDescription: "Content contains 4089 tokens, which exceeds the maximum allowed context size of 4096.", underlyingErrors: [])
Of course, 4089 is less than 4096 so what is this telling me and how do I work around it? Is the limit actually lower than 4096?
Topic:
Machine Learning & AI
SubTopic:
Foundation Models
I just recently updated to iOS 26 beta (23A5336a) to test an app I am developing
I running an MLModel loaded from a .mlmodelc file.
On the current iOS version 18.6.2 the model is running as expected with no issues.
However on iOS 26 I am now getting error when trying to perform an inference to the model where I pass a camera frame into it.
Below is the error I am seeing when I attempt to run an inference.
at the bottom it says "Failed with status=0x1d : statusType=0x9: Program Inference error status=-1 Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model " does this indicate I need to convert my model or something? I don't understand since it runs as normal on iOS 18.
Any help getting this to run again would be greatly appreciated.
Thank you,
processRequest:model:qos:qIndex:modelStringID:options:returnValue:error:: Could not process request ret=0x1d lModel=_ANEModel: { modelURL=file:///var/containers/Bundle/Application/04F01BF5-D48B-44EC-A5F6-3C7389CF4856/RizzCanvas.app/faceParsing.mlmodelc/ : sourceURL=(null) : UUID=46228BFC-19B0-45BF-B18D-4A2942EEC144 : key={"isegment":0,"inputs":{"input":{"shape":[512,512,1,3,1]}},"outputs":{"var_633":{"shape":[512,512,1,19,1]},"94_argmax_out_value":{"shape":[512,512,1,1,1]},"argmax_out":{"shape":[512,512,1,1,1]},"var_637":{"shape":[512,512,1,19,1]}}} : identifierSource=1 : cacheURLIdentifier=01EF2D3DDB9BA8FD1FDE18C7CCDABA1D78C6BD02DC421D37D4E4A9D34B9F8181_93D03B87030C23427646D13E326EC55368695C3F61B2D32264CFC33E02FFD9FF : string_id=0x00000000 : program=_ANEProgramForEvaluation: { programHandle=259022032430 : intermediateBufferHandle=13949 : queueDepth=127 } : state=3 :
[Espresso::ANERuntimeEngine::__forward_segment 0] evaluate[RealTime]WithModel returned 0; code=8 err=Error Domain=com.apple.appleneuralengine Code=8 "processRequest:model:qos:qIndex:modelStringID:options:returnValue:error:: ANEProgramProcessRequestDirect() Failed with status=0x1d : statusType=0x9: Program Inference error" UserInfo={NSLocalizedDescription=processRequest:model:qos:qIndex:modelStringID:options:returnValue:error:: ANEProgramProcessRequestDirect() Failed with status=0x1d : statusType=0x9: Program Inference error}
[Espresso::handle_ex_plan] exception=Espresso exception: "Generic error": ANEF error: /private/var/containers/Bundle/Application/04F01BF5-D48B-44EC-A5F6-3C7389CF4856/RizzCanvas.app/faceParsing.mlmodelc/model.espresso.net, processRequest:model:qos:qIndex:modelStringID:options:returnValue:error:: ANEProgramProcessRequestDirect() Failed with status=0x1d : statusType=0x9: Program Inference error status=-1
Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).
Error Domain=com.apple.Vision Code=3 "The VNCoreMLTransform request failed" UserInfo={NSLocalizedDescription=The VNCoreMLTransform request failed, NSUnderlyingError=0x114d92940 {Error Domain=com.apple.CoreML Code=0 "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1)." UserInfo={NSLocalizedDescription=Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -1).}}}
The developer tutorial for visual intelligence indicates that the method to detect and handle taps on a displayed entity from the Search section is via an "OpenIntent" associated with your entity.
However, running this intent executes code from within my app. If I have the perform() method display UI, it always displays UI from within my app.
I noticed that the Google app's integration to visual intelligence has a different behavior-- tapping on an entity does not take you to the Google app -- instead, a Webview is presented sheet-style WITHIN the Visual Intelligence environment (see below)
How is that accomplished?
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hardware: Macbook Pro M4 Nov 2024
Software: macOS Tahoe 26.0 & xcode 26.0
Apple Intelligence is activated and the Image playground macOS app works
Running the following on xcode throws ImagePlayground.ImageCreator.Error.creationFailed
Any suggestions on how to make this work?
import Foundation
import ImagePlayground
Task {
let creator = try await ImageCreator()
guard let style = creator.availableStyles.first else {
print("No styles available")
exit(1)
}
let images = creator.images(
for: [.text("A cat wearing mittens.")],
style: style,
limit: 1)
for try await image in images {
print("Generated image: \(image)")
}
exit(0)
}
RunLoop.main.run()
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Hi,
I'm trying to use the new RecognizeDocumentsRequest from the Vision Framework to read a receipt. It looks very promising by being able to read paragraphs, lines and detect data. So far it unfortunately seems to read every line on the receipt as a paragraph and when there is more space on one line it creates two paragraphs.
Is there perhaps an Apple Engineer who knows if this is expected behaviour or if I should file a Feedback for this?
Code setup:
let request = RecognizeDocumentsRequest()
let observations = try await request.perform(on: image)
guard let document = observations.first?.document else {
return
}
for paragraph in document.paragraphs {
print(paragraph.transcript)
for data in paragraph.detectedData {
switch data.match.details {
case .phoneNumber(let data):
print("Phone: \(data)")
case .postalAddress(let data):
print("Postal: \(data)")
case .calendarEvent(let data):
print("Calendar: \(data)")
case .moneyAmount(let data):
print("Money: \(data)")
case .measurement(let data):
print("Measurement: \(data)")
default:
continue
}
}
}
See attached image as an example of a receipt I'd like to parse. The top 3 lines are the name, street, and postal code + city. These are all separate paragraphs. Checking on detectedData does see the street (2nd line) as PostalAddress, but not the complete address. Might that be a location thing since it's a Dutch address.
And lower on the receipt it sees the block with "Pomp 1 95 Ongelood" and the things below also as separate paragraphs. First picking up the left side and after that the right side. So it's something like this:
*
Pomp 1
Volume
Prijs
€
TOTAAL
*
BTW
Netto
21.00 %
95 Ongelood
41,90 l
1.949/ 1
81.66
€
14.17
67.49
Is the face and body detection service in the Vision framework a local model or a cloud model?
https://developer.apple.com/documentation/vision