I am following this tutorial:
https://apple.github.io/coremltools/docs-guides/source/convert-a-torchvision-model-from-pytorch.html
I have obtained similar results using the Python code.
However, when I view the model in Xcode, the confidence percentages in the preview prediction are way off. I suspect this is because the model's output is already a percentage and Xcode multiplies it by 100 again. Please give me any feedback on how to fix this. Thank you.
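One quick way to test that suspicion is to check whether the values the model emits sum to roughly 1 (probabilities) or roughly 100 (percentages). The following is a minimal sketch in plain Python with made-up logits; `softmax` and `looks_like_percent` are hypothetical helper names, and the sketch does not reproduce how Xcode's preview formats values, which is the assumption being tested.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def looks_like_percent(scores, tol=1e-3):
    """Heuristic: scores summing to ~100 rather than ~1 are probably percentages."""
    return abs(sum(scores) - 100.0) < tol * 100.0

logits = [2.0, 1.0, 0.1]             # illustrative values only
probs = softmax(logits)              # each value lies in [0, 1]
percents = [p * 100 for p in probs]  # what a model that scales by 100 would emit

print(looks_like_percent(probs))     # probabilities sum to ~1
print(looks_like_percent(percents))  # percentages sum to ~100
```

If the converted model's class scores sum to about 100, removing the scaling step before conversion (so the model emits values in the 0 to 1 range) would be the first thing to try.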
I am running some experiments with WebGPU using the wgpu crate in Rust. I have some buffers already allocated on the GPU.
Is it possible to use those existing buffers directly as inputs to a predict call in Core ML? I want to avoid GPU-to-CPU download time as much as possible.
Or are there other ways to do something like this? Is this only possible using the new Tensor object that came out with Metal 4?
We are really excited to have introduced the Foundation Models framework in WWDC25. When using the framework, you might have feedback about how it can better fit your use cases.
Starting in macOS/iOS 26 Beta 4, the best way to provide feedback is to use #Playground in Xcode. To do so:
In Xcode, create a playground using #Playground. For more information, see Running code snippets using the playground macro.
Reproduce the issue by setting up a session and generating a response with your prompt.
In the canvas on the right, click the thumbs-up icon to the right of the response.
Follow the instructions on the pop-up window and submit your feedback by clicking Share with Apple.
Another way to provide your feedback is to file a feedback report with relevant details. Specific to the Foundation Models framework, it’s super important to add the following information in your report:
Language model feedback
This feedback contains the session transcript, including the instructions, the prompts, the responses, etc. Without that, we can’t reason about the model’s behavior, and hence can hardly take any action.
Use logFeedbackAttachment(sentiment:issues:desiredOutput:) to retrieve the feedback data of your current model session, as shown in the usage example, write the data into a file, and then attach the file to your feedback report.
If you believe what you’d report is related to the system configuration, please capture a sysdiagnose and attach it to your feedback report as well.
The framework is still new. Your actionable feedback helps us evolve the framework quickly, and we appreciate that.
Thanks,
The Foundation Models framework team
Topic: Machine Learning & AI; SubTopic: Foundation Models
I keep looking through the documentation and sample code but still haven’t found an example of how this type of Network Regressor is used.
Does it take any special parameters to run on the ANE? What size and format of DataFrame does it expect?
The developer tutorial for visual intelligence indicates that the method to detect and handle taps on a displayed entity from the Search section is via an "OpenIntent" associated with your entity.
However, running this intent executes code from within my app. If I have the perform() method display UI, it always displays UI from within my app.
I noticed that the Google app's integration with visual intelligence behaves differently: tapping on an entity does not take you to the Google app. Instead, a web view is presented sheet-style WITHIN the Visual Intelligence environment (see below).
How is that accomplished?
Topic: Machine Learning & AI; SubTopic: Apple Intelligence
I’m trying to group my EntityPropertyQuery selection into sections as well as making it searchable.
I know that the EntityStringQuery is used to perform the text search via entities(matching string: String). That works well enough and results in this modal:
Though, when I’m using a DynamicOptionsProvider to section my EntityPropertyQuery, it doesn’t allow for searching anymore and simply opens the sectioned list in a menu like so:
How can I combine both? I’ve seen it in other apps, but I can’t figure out why my code can’t both section the results and keep them searchable. Any ideas?
My code (simplified)
struct MyIntent: AppIntent {
    @Parameter(title: "Meter",
               optionsProvider: MyOptionsProvider())
    var meter: MyIntentEntity?
    // …
}

struct MyOptionsProvider: DynamicOptionsProvider {
    func results() async throws -> ItemCollection<MyIntentEntity> {
        // Get all data
        let allData = try IntentsDataHandler.shared.getEntities()
        // Create arrays for the sections
        let fooEntities = allData.filter { $0.type == .foo }
        let barEntities = allData.filter { $0.type == .bar }
        return ItemCollection(sections: [
            ItemSection("Foo",
                        items: fooEntities),
            ItemSection("Bar",
                        items: barEntities)
        ])
    }
}

struct MeterIntentQuery: EntityStringQuery {
    // entities(for identifiers: [UUID]) and suggestedEntities() functions
    func entities(matching string: String) async throws -> [MyIntentEntity] {
        // Fetch all data
        let allData = try IntentsDataHandler.shared.getEntities()
        // Filter data by the search string
        let matchingData = allData.filter { data in
            return data.title.localizedCaseInsensitiveContains(string)
        }
        return matchingData
    }
}
Hello,
I am currently developing an application that requires barcode scanning using Apple’s Vision framework (VNBarcodeSymbology). I noticed that the framework supports several GS1 DataBar symbologies, such as:
VNBarcodeSymbology.gs1DataBar
VNBarcodeSymbology.gs1DataBarExpanded
VNBarcodeSymbology.gs1DataBarLimited
However, I could not find any explicit reference to support for GS1 DataBar Stacked (both regular and expanded variants).
Could you confirm whether GS1 DataBar Stacked is currently supported in VisionKit's DataScannerViewController or VNBarcodeObservation? If not, are there any plans to include support for this symbology in a future iOS update?
This functionality is critical for my use case, as GS1 DataBar Stacked barcodes are widely used in retail, pharmaceuticals, and logistics, where space constraints prevent the use of standard GS1 DataBar formats.
I appreciate any clarification on this matter and would be happy to provide additional details if needed.
After a few attempts, I end up at the same spot, as characterized here by ChatGPT:
I get why it feels that way, Rob — honestly, you’ve walked right into the worst possible moment in Apple’s Core ML toolchain evolution. What you’re running into isn’t your fault; Apple really did rip out the ONNX front-end without offering a clean replacement yet. So every tutorial, every script, and even coremltools itself contradicts what’s actually possible right now.
Let’s be brutally clear about the state of things (late 2025):
Goal | Supported by Apple today | Practical workaround
Convert ONNX → Core ML using coremltools 9 | No. The ONNX importer was removed. | Use the legacy coremltools 6.x + onnx-coreml stack in a separate Python 3.9/3.10 env.
Convert PyTorch → Core ML directly | Yes, but only for simple or traced models. T5 still fails because of dynamic control flow. | None reliable without rewriting the model.
Run ONNX models directly on iOS | Partially (via MLC LLM / Metal Performance Shaders). | Works, but that’s an entirely different runtime.
So: you’re not crazy, and this isn’t wasted effort—you’ve actually proven that your model exports cleanly and that the only missing piece is Apple’s converter. The ONNX file you built is solid. You just need something that can translate it to Core ML.
If your goal is simply “get it running on-device”, you still have two viable moves that aren’t rabbit holes:
1. Use MLC LLM / Metal runtime
Skip Core ML altogether. Apple’s mlc-llm (and the “MLCChat” pipeline) can execute your ONNX or quantized model directly on the GPU. You keep full fidelity and avoid all these broken converters.
2. Use the legacy Core ML 6.x toolchain in a sandboxed Python 3.10 VM
It actually works. You don’t have to downgrade your whole system—just run that specific environment once to emit the .mlpackage, then never touch it again.
Nothing else will silently fix itself until Apple re-adds ONNX ingestion in Core ML 10 or a new converter.
Topic: Machine Learning & AI; SubTopic: Core ML
Hi team,
We have implemented a writing tool inside a WebView that allows users to type content in a textarea. When the "Show Writing Tools" button is clicked, an AI-powered editor opens. After clicking the "Rewrite" button, the AI modifies the text. However, when clicking the "Replace" button, the rewritten text does not update the original textarea.
Could you please take a look and help?
showButton.addTarget(self, action: #selector(showWritingTools(_:)), for: .touchUpInside)
@available(iOS 18.2, *)
optional func showWritingTools(_ sender: Any)
Note: The same cases work in a TextView. Please find the attachment.
I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing!
I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes the end of one run is always identical to the start of the next run, even if there's a pause.
I'd like to identify pauses, perhaps to generate something like phrases for subtitling. With each run of text going into the next I can't do this, other than using punctuation - which might be rather rough.
Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber?
Here's a short sample, showing each run with the start, end, and characters in the run:
105.9 --> 107.04 I
107.04 --> 107.16 think
107.16 --> 108.0 more
108.0 --> 108.42 lighting
108.42 --> 108.6 is
108.6 --> 108.72 definitely
108.72 --> 109.2 needed,
109.2 --> 109.92 downtown.
109.98 --> 110.4 My
110.4 --> 110.52 only
110.52 --> 110.7 question
110.7 --> 111.06 is,
111.06 --> 111.48 poll
111.48 --> 111.78 five,
111.78 --> 111.84 that
111.84 --> 112.08 you're
112.08 --> 112.38 increasing
112.38 --> 112.5 the
112.5 --> 113.34 50,000?
113.4 --> 113.58 Where
113.58 --> 113.88 exactly
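As a stopgap, the timings themselves can hint at pauses. The sketch below is a heuristic in plain Python, not a SpeechTranscriber feature: it flags a run when it lasts unusually long (the pause may have been absorbed into the word) or when the next run starts noticeably later than this one ends. The function name and both thresholds are assumptions chosen for this sample and would need tuning on real recordings.

```python
# Runs copied from the sample above: (start_seconds, end_seconds, text).
RUNS = [
    (105.9, 107.04, "I"),
    (107.04, 107.16, "think"),
    (107.16, 108.0, "more"),
    (108.0, 108.42, "lighting"),
    (108.42, 108.6, "is"),
    (108.6, 108.72, "definitely"),
    (108.72, 109.2, "needed,"),
    (109.2, 109.92, "downtown."),
    (109.98, 110.4, "My"),
    (110.4, 110.52, "only"),
]

def suspect_pauses(runs, max_word_seconds=0.6, min_gap_seconds=0.01):
    """Return the text of runs that probably contain or precede a pause.

    A run is flagged when it lasts longer than max_word_seconds or when the
    next run starts more than min_gap_seconds after this one ends.
    Both thresholds are guesses, not values from the Speech framework.
    """
    flagged = []
    for i, (start, end, text) in enumerate(runs):
        duration = end - start
        # The last run has no successor, so treat its trailing gap as zero.
        next_start = runs[i + 1][0] if i + 1 < len(runs) else end
        gap = next_start - end
        if duration > max_word_seconds or gap > min_gap_seconds:
            flagged.append(text)
    return flagged

print(suspect_pauses(RUNS))
```

On this excerpt the heuristic flags "I", "more", and "downtown.". A long run does not say whether the silence came before or after the word, so the result is only a candidate list for phrase breaks.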
I got 3203.23 GFLOPS (FP16) on the M3 MacBook Pro and only 2833.24 GFLOPS (FP16) on the M4 MacBook Air for 4096x4096 matrix multiplications in a PyTorch MPS FP16 benchmark. Wasn't the performance supposed to be twice as high on the M4 compared to the M3, even with the thermal throttling on the MacBook Air? What went wrong?
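For reference, the arithmetic behind figures like these can be sketched as follows. This assumes the benchmark counts 2·n³ floating-point operations per n×n matrix multiply, which is the usual convention but is not confirmed by the post; the helper names are hypothetical.

```python
def matmul_flops(n):
    """Floating-point operations for one n x n by n x n multiply (2 * n^3)."""
    return 2 * n ** 3

def gflops(n, iterations, elapsed_seconds):
    """Throughput in GFLOPS for `iterations` multiplies in `elapsed_seconds`."""
    return matmul_flops(n) * iterations / elapsed_seconds / 1e9

def seconds_per_matmul(n, measured_gflops):
    """Invert the formula: time one multiply must have taken."""
    return matmul_flops(n) / (measured_gflops * 1e9)

m3 = seconds_per_matmul(4096, 3203.23)  # M3 MacBook Pro figure from the post
m4 = seconds_per_matmul(4096, 2833.24)  # M4 MacBook Air figure from the post
print(f"M3: {m3 * 1000:.1f} ms, M4: {m4 * 1000:.1f} ms per multiply, "
      f"M3/M4 time ratio {m3 / m4:.2f}")
```

Working back to per-multiply time makes it easier to see whether warm-up iterations, GPU synchronization before stopping the timer, or thermal state could plausibly account for the gap between the two machines.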
I'm adding Visual Intelligence support to my app, and now want to add a Tip using TipKit to guide users to this feature from within my app. I want to add a Rule to my Tip so that it is shown only on devices where Visual Intelligence is supported (e.g., not iPhone 14 Pro Max).
What is the best way for me to determine availability to set this TipKit rule?
Here's the documentation I'm following for Visual Intelligence: https://developer.apple.com/documentation/visualintelligence/integrating-your-app-with-visual-intelligence
In this online session, you can code along with us as we build generative AI features into a sample app live in Xcode. We'll guide you through implementing core features like basic text generation, as well as advanced topics like guided generation for structured data output, streaming responses for dynamic UI updates, and tool calling to retrieve data or take an action.
Check out these resources to get started:
Download the project files: https://developer.apple.com/events/re...
Explore the code along guide: https://developer.apple.com/events/re...
Join the live Q&A: https://developer.apple.com/videos/pl...
Agenda – All times PDT
10 a.m.: Welcome and Xcode setup
10:15 a.m.: Framework basics, guided generation, and building prompts
11 a.m.: Break
11:10 a.m.: UI streaming, tool calling, and performance optimization
11:50 a.m.: Wrap up
All are welcome to attend the session. To actively code along, you'll need a Mac with Apple silicon that supports Apple Intelligence running the latest release of macOS Tahoe 26 and Xcode 26.
If you have questions after the code along concludes, please share a post here in the forums and engage with the community.
Topic: Machine Learning & AI; SubTopic: Foundation Models
I am writing a custom package wrapping Foundation Models which provides a chain-of-thought with intermittent self-evaluation among other things. At first I was designing this package with the command line in mind, but after seeing how well it augments the models and makes them more intelligent I wanted to try and build a SwiftUI wrapper around the package.
When I started I was using synchronous generation rather than streaming, but to give the best user experience (as I've seen in the WWDC sessions) it is necessary to provide constant feedback to the user that something is happening.
I have created a super simplified example of my setup so it's easier to understand.
First, there is the Reasoning conversation item, which can be converted to an XML representation that is then fed back into the model (I've found XML works best for structured input):
public typealias ConversationContext = XMLDocument

extension ConversationContext {
    public func toPlainText() -> String {
        return xmlString(options: [.nodePrettyPrint])
    }
}

/// Represents a reasoning item in a conversation, which includes a title and reasoning content.
/// Reasoning items are used to provide detailed explanations or justifications for certain decisions or responses within a conversation.
@Generable(description: "A reasoning item in a conversation, containing content and a title.")
struct ConversationReasoningItem: ConversationItem {
    @Guide(description: "The content of the reasoning item, which is your thinking process or explanation")
    public var reasoningContent: String
    @Guide(description: "A short summary of the reasoning content, digestible in an interface.")
    public var title: String
    @Guide(description: "Indicates whether reasoning is complete")
    public var done: Bool
}

extension ConversationReasoningItem: ConversationContextProvider {
    public func toContext() -> ConversationContext {
        // <ReasoningItem title="${title}">
        //   ${reasoningContent}
        // </ReasoningItem>
        let root = XMLElement(name: "ReasoningItem")
        root.addAttribute(XMLNode.attribute(withName: "title", stringValue: title) as! XMLNode)
        root.stringValue = reasoningContent
        return ConversationContext(rootElement: root)
    }
}
Then there is the generator, which creates a reasoning item from a user query and previously generated items:
struct ReasoningItemGenerator {
    var instructions: String {
        """
        <omitted for brevity>
        """
    }

    func generate(from input: (String, [ConversationReasoningItem])) async throws -> sending LanguageModelSession.ResponseStream<ConversationReasoningItem> {
        let session = LanguageModelSession(instructions: instructions)
        // build the context for the reasoning item out of the user's query and the previous reasoning items
        let userQuery = "User's query: \(input.0)"
        let reasoningItemsText = input.1.map { $0.toContext().toPlainText() }.joined(separator: "\n")
        let context = userQuery + "\n" + reasoningItemsText
        let reasoningItemResponse = try await session.streamResponse(
            to: context, generating: ConversationReasoningItem.self)
        return reasoningItemResponse
    }
}
I'm not sure if returning LanguageModelSession.ResponseStream<ConversationReasoningItem> is the right move; I am just trying to imitate what session.streamResponse returns.
Then there is the orchestrator, which I can't figure out. It receives the streamed ConversationReasoningItems from the generator and is responsible for streaming those to SwiftUI later, and also for evaluating each reasoning item after it is complete to see if it needs to be regenerated (to keep the model on track). I want the users of the orchestrator to receive partially generated reasoning items as they are being generated by the generator. Later, when an item finishes, it is kept if the evaluation passes; if the evaluation fails, the reasoning item should be removed from the stream before a new one is generated. So in-flight reasoning items should be output aggressively.
I am really having trouble figuring this out, so if someone with more knowledge about asynchronous code in Swift or, even better, someone who has worked on the Foundation Models framework could point me in the right direction, that would be awesome!
The documentation for the Create ML tool ("Building an object detector data source") mentions that there are options for using normalized values instead of pixels and also different anchor point origins ("MLBoundingBoxCoordinatesOrigin") instead of always using "center". However, the JSON format for these does not appear in any examples. Does anyone know the format for these options?
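Until the JSON spelling of those options is confirmed, one way to sidestep the question is to convert annotations to center-anchored pixel coordinates before writing the file. The sketch below does that conversion in plain Python. The `origin` labels ("center", "topLeft") and the function name are hypothetical names for this sketch, not keys from the Create ML format, and the returned dictionary is assumed to match the x, y, width, height layout of the "coordinates" entry in the documented examples.

```python
def to_center_pixels(x, y, width, height, origin="center",
                     normalized=False, image_size=None):
    """Convert one box to center-anchored pixel coordinates.

    origin: "center" or "topLeft" (hypothetical labels for this sketch).
    normalized: True if the inputs are fractions of the image size.
    image_size: (image_width, image_height) in pixels; required when normalized.
    """
    if normalized:
        if image_size is None:
            raise ValueError("image_size is required for normalized boxes")
        image_width, image_height = image_size
        x, width = x * image_width, width * image_width
        y, height = y * image_height, height * image_height
    if origin == "topLeft":
        # Shift the anchor from the top-left corner to the box center.
        x, y = x + width / 2, y + height / 2
    elif origin != "center":
        raise ValueError(f"unsupported origin: {origin}")
    return {"x": x, "y": y, "width": width, "height": height}

# A normalized, top-left-anchored box on a 640x480 image (illustrative values).
box = to_center_pixels(0.25, 0.5, 0.5, 0.25, origin="topLeft",
                       normalized=True, image_size=(640, 480))
print(box)
```

This does not answer what the normalized or alternate-origin JSON looks like; it only produces values in the default form.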
Topic: Machine Learning & AI; SubTopic: Create ML
Hi everyone,
I believe I’ve encountered a potential bug or a hardware alignment limitation in the Core ML Framework / ANE Runtime specifically affecting the new Stateful API (introduced in iOS 18/macOS 15).
The Issue:
A Stateful mlprogram fails to run on the Apple Neural Engine (ANE) if the state tensor dimensions (specifically the width) are not a multiple of 32. The model works perfectly on CPU and GPU, but fails on ANE both during runtime and when generating a Performance Report in Xcode.
Error Message in Xcode UI:
"There was an error creating the performance report Unable to compute the prediction using ML Program. It can be an invalid input data or broken/unsupported model."
Observations:
Case A (Fails): State shape = (1, 3, 480, 270). Prediction fails on ANE.
Case B (Success): State shape = (1, 3, 480, 256). Prediction succeeds on ANE.
This suggests an internal memory alignment or tiling issue within the ANE driver when handling Stateful buffers that don't meet the 32-pixel/element alignment.
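The alignment arithmetic behind that hypothesis can be sketched in a few lines of Python. The multiple of 32 is the observation from the cases above, not a documented ANE constraint, and the helper names are hypothetical.

```python
def round_up(value, multiple=32):
    """Smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple

def padded_state_shape(shape, multiple=32):
    """Pad only the last (width) dimension of an NCHW shape."""
    *leading, width = shape
    return (*leading, round_up(width, multiple))

# Widths mentioned in this post: 270 and 255 fail, 256 works.
for shape in [(1, 3, 480, 270), (1, 3, 480, 255), (1, 3, 480, 256)]:
    padded = padded_state_shape(shape)
    print(shape, "->", padded, "extra columns:", padded[-1] - shape[-1])
```

This only shows how much manual padding each failing shape would need if the 32-element hypothesis holds.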
Reproduction Code (PyTorch + coremltools):
import torch
import torch.nn as nn
import coremltools as ct
import numpy as np

class RNN_Stateful(nn.Module):
    def __init__(self, hidden_shape):
        super(RNN_Stateful, self).__init__()
        # Simple conv to update state
        self.conv1 = nn.Conv2d(3 + hidden_shape[1], hidden_shape[1], kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(hidden_shape[1], 3, kernel_size=3, padding=1)
        self.register_buffer("hidden_state", torch.ones(hidden_shape, dtype=torch.float16))

    def forward(self, imgs):
        self.hidden_state = self.conv1(torch.cat((imgs, self.hidden_state), dim=1))
        return self.conv2(self.hidden_state)

# h=480, w=255 causes ANE failure. w=256 works.
b, ch, h, w = 1, 3, 480, 255
model = RNN_Stateful((b, ch, h, w)).eval()
traced_model = torch.jit.trace(model, torch.randn(b, 3, h, w))
mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input_image", shape=(b, 3, h, w), dtype=np.float16)],
    outputs=[ct.TensorType(name="output", dtype=np.float16)],
    states=[ct.StateType(wrapped_type=ct.TensorType(shape=(b, ch, h, w), dtype=np.float16), name="hidden_state")],
    minimum_deployment_target=ct.target.iOS18,
    convert_to="mlprogram"
)
mlmodel.save("rnn_stateful.mlpackage")
Steps to see the error:
Open the generated .mlpackage in Xcode 16.0+.
Go to the Performance tab and run a test on a device with ANE (e.g., iPhone 15/16 or M-series Mac).
The report will fail to generate with the error mentioned above.
Environment:
OS: macOS 15.2
Xcode: 16.3
Hardware: M4
Has anyone else encountered this 32-pixel alignment requirement for StateType tensors on ANE? Is this a known hardware constraint or a bug in the Core ML runtime?
Any insights or workarounds (other than manual padding) would be appreciated.
The Core ML developer guide recommends saving reusable compiled Core ML models to a permanent location to avoid unnecessary rebuilds when creating a Core ML model instance.
However, there is no location that remains consistent across app updates, since each update changes the UUID associated with the app’s resources path:
/var/mobile/Containers/Data/Application/<UUID>/Library/Application Support/
As a result, Core ML rebuilds models even if they are unchanged and located in the same relative directory within the app’s file structure.
Topic: Machine Learning & AI; SubTopic: Core ML
I'm implementing an App Intent for my iOS app that helps users plan trip activities. It only works when run as a shortcut but not using voice through Siri. There are 2 issues:
The ShortcutsTripEntity will only accept a voice input for a specific trip but not others.
I'm stuck with a throwing error when trying to use requestDisambiguation() on the activity day @Parameter property.
How do I rectify these issues?
This is blocking me from completing a critical feature that lets users quickly plan activities through Siri and Shortcuts.
Expected behavior for trip input: The intent should make Siri accept the spoken trip input from any of the options.
Actual behavior for trip input: Siri only accepts the same trip when spoken but accepts any when selected by click/touch.
Expected behavior for day input: Siri should accept the spoken selected option.
Actual behavior for day input: Siri only accepts input by click/touch, and even then throws an error at runtime.
I'm happy to provide more code, but here's the relevant part:
struct PlanActivityTestIntent: AppIntent {
    @Parameter(
        title: "Trip",
        description: "The trip to plan an activity for",
        default: ShortcutsTripEntity(id: UUID().uuidString, title: "Untitled trip"),
        requestValueDialog: "Which trip would you like to add an activity to?"
    )
    var tripEntity: ShortcutsTripEntity

    @Parameter(title: "Activity Title", description: "The title of the activity", requestValueDialog: "What do you want to do or see?")
    var title: String

    // Declared once; a second `activityDay` declaration would not compile.
    @Parameter(title: "Activity Day", description: "Activity Day", default: ShortcutsItineraryDayEntity(itineraryDay: .init(itineraryId: UUID(), date: .now), timeZoneIdentifier: "UTC"))
    var activityDay: ShortcutsItineraryDayEntity

    func perform() async throws -> some ProvidesDialog {
        // ...other code...
        let tripsStore = TripsStore()
        // load trips and map them to entities
        try? await tripsStore.getTrips()
        let tripsAsEntities = tripsStore.trips.map { trip in
            let id = trip.id ?? UUID()
            let title = trip.title
            return ShortcutsTripEntity(id: id.uuidString, title: title, trip: trip)
        }
        // Ask the user to select a trip. This line doesn't accept a voice answer. Why?
        let selectedTrip = try await $tripEntity.requestDisambiguation(
            among: tripsAsEntities,
            dialog: .init(
                full: "Which of the \(tripsAsEntities.count) trips would you like to add an activity to?",
                supporting: "Select a trip",
                systemImageName: "safari.fill"
            )
        )
        // This line throws an error
        let selectedDay = try await $activityDay.requestDisambiguation(
            among: daysAsEntities,
            dialog: "Which day would you like to plan an activity for?"
        )
    }
}
Here are some related images that might help:
Hi,
I am new to developing on Apple’s platforms, but I want to familiarize myself with Core ML and Core ML Tools. I was watching the WWDC24 video "Bring your machine learning and AI models to Apple Silicon" and was trying to follow along. After multiple attempts and much reading of the documentation, I am still unable to get a coherent script running that converts the Mistral model the host used into a valid Core ML model.
Here is a pastebin of what I have currently:
https://pastebin.com/04cVjF1v
If you require the output as well, please let me know.
Hi, guys. I'm writing about Apple Intelligence and I've reached the point where I have to explain App Intent Domains
https://developer.apple.com/documentation/AppIntents/app-intent-domains
but I noticed that there is a note explaining that these services are not available with Siri. I tried the example provided by Apple at
https://developer.apple.com/documentation/AppIntents/making-your-app-s-functionality-available-to-siri
and I can only make the intents work from the Shortcuts App, but not from Siri.
Is this correct? Are App Intent Domains still not available with Siri?
Thanks
Topic: Machine Learning & AI; SubTopic: Apple Intelligence