Hi,
I'm testing DockKit with a very simple setup:
I use VNDetectFaceRectanglesRequest to detect a face and then call dockAccessory.track(...) using the detected bounding box.
The stand is correctly docked (state == .docked) and dockAccessory is valid.
I'm calling .track(...) with a single observation and valid CameraInformation (including size, device, orientation, etc.). No errors are thrown.
To monitor this, I added a logging utility – track(...) is being called 10–30 times per second, as recommended in the documentation.
However: the stand does not move at all.
There is no visible reaction to the tracking calls.
Is there anything I'm missing or doing wrong?
Is VNDetectFaceRectanglesRequest supported for DockKit tracking, or are there hidden requirements?
Would really appreciate any help or pointers – thanks!
That's my complete code:
extension VideoFeedViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
guard let frame = CMSampleBufferGetImageBuffer(sampleBuffer) else {
return
}
detectFace(image: frame)
func detectFace(image: CVPixelBuffer) {
let faceDetectionRequest = VNDetectFaceRectanglesRequest() { vnRequest, error in
guard let results = vnRequest.results as? [VNFaceObservation] else {
return
}
guard let observation = results.first else {
return
}
let boundingBoxHeight = observation.boundingBox.size.height * 100
#if canImport(DockKit)
if let dockAccessory = self.dockAccessory {
Task {
try? await trackRider(
observation.boundingBox,
dockAccessory,
frame,
sampleBuffer
)
}
}
#endif
}
let imageResultHandler = VNImageRequestHandler(cvPixelBuffer: image, orientation: .up)
try? imageResultHandler.perform([faceDetectionRequest])
func combineBoundingBoxes(_ box1: CGRect, _ box2: CGRect) -> CGRect {
let minX = min(box1.minX, box2.minX)
let minY = min(box1.minY, box2.minY)
let maxX = max(box1.maxX, box2.maxX)
let maxY = max(box1.maxY, box2.maxY)
let combinedWidth = maxX - minX
let combinedHeight = maxY - minY
return CGRect(x: minX, y: minY, width: combinedWidth, height: combinedHeight)
}
#if canImport(DockKit)
func trackObservation(_ boundingBox: CGRect, _ dockAccessory: DockAccessory, _ pixelBuffer: CVPixelBuffer, _ cmSampelBuffer: CMSampleBuffer) throws {
// Zähle den Aufruf
TrackMonitor.shared.trackCalled()
let invertedBoundingBox = CGRect(
x: boundingBox.origin.x,
y: 1.0 - boundingBox.origin.y - boundingBox.height,
width: boundingBox.width,
height: boundingBox.height
)
guard let device = captureDevice else {
fatalError("Kamera nicht verfügbar")
}
let size = CGSize(width: Double(CVPixelBufferGetWidth(pixelBuffer)),
height: Double(CVPixelBufferGetHeight(pixelBuffer)))
var cameraIntrinsics: matrix_float3x3? = nil
if let cameraIntrinsicsUnwrapped = CMGetAttachment(
sampleBuffer,
key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
attachmentModeOut: nil
) as? Data {
cameraIntrinsics = cameraIntrinsicsUnwrapped.withUnsafeBytes { $0.load(as: matrix_float3x3.self) }
}
Task {
let orientation = getCameraOrientation()
let cameraInfo = DockAccessory.CameraInformation(
captureDevice: device.deviceType,
cameraPosition: device.position,
orientation: orientation,
cameraIntrinsics: cameraIntrinsics,
referenceDimensions: size
)
let observation = DockAccessory.Observation(
identifier: 0,
type: .object,
rect: invertedBoundingBox
)
let observations = [observation]
guard let image = CMSampleBufferGetImageBuffer(sampleBuffer) else {
print("no image")
return
}
do {
try await dockAccessory.track(observations, cameraInformation: cameraInfo)
} catch {
print(error)
}
}
}
#endif
func clearDrawings() {
boundingBoxLayer?.removeFromSuperlayer()
boundingBoxSizeLayer?.removeFromSuperlayer()
}
}
}
}
@MainActor
private func getCameraOrientation() -> DockAccessory.CameraOrientation {
switch UIDevice.current.orientation {
case .portrait:
return .portrait
case .portraitUpsideDown:
return .portraitUpsideDown
case .landscapeRight:
return .landscapeRight
case .landscapeLeft:
return .landscapeLeft
case .faceDown:
return .faceDown
case .faceUp:
return .faceUp
default:
return .corrected
}
}
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I just watched the October 30 MacBook Pro Announcement where they talked about on-device local LLMs for the M4s.
What developer training resources are available, where we can learn how to use custom llm models and build our Swift apps to use both Apple Intelligence and other llm models on device?
Is the guidance to follow MLX github repos, or were those experimental and now there is an approved workflow and tooling?
https://www.youtube.com/watch?v=G0cmfY7qdmY
Hey Chat,
I'm researching personality analysis using LLMs, and I'm curious about whether Apple’s AI can be allowed access to your messages, Instagram DMs, and similar communications to perform a personality analysis based on your writing style. If anyone has insights on this, I would greatly appreciate your input. Thx a ton
Hello,
I am currently developing an application that requires barcode scanning using Apple’s Vision framework (VNBarcodeSymbology). I noticed that the framework supports several GS1 DataBar symbologies, such as:
VNBarcodeSymbology.gs1DataBar
VNBarcodeSymbology.gs1DataBarExpanded
VNBarcodeSymbology.gs1DataBarLimited
However, I could not find any explicit reference to support for GS1 DataBar Stacked (both regular and expanded variants).
Could you confirm whether GS1 DataBar Stacked is currently supported in VisionKit's DataScannerViewController or VNBarcodeObservation? If not, are there any plans to include support for this symbology in a future iOS update?
This functionality is critical for my use case, as GS1 DataBar Stacked barcodes are widely used in retail, pharmaceuticals, and logistics, where space constraints prevent the use of standard GS1 DataBar formats.
I appreciate any clarification on this matter and would be happy to provide additional details if needed.
Hi everyone,
I’m currently exploring the use of Foundation models on Apple platforms to build a chatbot-style assistant within an app. While the integration part is straightforward using the new FoundationModel APIs, I’m trying to figure out how to control the assistant’s responses more tightly — particularly:
Ensuring the assistant adheres to a specific tone, context, or domain (e.g. hospitality, healthcare, etc.)
Preventing hallucinations or unrelated outputs
Constraining responses based on app-specific rules, structured data, or recent interactions
I’ve experimented with prompt, systemMessage, and few-shot examples to steer outputs, but even with carefully generated prompts, the model occasionally produces incorrect or out-of-scope responses.
Additionally, when using multiple tools, I'm unsure how best to structure the setup so the model can select the correct pathway/tool and respond appropriately. Is there a recommended approach to guiding the model's decision-making when several tools or structured contexts are involved?
Looking forward to hearing your thoughts or being pointed toward related WWDC sessions, Apple docs, or sample projects.
Due to our min iOS version, this is my first time using .xcstrings instead of .strings for AppShortcuts.
When using the migrate .strings to .xcstrings Xcode context menu option, an .xcstrings catalog is produced that, as expected, has each invocation phrase as a separate string key.
However, after compilation, the catalog changes to group all invocation phrases under the first phrase listed for each intent (see attached screenshot). It is possible to hover in blank space on the right and add more translations, but there is no 1:1 key matching requirement to the phrases on the left nor a requirement that there are the same number of keys in one language vs. another. (The lines just happen to align due to my window size.)
What does that mean, practically?
Do all sub-phrases in each language in AppShortcuts.xcstrings get processed during compilation, even if there isn't an equivalent phrase key declared in the AppShortcut (e.g., the ja translation has more phrases than the English)? (That makes some logical sense, as these phrases need not be 1:1 across languages.)
In the AppShortcut declaration, if I delete all but the top invocation phrase, does nothing change with Siri?
Is there something I'm doing incorrectly?
struct WatchShortcuts: AppShortcutsProvider {
static var appShortcuts: [AppShortcut] {
AppShortcut(
intent: QuickAddWaterIntent(),
phrases: [
"\(.applicationName) log water",
"\(.applicationName) log my water",
"Log water in \(.applicationName)",
"Log my water in \(.applicationName)",
"Log a bottle of water in \(.applicationName)",
],
shortTitle: "Log Water",
systemImageName: "drop.fill"
)
}
}
What are the major differences in review process for AI based apps vis a vis normal apps for Apple store?
I have been working on a small CV program, which uses fine-tuned U2Netp model converted by coremltools 8.3.0 from PyTorch.
It works well on my iPhone (with iOS version 18.5) and my Macbook (with MacOS version 15.3.1). But it fails to load after I upgraded Macbook to MacOS version 15.5.
I have attached console log when loading this model.
Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage @ GetMPSGraphExecutable
E5RT: Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage (13)
Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage @ GetMPSGraphExecutable
E5RT: Unable to load MPSGraphExecutable from path /Users/yongzhang/Library/Caches/swiftmetal/com.apple.e5rt.e5bundlecache/24F74/E051B28C6957815C140A86134D673B5C015E79A1460E9B54B8764F659FDCE645/16FA8CF2CDE66C0C427F4B51BBA82C38ACC44A514CCA396FD7B281AAC087AB2F.bundle/H14C.bundle/main/main_mps_graph/main_mps_graph.mpsgraphpackage (13)
Failure translating MIL->EIR network: Espresso exception: "Network translation error": MIL->EIR translation error at /Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil:1557:12: Parameter binding for axes does not exist.
[Espresso::handle_ex_plan] exception=Espresso exception: "Network translation error": MIL->EIR translation error at /Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil:1557:12: Parameter binding for axes does not exist. status=-14
Failed to build the model execution plan using a model architecture file '/Users/yongzhang/CLionProjects/ImageSimilarity/models/compiled/u2netp.mlmodelc/model.mil' with error code: -14.
Topic:
Machine Learning & AI
SubTopic:
Create ML
The simulator should support the new Siri Ul and Apple Intelligence features, at least they should work if Apple Intelligence is enabled on the Mac itself.
Feedback ID: FB15699827
I am writing a custom package wrapping Foundation Models which provides a chain-of-thought with intermittent self-evaluation among other things. At first I was designing this package with the command line in mind, but after seeing how well it augments the models and makes them more intelligent I wanted to try and build a SwiftUI wrapper around the package.
When I started I was using synchronous generation rather than streaming, but to give the best user experience (as I've seen in the WWDC sessions) it is necessary to provide constant feedback to the user that something is happening.
I have created a super simplified example of my setup so it's easier to understand.
First, there is the Reasoning conversation item, which can be converted to an XML representation which is then fed back into the model (I've found XML works best for structured input)
public typealias ConversationContext = XMLDocument
extension ConversationContext {
public func toPlainText() -> String {
return xmlString(options: [.nodePrettyPrint])
}
}
/// Represents a reasoning item in a conversation, which includes a title and reasoning content.
/// Reasoning items are used to provide detailed explanations or justifications for certain decisions or responses within a conversation.
@Generable(description: "A reasoning item in a conversation, containing content and a title.")
struct ConversationReasoningItem: ConversationItem {
@Guide(description: "The content of the reasoning item, which is your thinking process or explanation")
public var reasoningContent: String
@Guide(description: "A short summary of the reasoning content, digestible in an interface.")
public var title: String
@Guide(description: "Indicates whether reasoning is complete")
public var done: Bool
}
extension ConversationReasoningItem: ConversationContextProvider {
public func toContext() -> ConversationContext {
// <ReasoningItem title="${title}">
// ${reasoningContent}
// </ReasoningItem>
let root = XMLElement(name: "ReasoningItem")
root.addAttribute(XMLNode.attribute(withName: "title", stringValue: title) as! XMLNode)
root.stringValue = reasoningContent
return ConversationContext(rootElement: root)
}
}
Then there is the generator, which creates a reasoning item from a user query and previously generated items:
struct ReasoningItemGenerator {
var instructions: String {
"""
<omitted for brevity>
"""
}
func generate(from input: (String, [ConversationReasoningItem])) async throws -> sending LanguageModelSession.ResponseStream<ConversationReasoningItem> {
let session = LanguageModelSession(instructions: instructions)
// build the context for the reasoning item out of the user's query and the previous reasoning items
let userQuery = "User's query: \(input.0)"
let reasoningItemsText = input.1.map { $0.toContext().toPlainText() }.joined(separator: "\n")
let context = userQuery + "\n" + reasoningItemsText
let reasoningItemResponse = try await session.streamResponse(
to: context, generating: ConversationReasoningItem.self)
return reasoningItemResponse
}
}
I'm not sure if returning LanguageModelSession.ResponseStream<ConversationReasoningItem> is the right move, I am just trying to imitate what session.streamResponse returns.
Then there is the orchestrator, which I can't figure out. It receives the streamed ConversationReasoningItems from the Generator and is responsible for streaming those to SwiftUI later and also for evaluating each reasoning item after it is complete to see if it needs to be regenerated (to keep the model on-track). I want the users of the orchestrator to receive partially generated reasoning items as they are being generated by the generator. Later, when they finish, if the evaluation passes, the item is kept, but if it fails, the reasoning item should be removed from the stream before a new one is generated. So in-flight reasoning items should be outputted aggresively.
I really am having trouble figuring this out so if someone with more knowledge about asynchronous stuff in Swift, or- even better- someone who has worked on the Foundation Models framework could point me in the right direction, that would be awesome!
I was working on my project and when I tried to train a model the kernel crashed, so I restarted the kernel and tried the same and still I got the same crashing issue. Then I read one of the thread having the same issue where the apple support was saying to install tensorflow-macos and tensorflow-metal and read the guide from this site:
https://developer.apple.com/metal/tensorflow-plugin/
and I did so, I tried every single thing and when I tried the test code provided in the site, I got the same error, here's the code and the output.
Code:
import tensorflow as tf
cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()
model = tf.keras.applications.ResNet50(
include_top=True,
weights=None,
input_shape=(32, 32, 3),
classes=100,)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)
and here's the output:
Epoch 1/5
The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.
And here's the half of log file as it was not fully coming:
metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1
2024-10-06 23:30:49.894405: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 8.00 GB
2024-10-06 23:30:49.894420: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 2.67 GB
2024-10-06 23:30:49.894444: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-10-06 23:30:49.894460: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
2024-10-06 23:30:56.701461: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
[libprotobuf FATAL google/protobuf/message_lite.cc:353] CHECK failed: target + size == res:
libc++abi: terminating due to uncaught exception of type google::protobuf::FatalException: CHECK failed: target + size == res:
Please respond to this post as soon as possible as I am working on my project now and getting this error again n again.
Device: Apple MacBook Air M1.
When using CoreML for VAE model prediction, the prediction result shows a distorted display with no error messages. How can this issue be addressed?
Faz mais de três dias estou aguardando o app playgrounds para testar essas funcionalidades Da Apple intelligence. sem falar que atualizei o sistema e o app playgrounds não apareceu no meu iPhone 15 pro. alguém sabe me dizer onde eu baixo o app playgrounds??
I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing!
I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes the end of one run is always identical to the start of the next run, even if there's a pause.
I'd like to identify pauses, perhaps to generate something like phrases for subtitling. With each run of text going into the next I can't do this, other than using punctuation - which might be rather rough.
Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber?
Here's a short sample, showing each run with the start, end, and characters in the run:
105.9 --> 107.04 I
107.04 --> 107.16 think
107.16 --> 108.0 more
108.0 --> 108.42 lighting
108.42 --> 108.6 is
108.6 --> 108.72 definitely
108.72 --> 109.2 needed,
109.2 --> 109.92 downtown.
109.98 --> 110.4 My
110.4 --> 110.52 only
110.52 --> 110.7 question
110.7 --> 111.06 is,
111.06 --> 111.48 poll
111.48 --> 111.78 five,
111.78 --> 111.84 that
111.84 --> 112.08 you're
112.08 --> 112.38 increasing
112.38 --> 112.5 the
112.5 --> 113.34 50,000?
113.4 --> 113.58 Where
113.58 --> 113.88 exactly
I would like to make use of create ML to classify a motion. However, it seems it requires 2 classes at least to train or test it. What should I do as I only has 1 class (the target motion).
Also, how to interpret the 'Recall' and 'F1 Score'
Topic:
Machine Learning & AI
SubTopic:
Create ML
Apple intelligence not working at all. Was working ok under 18.1 but nothing under 18.2??
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
Where can I find an example of using this MPSGraph function? I'm trying to use it to paste an image into a larger canvas at certain coordinates.
func sliceUpdateDataTensor(
_ dataTensor: MPSGraphTensor,
update updateTensor: MPSGraphTensor,
starts: [NSNumber],
ends: [NSNumber],
strides: [NSNumber],
startMask: UInt32,
endMask: UInt32,
squeezeMask: UInt32,
name: String?
) -> MPSGraphTensor
Hi,
I'm working on training a createML object detector model; I've run into an issue that has me stumped - when I reach somewhere between 100,000 and 150,000 iterations my model will stop training and error out.
More Details:
CreateML gives me the error prompt that says it is unable to train the model please delete the model source and start from the beginning or duplicate the model and start from the beginning (slightly paraphrased)
I see the following error in the createML console (my user name and UUIDs have been redacted)
Unable to load model from file:///Users/<my user name>/Library/Caches/com.apple.dt.createml/projects/<UUID HERE>/sessions/checkpoint.sessions/<UUID Here>//training-000132500.checkpoint: Cannot open file:///Users/<my user name>/Library/Caches/com.apple.dt.createml/projects/<UUID Here>/sessions/checkpoint.sessions/<uuid here> //training-000132500.checkpoint/dir_archive.ini for read. Cannot open /Users/<my username>/Library/Caches/com.apple.dt.createml/projects/<UUID>/sessions/checkpoint.sessions/<UUID>//training-000132500.checkpoint/dir_archive.ini for reading
I've gone into my Caches in my Library directory and I see each piece of the file path in finder UNTIL the //training-00132500 piece of the path, so I can at least confirm that createML appears to be unable to create or open the file it needs for this training session.
Technology Used:
Xcode 16
Apple M1 Pro
MacOS 14.6.1 (23G93)
I've also verified that Xcode and terminal have full disk permissions in my system preferences - I didn't see an option to add CreateML to this list.
I've also ensured that my createML project and its data sources are not in iCloud and are indeed local on my desktop.
Lastly, I made more space on my machine, so I should have a little over 1 TB of space.
Has anybody experienced this before? Any advice? I am majorly blocked on this issue, so I hope somebody else can help shed some light on this issue!
Thanks!
Topic:
Machine Learning & AI
SubTopic:
Create ML
Hi everyone,
Could someone confirm if it's currently possible, or if there are any plans, to restrict users from enabling Apple Intelligence altogether?
I understand that we can block individual features using MDM, but I'm interested in knowing if we can prevent users from toggling Apple Intelligence on and off in System Settings entirely.
Thanks!
Kind Regards,
Filipe Nogueira
Topic:
Machine Learning & AI
SubTopic:
Apple Intelligence
I followed below url for converting Llama-3.1-8B-Instruct model but always fails even i have 64GB of free space after downloading model from huggingface.
https://machinelearning.apple.com/research/core-ml-on-device-llama
Also tried with other models Llama-3.1-1B-Instruct & Llama-3.1-3B-Instruct models those are converted but while doing performance test in xcode fails for all compunits.
Is there any source code to run llama models in ios app.