Search results for [tags:wwdc20-10111]

67 results found

Post · Replies · Boosts · Views · Activity

Can iOS capture video at 4032×3024 while running a Vision/ML model?
I am new to Swift and iOS development, and I have a question about video capture performance. Is it possible to capture video at a resolution of 4032×3024 while simultaneously running a Vision/ML model on the video stream (e.g., using Vision or Core ML)? I want to know whether iOS devices support capturing video at that resolution, whether the frame rate drops significantly at that scale, and whether it is practical to run a Vision/ML model in real time while recording at such a high resolution. If anyone has experience with high-resolution AVCaptureSession setups, or with combining them with real-time ML processing, I would really appreciate guidance or sample code.
Replies: 1 · Boosts: 0 · Views: 159 · Dec ’25
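A minimal sketch of the kind of setup the question above describes, assuming a back wide-angle camera and an arbitrary lightweight Vision request (the class name and the choice of request are illustrative, not from any sample code): it selects the largest video format the device reports and performs a request per frame. Whether 4032×3024 appears among the supported video formats depends on the hardware; many devices cap video capture well below native photo resolution.

import AVFoundation
import Vision

final class HighResCaptureController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "video.frames")

    func configure() throws {
        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video, position: .back) else { return }
        session.beginConfiguration()
        let input = try AVCaptureDeviceInput(device: device)
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(output) { session.addOutput(output) }
        output.setSampleBufferDelegate(self, queue: queue)
        session.commitConfiguration()

        // Inspect every format the camera supports and keep the largest one.
        if let best = device.formats.max(by: { a, b in
            let da = CMVideoFormatDescriptionGetDimensions(a.formatDescription)
            let db = CMVideoFormatDescriptionGetDimensions(b.formatDescription)
            return da.width * da.height < db.width * db.height
        }) {
            try device.lockForConfiguration()
            device.activeFormat = best   // switches the session to input-priority mode
            device.unlockForConfiguration()
        }
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        // Any Vision request can run here; heavier models will cost frame rate.
        let request = VNDetectFaceRectanglesRequest()
        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right).perform([request])
    }
}

Profiling with the actual model on the target device is the only reliable way to see how far the frame rate drops at a given resolution.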
Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundation models. I have a few questions regarding the availability of the following features. Image input: the June 2025 update mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates), however I can't seem to find any information about using images as the input/prompt for the foundation models. When will this be available? I understand that there are existing Vision ML APIs, but I want image input into a multimodal on-device LLM (VLM) instead, for image-understanding features like "Which player is holding the ball in the image?". Cloud foundation model: when will this be available? Thanks! Clement :)
Replies: 1 · Boosts: 0 · Views: 520 · Sep ’25
Vision Framework - Testing RecognizeDocumentsRequest
How do I test the new RecognizeDocumentsRequest API? Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM I am running the Xcode beta; however, I only have one primary device, and I cannot install beta software on it. Please suggest a strategy for testing. Will the simulator work? The new capability is critical to my application: it is just what I need for structuring document scans and extraction. Thank you.
Replies: 1 · Boosts: 0 · Views: 216 · Jun ’25
Filtering Contours from Vision
Hello, I need help selecting/filtering the contours on an image, and I'm not sure of the best way to do that. The idea is to select/filter for the bottom-left-most contour (see the attached image, please). I will also need the end points, or the court corners, and I need the contour to be a fine, smooth line, i.e., an accurate trace of the court end line and side lines only. Thank you :) I'm also glad to hear other ideas, or another API to determine the lines/corners I need, and happy to discuss by email if that is better/easier; actually I'd prefer that. Thanks.
Replies: 3 · Boosts: 0 · Views: 515 · Jan ’25
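A rough sketch of one way to approach the question above, assuming VNDetectContoursRequest is acceptable (the scoring heuristic and the contrast setting are assumptions, not a confirmed solution): it detects contours and keeps the top-level contour whose points reach closest to the bottom-left corner, in Vision's bottom-left-origin normalized coordinates.

import Vision
import CoreImage

func bottomLeftContour(in image: CIImage) throws -> VNContour? {
    let request = VNDetectContoursRequest()
    request.contrastAdjustment = 2.0      // tune for the court lines
    request.detectsDarkOnLight = true
    try VNImageRequestHandler(ciImage: image).perform([request])
    guard let observation = request.results?.first else { return nil }

    // Score each contour by how close its lowest-left extent is to (0, 0).
    func score(_ contour: VNContour) -> Float {
        let points = contour.normalizedPoints
        guard let minX = points.map({ $0.x }).min(),
              let minY = points.map({ $0.y }).min() else { return .greatestFiniteMagnitude }
        return minX * minX + minY * minY
    }
    return observation.topLevelContours.min { score($0) < score($1) }
}

The selected VNContour exposes normalizedPath and normalizedPoints; simplifying it, for example with polygonApproximation(epsilon:), is one way to recover corner points for the end line and side lines.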
VNCoreMLRequest Callback Not Triggered in Modified Video Classification App
Hi everyone, I'm working on integrating object recognition from live video feeds into my existing app by following Apple's sample code. My original project captures and records video successfully. However, after integrating the Vision-based object detection components (VNCoreMLRequest), no detections occur, and the callback for the request is never triggered. To debug this issue, I've added the following functionality: set up AVCaptureVideoDataOutput for processing video frames, and created a VNCoreMLRequest using my Core ML model. The video recording functionality works as expected, but no object detection happens. I'd like to know: how can I debug this further? Which key debug points or logs could help identify where the issue lies? Have I missed any key configurations? Below is a diff of the modifications I've made to my project for the new feature. Diff of Changes: (Attach the diff provided above) Specific Observations: The captureOutput method is invoked correctly, but there is no output or error from the Vision request…
Replies: 1 · Boosts: 0 · Views: 641 · Nov ’24
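The post's actual diff isn't shown here, so the following is only an illustrative sketch of the wiring that makes the callback fire (class and model names such as FrameClassifier and MyDetector are placeholders): the VNCoreMLRequest completion handler runs only when a VNImageRequestHandler performs the request, so perform(_:) has to be called for each frame inside captureOutput, and the delegate and queue have to be set on the AVCaptureVideoDataOutput.

import AVFoundation
import Vision
import CoreML

final class FrameClassifier: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private lazy var request: VNCoreMLRequest = {
        // "MyDetector" stands in for whatever Core ML model the project bundles.
        let model = try! VNCoreMLModel(for: MyDetector(configuration: MLModelConfiguration()).model)
        let request = VNCoreMLRequest(model: model) { request, error in
            if let error { print("Vision error: \(error)"); return }
            print("Observations: \(request.results ?? [])")
        }
        request.imageCropAndScaleOption = .scaleFill
        return request
    }()

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
        do {
            try handler.perform([self.request])   // without this call the callback never runs
        } catch {
            print("perform failed: \(error)")
        }
    }
}

If perform(_:) is already being called, the usual first checks are printing any error it throws and confirming that the video data output was actually added to the session.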
The Vision request does not work in the simulator with the error "Could not create inference context"
When I use VNGenerateForegroundInstanceMaskRequest from SwiftUI to generate the mask in the simulator, I get the error "Could not create inference context". I then added code to force Vision onto the CPU:

let request = VNGenerateForegroundInstanceMaskRequest()
let handler = VNImageRequestHandler(ciImage: inputImage)
#if targetEnvironment(simulator)
if #available(iOS 18.0, *) {
    // Pick the CPU compute device for the main stage of the request.
    let allDevices = MLComputeDevice.allComputeDevices
    for device in allDevices {
        if device.description.contains("MLCPUComputeDevice") {
            request.setComputeDevice(.some(device), for: .main)
            break
        }
    }
} else {
    // Fallback on earlier versions
    request.usesCPUOnly = true
}
#endif
do {
    try handler.perform([request])
    if let result = request.results?.first {
        let mask = try result.generateScaledMaskForImage(forInstances: result.allInstances, from: handler)
        return CIImage(cvPixelBuffer: mask)
    }
} catch {
    print(error)
}

Even though I force the simulator to run the request on the CPU, it still fails with the same error: Could not create inference context.
Replies: 2 · Boosts: 0 · Views: 1.1k · Sep ’24
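As an aside on the device-selection code above (an alternative pattern, not a confirmed fix for the simulator error): matching on the MLComputeDevice enum avoids comparing description strings.

import CoreML
import Vision

@available(iOS 17.0, *)
func forceCPU(for request: VNRequest) {
    // Walk the available compute devices and pick the CPU one for the main stage.
    for device in MLComputeDevice.allComputeDevices {
        if case .cpu = device {
            request.setComputeDevice(device, for: .main)
            break
        }
    }
}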
Can you match a new photo with existing images?
I'm looking for a solution to take a picture, or point the camera at a piece of clothing, and match that image against an image the user has stored in my app. I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures they store in the database, I think I cannot use pre-trained Core ML models. I would like the matching to be done on device if possible, instead of going to an external service; such a service would probably describe the item based on what the AI sees, but then I cannot match the item with the stored images in the app. Does anyone know if this is possible with frameworks such as Vision or VisionKit?
Replies: 2 · Boosts: 0 · Views: 1.2k · Jul ’24
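One possible on-device approach to the question above is Vision's image feature prints; whether the distances are discriminative enough for clothing photos is an assumption that would need testing. A minimal sketch, where a smaller distance means the stored photo looks more similar to the newly captured one:

import Vision
import CoreImage

func featurePrint(for image: CIImage) throws -> VNFeaturePrintObservation? {
    let request = VNGenerateImageFeaturePrintRequest()
    try VNImageRequestHandler(ciImage: image).perform([request])
    return request.results?.first
}

func distance(between a: CIImage, and b: CIImage) throws -> Float? {
    guard let printA = try featurePrint(for: a),
          let printB = try featurePrint(for: b) else { return nil }
    var distance: Float = 0
    try printA.computeDistance(&distance, to: printB)
    return distance
}

In practice the feature print of each stored image can be computed once and cached alongside the Binary Data object, so only the newly captured photo needs a Vision pass at match time.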
Swift Student Challenge Vision
Hi Developers, I want to create a Vision app in Swift Playgrounds on iPad. However, Vision does not function properly in Swift Playgrounds on iPad or in Xcode Playgrounds; the Vision code only works in a normal Xcode project. So can I submit my Swift Student Challenge 2024 application as a normal Xcode project rather than as an Xcode Playground or Swift Playgrounds file? Thanks :)
Replies: 7 · Boosts: 0 · Views: 1.8k · Feb ’24
Vision Pro & Vision SDK
I'm exploring my Vision Pro and finding it unclear whether I can even achieve things like body pose detection, etc. https://developer.apple.com/videos/play/wwdc2023/111241/ It's clear that I can apply it to images I provide myself, but what about the data coming from the visionOS SDKs? All I can find is this mesh data from ARKit, https://developer.apple.com/documentation/arkit/arkit_in_visionos. Am I missing something, or do we not yet have good APIs for this? Appreciate any guidance! Thanks.
Replies: 3 · Boosts: 0 · Views: 1.9k · Feb ’24
Poor Performance of Sample code from "Counting human body action repetitions in a live video feed"
I downloaded the sample code from the WWDC 2022 session Counting human body action repetitions in a live video feed and ran it on my new iPhone SE (which has an A15 Bionic chip). Unfortunately, this sample project (whose action repetition counter was mentioned multiple times during WWDC) was extremely inconsistent in tracking reps. It rarely worked for me, which was disappointing because I was really excited about this functionality. I'd like to use this action repetition counting in an app of my own; it would be very useful if it worked, but I'm skeptical after struggling to get Apple's sample app to accurately count reps. Does anyone have any suggestions for getting this sample project, or action repetition counting in general, to work accurately? Any help would be really appreciated, thanks!
Replies: 2 · Boosts: 0 · Views: 2.3k · Jan ’23
Why is my ML model returning VNCoreMLFeatureValueObservation instead of VNRecognizedObjectObservation?
I'm training a machine learning model in PyTorch using YOLOv5 from Ultralytics. CoreMLTools from Apple is used to convert the PyTorch (.pt) model into a Core ML model (.mlmodel). This works fine, and I can use it in my iOS app, but I have to access the prediction output of the model manually. The output of the model is a MultiArray of shape 1 × 25500 × 46 (Float32). From the VNCoreMLRequest I receive only a VNCoreMLFeatureValueObservation; from this I can get the MultiArray and iterate through it to find the data I need. But I see that Apple offers the VNRecognizedObjectObservation type for object detection models, which is not returned for my model. What is the reason my model is not supported to return the VNRecognizedObjectObservation type? Can I use CoreMLTools to enable it?
Replies: 2 · Boosts: 0 · Views: 2.0k · Dec ’22
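For context on the question above: Vision produces VNRecognizedObjectObservation only when the Core ML model follows the object-detector convention (typically a pipeline whose outputs are per-box confidences and coordinates with non-maximum suppression built in); a plain converted YOLOv5 graph keeps its raw tensor output, hence the feature-value observation. Below is an illustrative decoding sketch for that raw output, assuming the common YOLOv5 row layout of box (x, y, w, h), objectness, then class scores; whether the coordinates are normalized or in model-input pixels depends on the export, so verify against the actual model.

import CoreML
import Vision

struct Detection {
    let box: CGRect          // center-size box converted to an origin-size rect
    let classIndex: Int
    let confidence: Float
}

func decode(_ observation: VNCoreMLFeatureValueObservation,
            confidenceThreshold: Float = 0.25) -> [Detection] {
    guard let array = observation.featureValue.multiArrayValue else { return [] }
    let rows = array.shape[1].intValue          // 25500 candidate boxes
    let columns = array.shape[2].intValue       // 4 box values + objectness + class scores
    var detections: [Detection] = []
    for row in 0..<rows {
        func value(_ column: Int) -> Float {
            array[[0, row as NSNumber, column as NSNumber]].floatValue
        }
        let objectness = value(4)
        guard objectness >= confidenceThreshold else { continue }
        // Pick the highest-scoring class for this candidate box.
        var bestClass = 0
        var bestScore: Float = 0
        for column in 5..<columns where value(column) > bestScore {
            bestScore = value(column)
            bestClass = column - 5
        }
        let (x, y, w, h) = (value(0), value(1), value(2), value(3))
        let box = CGRect(x: CGFloat(x - w / 2), y: CGFloat(y - h / 2),
                         width: CGFloat(w), height: CGFloat(h))
        detections.append(Detection(box: box, classIndex: bestClass,
                                    confidence: objectness * bestScore))
    }
    return detections   // non-maximum suppression still needs to run on these
}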
Drawing Vision barcodes
Hi, I have seen this video: https://developer.apple.com/videos/play/wwdc2021/10041/ and in my project I am trying to draw the detected barcodes. I am using the Vision framework and I have the barcode position in the boundingBox parameter, but I don't understand the CGRect of that parameter. I am programming in Objective-C and I haven't found resources; to complicate things further, I don't have an image, because I am capturing barcodes from a video camera session. Two parts: 1. How can I draw a detected barcode like in the video (from an image)? 2. How can I draw a detected barcode in a capture session? I have used VNImageRectForNormalizedRect to go from normalized to pixel coordinates, but the result is not correct. Thank you very much.
Replies: 2 · Boosts: 0 · Views: 1.2k · Nov ’21
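A sketch of the capture-session case described above (in Swift rather than Objective-C, and assuming the frames are displayed through an AVCaptureVideoPreviewLayer): Vision's boundingBox is normalized with its origin at the bottom-left, so the rect is flipped into metadata-output coordinates before the preview layer converts it into layer points.

import AVFoundation
import Vision
import UIKit

func layerRect(for observation: VNBarcodeObservation,
               in previewLayer: AVCaptureVideoPreviewLayer) -> CGRect {
    let box = observation.boundingBox
    // Flip vertically: Vision's origin is bottom-left, metadata-output's is top-left.
    let metadataRect = CGRect(x: box.origin.x,
                              y: 1 - box.origin.y - box.height,
                              width: box.width,
                              height: box.height)
    // The preview layer accounts for videoGravity, rotation, and mirroring.
    return previewLayer.layerRectConverted(fromMetadataOutputRect: metadataRect)
}

For the still-image case, VNImageRectForNormalizedRect gives pixel coordinates, but the same vertical flip is still needed when drawing in a top-left-origin context such as UIKit.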