Search results for [tags:wwdc20-10111]

67 results found

Post · Replies · Boosts · Views · Activity

Can iOS capture video at 4032×3024 while running a Vision/ML model?
I am new to Swift and iOS development, and I have a question about video capture performance. Is it possible to capture video at a resolution of 4032×3024 while simultaneously running a Vision/ML model on the video stream (e.g., using Vision or Core ML)? I want to know whether iOS devices support capturing video at that resolution, whether the frame rate drops significantly at that scale, and whether it is practical to run a Vision/ML model in real time while recording at such a high resolution. If anyone has experience with high-resolution AVCaptureSession setups, or with combining them with real-time ML processing, I would really appreciate guidance or sample code.
Replies: 1 · Boosts: 0 · Views: 159 · Dec ’25
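A minimal sketch of the kind of setup the question above describes, assuming a back wide-angle camera and an arbitrary lightweight Vision request (the class name and the choice of request are illustrative, not from any sample code): it selects the largest video format the device reports and performs a request per frame. Whether 4032×3024 appears among the supported video formats depends on the hardware; many devices cap video capture well below native photo resolution.

import AVFoundation
import Vision

final class HighResCaptureController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let output = AVCaptureVideoDataOutput()
    private let queue = DispatchQueue(label: "video.frames")

    func configure() throws {
        guard let device = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video, position: .back) else { return }
        session.beginConfiguration()
        let input = try AVCaptureDeviceInput(device: device)
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(output) { session.addOutput(output) }
        output.setSampleBufferDelegate(self, queue: queue)
        session.commitConfiguration()

        // Inspect every format the camera supports and keep the largest one.
        if let best = device.formats.max(by: { a, b in
            let da = CMVideoFormatDescriptionGetDimensions(a.formatDescription)
            let db = CMVideoFormatDescriptionGetDimensions(b.formatDescription)
            return da.width * da.height < db.width * db.height
        }) {
            try device.lockForConfiguration()
            device.activeFormat = best   // switches the session to input-priority mode
            device.unlockForConfiguration()
        }
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        // Any Vision request can run here; heavier models will cost frame rate.
        let request = VNDetectFaceRectanglesRequest()
        try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right).perform([request])
    }
}

Profiling with the actual model on the target device is the only reliable way to see how far the frame rate drops at a given resolution.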
Foundational Model - Image as Input? Timeline
Hi all, I am interested in unlocking unique applications with the new foundation models. I have a few questions regarding the availability of the following features. Image input: the June 2025 update mentions "image" 44 times (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates), however I can't seem to find any information about using images as the input/prompt for the foundation models. When will this be available? I understand that there are existing Vision ML APIs, but I want image input into a multimodal on-device LLM (VLM) instead, for image-understanding features like "Which player is holding the ball in the image?". Cloud foundation model: when will this be available? Thanks! Clement :)
Replies: 1 · Boosts: 0 · Views: 520 · Sep ’25
Vision Framework - Testing RecognizeDocumentsRequest
How do I test the new RecognizeDocumentsRequest API? Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM I am running the Xcode beta; however, I only have one primary device, and I cannot install beta software on it. Please suggest a strategy for testing. Will the simulator work? The new capability is critical to my application: it is just what I need for structuring document scans and extraction. Thank you.
Replies: 1 · Boosts: 0 · Views: 216 · Jun ’25
Filtering Contours from Vision
Hello, I need help selecting/filtering the contours on an image, and I'm not sure of the best way to do that. The idea is to select/filter for the bottom-left-most contour (see the attached image, please). I will also need the end points, or the court corners, and I need the contour to be a fine, smooth line, i.e., an accurate trace of the court end line and side lines only. Thank you :) I'm also glad to hear other ideas, or another API to determine the lines/corners I need, and happy to discuss by email if that is better/easier; actually I'd prefer that. Thanks.
Replies: 3 · Boosts: 0 · Views: 515 · Jan ’25
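A rough sketch of one way to approach the question above, assuming VNDetectContoursRequest is acceptable (the scoring heuristic and the contrast setting are assumptions, not a confirmed solution): it detects contours and keeps the top-level contour whose points reach closest to the bottom-left corner, in Vision's bottom-left-origin normalized coordinates.

import Vision
import CoreImage

func bottomLeftContour(in image: CIImage) throws -> VNContour? {
    let request = VNDetectContoursRequest()
    request.contrastAdjustment = 2.0      // tune for the court lines
    request.detectsDarkOnLight = true
    try VNImageRequestHandler(ciImage: image).perform([request])
    guard let observation = request.results?.first else { return nil }

    // Score each contour by how close its lowest-left extent is to (0, 0).
    func score(_ contour: VNContour) -> Float {
        let points = contour.normalizedPoints
        guard let minX = points.map({ $0.x }).min(),
              let minY = points.map({ $0.y }).min() else { return .greatestFiniteMagnitude }
        return minX * minX + minY * minY
    }
    return observation.topLevelContours.min { score($0) < score($1) }
}

The selected VNContour exposes normalizedPath and normalizedPoints; simplifying it, for example with polygonApproximation(epsilon:), is one way to recover corner points for the end line and side lines.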
VNCoreMLRequest Callback Not Triggered in Modified Video Classification App
Hi everyone, I'm working on integrating object recognition from live video feeds into my existing app by following Apple's sample code. My original project captures and records video successfully. However, after integrating the Vision-based object detection components (VNCoreMLRequest), no detections occur, and the callback for the request is never triggered. To debug this issue, I've added the following functionality: set up AVCaptureVideoDataOutput for processing video frames, and created a VNCoreMLRequest using my Core ML model. The video recording functionality works as expected, but no object detection happens. I'd like to know: how can I debug this further? Which key debug points or logs could help identify where the issue lies? Have I missed any key configurations? Below is a diff of the modifications I've made to my project for the new feature. Diff of Changes: (Attach the diff provided above) Specific Observations: The captureOutput method is invoked correctly, but there is no output or error from the Vision request…
Replies: 1 · Boosts: 0 · Views: 641 · Nov ’24
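The post's actual diff isn't shown here, so the following is only an illustrative sketch of the wiring that makes the callback fire (class and model names such as FrameClassifier and MyDetector are placeholders): the VNCoreMLRequest completion handler runs only when a VNImageRequestHandler performs the request, so perform(_:) has to be called for each frame inside captureOutput, and the delegate and queue have to be set on the AVCaptureVideoDataOutput.

import AVFoundation
import Vision
import CoreML

final class FrameClassifier: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private lazy var request: VNCoreMLRequest = {
        // "MyDetector" stands in for whatever Core ML model the project bundles.
        let model = try! VNCoreMLModel(for: MyDetector(configuration: MLModelConfiguration()).model)
        let request = VNCoreMLRequest(model: model) { request, error in
            if let error { print("Vision error: \(error)"); return }
            print("Observations: \(request.results ?? [])")
        }
        request.imageCropAndScaleOption = .scaleFill
        return request
    }()

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
        do {
            try handler.perform([self.request])   // without this call the callback never runs
        } catch {
            print("perform failed: \(error)")
        }
    }
}

If perform(_:) is already being called, the usual first checks are printing any error it throws and confirming that the video data output was actually added to the session.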
The Vision request does not work in the simulator with the error "Could not create inference context"
When I use VNGenerateForegroundInstanceMaskRequest from SwiftUI to generate the mask in the simulator, I get the error "Could not create inference context". I then added code to force Vision onto the CPU:

let request = VNGenerateForegroundInstanceMaskRequest()
let handler = VNImageRequestHandler(ciImage: inputImage)
#if targetEnvironment(simulator)
if #available(iOS 18.0, *) {
    // Pick the CPU compute device for the main stage of the request.
    let allDevices = MLComputeDevice.allComputeDevices
    for device in allDevices {
        if device.description.contains("MLCPUComputeDevice") {
            request.setComputeDevice(.some(device), for: .main)
            break
        }
    }
} else {
    // Fallback on earlier versions
    request.usesCPUOnly = true
}
#endif
do {
    try handler.perform([request])
    if let result = request.results?.first {
        let mask = try result.generateScaledMaskForImage(forInstances: result.allInstances, from: handler)
        return CIImage(cvPixelBuffer: mask)
    }
} catch {
    print(error)
}

Even though I force the simulator to run the request on the CPU, it still fails with the same error: Could not create inference context.
Replies: 2 · Boosts: 0 · Views: 1.1k · Sep ’24
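As an aside on the device-selection code above (an alternative pattern, not a confirmed fix for the simulator error): matching on the MLComputeDevice enum avoids comparing description strings.

import CoreML
import Vision

@available(iOS 17.0, *)
func forceCPU(for request: VNRequest) {
    // Walk the available compute devices and pick the CPU one for the main stage.
    for device in MLComputeDevice.allComputeDevices {
        if case .cpu = device {
            request.setComputeDevice(device, for: .main)
            break
        }
    }
}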
Can you match a new photo with existing images?
I'm looking for a solution to take a picture, or point the camera at a piece of clothing, and match that image against an image the user has stored in my app. I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures they store in the database, I think I cannot use pre-trained Core ML models. I would like the matching to be done on device if possible, instead of going to an external service; such a service would probably describe the item based on what the AI sees, but then I cannot match the item with the stored images in the app. Does anyone know if this is possible with frameworks such as Vision or VisionKit?
Replies: 2 · Boosts: 0 · Views: 1.2k · Jul ’24
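One possible on-device approach to the question above is Vision's image feature prints; whether the distances are discriminative enough for clothing photos is an assumption that would need testing. A minimal sketch, where a smaller distance means the stored photo looks more similar to the newly captured one:

import Vision
import CoreImage

func featurePrint(for image: CIImage) throws -> VNFeaturePrintObservation? {
    let request = VNGenerateImageFeaturePrintRequest()
    try VNImageRequestHandler(ciImage: image).perform([request])
    return request.results?.first
}

func distance(between a: CIImage, and b: CIImage) throws -> Float? {
    guard let printA = try featurePrint(for: a),
          let printB = try featurePrint(for: b) else { return nil }
    var distance: Float = 0
    try printA.computeDistance(&distance, to: printB)
    return distance
}

In practice the feature print of each stored image can be computed once and cached alongside the Binary Data object, so only the newly captured photo needs a Vision pass at match time.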
Swift Student Challenge Vision
Hi Developers, I want to create a Vision app in Swift Playgrounds on iPad. However, Vision does not function properly in Swift Playgrounds on iPad or in Xcode Playgrounds; the Vision code only works in a normal Xcode project. So can I submit my Swift Student Challenge 2024 application as a normal Xcode project rather than as an Xcode Playground or Swift Playgrounds file? Thanks :)
Replies: 7 · Boosts: 0 · Views: 1.8k · Feb ’24
Vision Pro & Vision SDK
I'm exploring my Vision Pro and finding it unclear whether I can even achieve things like body pose detection, etc. https://developer.apple.com/videos/play/wwdc2023/111241/ It's clear that I can apply it to images I provide myself, but what about the data coming from the visionOS SDKs? All I can find is this mesh data from ARKit, https://developer.apple.com/documentation/arkit/arkit_in_visionos. Am I missing something, or do we not yet have good APIs for this? Appreciate any guidance! Thanks.
Replies: 3 · Boosts: 0 · Views: 1.9k · Feb ’24
Poor Performance of Sample code from "Counting human body action repetitions in a live video feed"
I downloaded the sample code from the WWDC 2022 session Counting human body action repetitions in a live video feed and ran it on my new iPhone SE (which has an A15 Bionic chip). Unfortunately, this sample project (whose action repetition counter was mentioned multiple times during WWDC) was extremely inconsistent in tracking reps. It rarely worked for me, which was disappointing because I was really excited about this functionality. I'd like to use this action repetition counting in an app of my own; it would be very useful if it worked, but I'm skeptical after struggling to get Apple's sample app to accurately count reps. Does anyone have any suggestions for getting this sample project, or action repetition counting in general, to work accurately? Any help would be really appreciated, thanks!
Replies: 2 · Boosts: 0 · Views: 2.3k · Jan ’23
Why is my ML model returning VNCoreMLFeatureValueObservation instead of VNRecognizedObjectObservation?
I'm training a machine learning model in PyTorch using YOLOv5 from Ultralytics. CoreMLTools from Apple is used to convert the PyTorch (.pt) model into a Core ML model (.mlmodel). This works fine, and I can use it in my iOS app, but I have to access the prediction output of the model manually. The output of the model is a MultiArray of shape 1 × 25500 × 46 (Float32). From the VNCoreMLRequest I receive only a VNCoreMLFeatureValueObservation; from this I can get the MultiArray and iterate through it to find the data I need. But I see that Apple offers the VNRecognizedObjectObservation type for object detection models, which is not returned for my model. What is the reason my model is not supported to return the VNRecognizedObjectObservation type? Can I use CoreMLTools to enable it?
Replies: 2 · Boosts: 0 · Views: 2.0k · Dec ’22
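For context on the question above: Vision produces VNRecognizedObjectObservation only when the Core ML model follows the object-detector convention (typically a pipeline whose outputs are per-box confidences and coordinates with non-maximum suppression built in); a plain converted YOLOv5 graph keeps its raw tensor output, hence the feature-value observation. Below is an illustrative decoding sketch for that raw output, assuming the common YOLOv5 row layout of box (x, y, w, h), objectness, then class scores; whether the coordinates are normalized or in model-input pixels depends on the export, so verify against the actual model.

import CoreML
import Vision

struct Detection {
    let box: CGRect          // center-size box converted to an origin-size rect
    let classIndex: Int
    let confidence: Float
}

func decode(_ observation: VNCoreMLFeatureValueObservation,
            confidenceThreshold: Float = 0.25) -> [Detection] {
    guard let array = observation.featureValue.multiArrayValue else { return [] }
    let rows = array.shape[1].intValue          // 25500 candidate boxes
    let columns = array.shape[2].intValue       // 4 box values + objectness + class scores
    var detections: [Detection] = []
    for row in 0..<rows {
        func value(_ column: Int) -> Float {
            array[[0, row as NSNumber, column as NSNumber]].floatValue
        }
        let objectness = value(4)
        guard objectness >= confidenceThreshold else { continue }
        // Pick the highest-scoring class for this candidate box.
        var bestClass = 0
        var bestScore: Float = 0
        for column in 5..<columns where value(column) > bestScore {
            bestScore = value(column)
            bestClass = column - 5
        }
        let (x, y, w, h) = (value(0), value(1), value(2), value(3))
        let box = CGRect(x: CGFloat(x - w / 2), y: CGFloat(y - h / 2),
                         width: CGFloat(w), height: CGFloat(h))
        detections.append(Detection(box: box, classIndex: bestClass,
                                    confidence: objectness * bestScore))
    }
    return detections   // non-maximum suppression still needs to run on these
}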
Drawing Vision barcodes
Hi, I have seen this video: https://developer.apple.com/videos/play/wwdc2021/10041/ and in my project I am trying to draw the detected barcodes. I am using the Vision framework and I have the barcode position in the boundingBox parameter, but I don't understand the CGRect of that parameter. I am programming in Objective-C and I haven't found resources; to complicate things further, I don't have an image, because I am capturing barcodes from a video camera session. Two parts: 1. How can I draw a detected barcode like in the video (from an image)? 2. How can I draw a detected barcode in a capture session? I have used VNImageRectForNormalizedRect to go from normalized to pixel coordinates, but the result is not correct. Thank you very much.
Replies: 2 · Boosts: 0 · Views: 1.2k · Nov ’21
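A sketch of the capture-session case described above (in Swift rather than Objective-C, and assuming the frames are displayed through an AVCaptureVideoPreviewLayer): Vision's boundingBox is normalized with its origin at the bottom-left, so the rect is flipped into metadata-output coordinates before the preview layer converts it into layer points.

import AVFoundation
import Vision
import UIKit

func layerRect(for observation: VNBarcodeObservation,
               in previewLayer: AVCaptureVideoPreviewLayer) -> CGRect {
    let box = observation.boundingBox
    // Flip vertically: Vision's origin is bottom-left, metadata-output's is top-left.
    let metadataRect = CGRect(x: box.origin.x,
                              y: 1 - box.origin.y - box.height,
                              width: box.width,
                              height: box.height)
    // The preview layer accounts for videoGravity, rotation, and mirroring.
    return previewLayer.layerRectConverted(fromMetadataOutputRect: metadataRect)
}

For the still-image case, VNImageRectForNormalizedRect gives pixel coordinates, but the same vertical flip is still needed when drawing in a top-left-origin context such as UIKit.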