Hello,
I've been dealing with a puzzling issue for some time now, and I’m hoping someone here might have insights or suggestions.
The Problem:
We’re observing an occasional crash in our app that seems to originate from the Vision framework.
Frequency: It happens randomly after many successful executions of the same code. It's hard to tell how long the app had been running before a crash; in some cases the app ran for about a month without any issues.
Devices: The issue doesn't seem device-dependent (we’ve seen it on various iPad models).
OS Versions: The crashes started occurring with iOS 18.0.1 and are still present in 18.1 and 18.1.1.
What I suspected: The crash logs point to a potential data race within the Vision framework.
The relevant section of the code where the crash happens:
guard let cgImage = image.cgImage else {
    throw ...
}

let request = VNCoreMLRequest(model: visionModel)
try VNImageRequestHandler(cgImage: cgImage).perform([request]) // <- the line causing the crash
Since the code is rather simple, I'm not sure what could be missing here.
The images sent here are uniform (fixed size).
The model is loaded and working; the crash occurs at random after a period in which the call has completed correctly many times. Also, the model variable is not an optional.
Here is the crash log:
libobjc.A objc_exception_throw
CoreFoundation -[NSMutableArray removeObjectsAtIndexes:]
Vision -[VNWeakTypeWrapperCollection _enumerateObjectsDroppingWeakZeroedObjects:usingBlock:]
Vision -[VNWeakTypeWrapperCollection addObject:droppingWeakZeroedObjects:]
Vision -[VNSession initWithCachingBehavior:]
Vision -[VNCoreMLTransformer initWithOptions:model:error:]
Vision -[VNCoreMLRequest internalPerformRevision:inContext:error:]
Vision -[VNRequest performInContext:error:]
Vision -[VNRequestPerformer _performOrderedRequests:inContext:error:]
Vision -[VNRequestPerformer _performRequests:onBehalfOfRequest:inContext:error:]
Vision -[VNImageRequestHandler performRequests:gatheredForensics:error:]
OurApp ModelWrapper.perform
I'm a bit lost at this point; I've tried everything I could imagine so far.
I've tried putting a symbolic breakpoint on removeObjectsAtIndexes: to check whether some library we use (e.g. a crash reporter) had swapped the implementation. There was none, and if anything had done method swizzling, I'd expect it to show in the stack trace before the original code was called. I also peeked into the preceding functions and noticed a lock used in one of the Vision methods, so as I understand it a data race in this code shouldn't be possible at all. I've also put breakpoints on the NSLock variants to check for swizzling or a category override that might break the locking - again, nothing was there.
There is also another model running on a separate queue, but after seeing the locking in the debugger, it doesn't seem like that could cause a problem, at least not in this specific spot.
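As a defensive experiment, I'm considering funneling all Vision work from both models through a single serial queue. Here is a minimal sketch of that idea (not a confirmed fix; the wrapper name and queue label are made up):

import Vision

// Sketch of a possible workaround: route every Vision request from both models
// through one serial queue so that no two perform() calls can overlap.
enum VisionSerializer {
    private static let queue = DispatchQueue(label: "com.example.vision.serial")

    static func perform(_ requests: [VNRequest], on cgImage: CGImage) throws {
        try queue.sync {
            try VNImageRequestHandler(cgImage: cgImage).perform(requests)
        }
    }
}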
Is there something I'm missing here, or something I'm doing wrong?
Thanks in advance for your help!
Vision
Apply computer vision algorithms to perform a variety of tasks on input images and video using Vision.
Posts under Vision tag
Before you post "the camera doesn't work on the Simulator": that's no longer true. I've built a solution that makes the Simulator believe an actual hardware device is connected, allowing users to stream the macOS camera to the iOS Simulator (for more info, see RocketSim's documentation: https://docs.rocketsim.app/features/hzQMSrSga7BGWvxdNVdwYs/simulator-camera-support/58tQ5jvevLNSnyUEA7VgAv)
This works for VNDocumentCameraViewController, but when I try opening DataScannerViewController, I immediately run into:
Failed to start scanning: The operation couldn’t be completed. (VisionKit.DataScannerViewController.ScanningUnavailable error 0.)
My question:
How does this view controller determine whether scanning is available?
Is there a certain capability that the available AVCaptureDevices need to support?
Any direction would be helpful so I can make this work for developers and help them build apps faster!
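A minimal availability check along these lines reports the flags, but it doesn't explain how they're derived:

import VisionKit

// Minimal availability check before presenting DataScannerViewController.
// isSupported reflects hardware/OS support; isAvailable additionally accounts
// for runtime restrictions such as camera access.
@MainActor
func canPresentDataScanner() -> Bool {
    let supported = DataScannerViewController.isSupported
    let available = DataScannerViewController.isAvailable
    print("DataScanner isSupported: \(supported), isAvailable: \(available)")
    return supported && available
}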
Issue:
In iOS 26 (tested on Developer Beta), AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when using .face detection.
metadataOutput.metadataObjectTypes = [.face]
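A minimal setup that should reproduce this looks roughly like the following (session and preview wiring trimmed; names are illustrative):

import AVFoundation

// Illustrative face-metadata setup; permission handling and preview wiring omitted.
final class FaceMetadataReceiver: NSObject, AVCaptureMetadataOutputObjectsDelegate {
    let session = AVCaptureSession()
    private let metadataOutput = AVCaptureMetadataOutput()

    func configure(with device: AVCaptureDevice) throws {
        session.beginConfiguration()
        let input = try AVCaptureDeviceInput(device: device)
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(metadataOutput) { session.addOutput(metadataOutput) }
        metadataOutput.setMetadataObjectsDelegate(self, queue: .main)
        // .face must be assigned after the output has been added to the session.
        metadataOutput.metadataObjectTypes = [.face]
        session.commitConfiguration()
    }

    func metadataOutput(_ output: AVCaptureMetadataOutput,
                        didOutput metadataObjects: [AVMetadataObject],
                        from connection: AVCaptureConnection) {
        // Expected to fire for each detected face; reportedly silent on the iOS 26 beta.
        print("Received \(metadataObjects.count) metadata object(s)")
    }
}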
Hi, I'm developing an application for macOS and iOS that has to run DetectHumanBodyPose3DRequest in real time to retrieve the 3D skeleton from the camera.
I'm experiencing a memory leak every time the request is performed (when I comment out that line, memory stays constant). After a minute it uses about 1 GB of RAM running with Mac Catalyst.
I've attached a minimal project that has this problem.
Code
Camera View
import SwiftUI
import Combine
import Vision
struct CameraView: View {
    @StateObject private var viewModel = CameraViewModel()

    var body: some View {
        HStack {
            ZStack {
                GeometryReader { geometry in
                    if let image = viewModel.currentFrame {
                        Image(decorative: image, scale: 1)
                            .resizable()
                            .scaledToFill()
                            .frame(width: geometry.size.width,
                                   height: geometry.size.height)
                            .clipped()
                    } else {
                        ProgressView()
                    }
                }
            }
        }
    }
}
class CameraViewModel: ObservableObject {
    @Published var currentFrame: CGImage?
    @Published var frameRate: Double = 0
    @Published var currentVisionBodyPose: HumanBodyPose3DObservation? // Store current body pose
    @Published var currentImageSize: CGSize? // Store current image size

    private var cameraManager: CameraManager?
    private var humanBodyPose = HumanBodyPose3DDetector()
    private var lastClassificationTime = Date()
    private var frameCount = 0
    private var lastFrameTime = Date()
    private let classificationThrottleInterval: TimeInterval = 1.0
    private var lastPoseSendTime: Date = .distantPast

    init() {
        cameraManager = CameraManager()
        startPreview()
        startClassification()
    }

    private func startPreview() {
        Task {
            guard let previewStream = cameraManager?.previewStream else { return }
            for await frame in previewStream {
                let size = CGSize(width: frame.width, height: frame.height)
                Task { @MainActor in
                    self.currentFrame = frame
                    self.currentImageSize = size
                    self.updateFrameRate()
                }
            }
        }
    }

    private func startClassification() {
        Task {
            guard let classificationStream = cameraManager?.classificationStream else { return }
            for await pixelBuffer in classificationStream {
                self.classifyFrame(pixelBuffer: pixelBuffer)
            }
        }
    }

    private func classifyFrame(pixelBuffer: CVPixelBuffer) {
        humanBodyPose.runHumanBodyPose3DRequestOnImage(pixelBuffer: pixelBuffer) { [weak self] observation in
            guard let self = self else { return }
            DispatchQueue.main.async {
                if let observation = observation {
                    self.currentVisionBodyPose = observation
                    print(observation)
                } else {
                    self.currentVisionBodyPose = nil
                }
            }
        }
    }

    private func updateFrameRate() {
        frameCount += 1
        let now = Date()
        let elapsed = now.timeIntervalSince(lastFrameTime)
        if elapsed >= 1.0 {
            frameRate = Double(frameCount) / elapsed
            frameCount = 0
            lastFrameTime = now
        }
    }
}
HumanBodyPose3DDetector
import Foundation
import Vision
class HumanBodyPose3DDetector: NSObject, ObservableObject {
    @Published var humanObservation: HumanBodyPose3DObservation? = nil

    private let queue = DispatchQueue(label: "humanbodypose.queue")
    private let request = DetectHumanBodyPose3DRequest()

    private struct SendablePixelBuffer: @unchecked Sendable {
        let buffer: CVPixelBuffer
    }

    public func runHumanBodyPose3DRequestOnImage(pixelBuffer: CVPixelBuffer, completion: @escaping (HumanBodyPose3DObservation?) -> Void) {
        let sendableBuffer = SendablePixelBuffer(buffer: pixelBuffer)
        queue.async { [weak self] in
            Task { [weak self, sendableBuffer] in
                do {
                    guard let self = self else { return }
                    let result = try await self.request.perform(on: sendableBuffer.buffer)
                    // process result
                    DispatchQueue.main.async {
                        if result.isEmpty {
                            completion(nil)
                        } else {
                            completion(result[0])
                        }
                    }
                } catch {
                    DispatchQueue.main.async {
                        completion(nil)
                    }
                }
            }
        }
    }
}
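One variation I'm experimenting with is bounding the number of in-flight requests, in case frames piling up behind the inner Task explain the growth. A minimal sketch (not a confirmed fix; the type name is made up, and strict-concurrency checking of the CVPixelBuffer parameter is glossed over):

import Vision
import CoreVideo

// Sketch: drop incoming frames while a previous DetectHumanBodyPose3DRequest
// is still running, so requests cannot queue up faster than they complete.
actor ThrottledBodyPoseDetector {
    private let request = DetectHumanBodyPose3DRequest()
    private var isProcessing = false

    func detect(in pixelBuffer: CVPixelBuffer) async throws -> HumanBodyPose3DObservation? {
        guard !isProcessing else { return nil }   // drop this frame
        isProcessing = true
        defer { isProcessing = false }
        let results = try await request.perform(on: pixelBuffer)
        return results.first
    }
}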
Apple Vision Pro support issue. The app contains the following UIRequiredDeviceCapabilities values, which aren’t supported in visionOS: [arkit].
When I try to distribute a build to the App Store in Xcode, it comes up with this message. I have not selected Vision Pro as part of my build; can anyone please help? Thanks!
Topic: App Store Distribution & Marketing
SubTopic: App Store Connect
Tags: Vision, visionOS, iPad and iOS apps on visionOS
How do I test the new RecognizeDocumentRequest API? Reference: https://www.youtube.com/watch?v=H-GCNsXdKzM
I am running the Xcode beta; however, I only have one primary device, and I can't install beta software on it.
Please provide a strategy for testing. Will simulator work?
The new capability is critical to my application, just what I need for structuring document scans and extraction.
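For reference, what I'd like to run is roughly the following, assuming RecognizeDocumentRequest follows the same async perform(on:) pattern as the other new Vision requests (I haven't been able to run it yet, so result handling is deliberately generic):

import Vision

// Minimal sketch; the exact result type and its properties may differ in the shipping SDK.
func recognizeDocument(in cgImage: CGImage) async throws {
    let request = RecognizeDocumentRequest()
    let results = try await request.perform(on: cgImage)
    print(results)
}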
Thank you.
Hi,
I've been modifying the Camera sample app found here: https://developer.apple.com/tutorials/sample-apps/capturingphotos-camerapreview ... In processPreviewImages, I call into the Vision APIs to detect a person or object, then use the segmentation mask to extract the person and composite them onto a different background with some other filters. I use Core Image to filter the CIImages, then convert and display the result as a SwiftUI Image. When running on my iPhone, it works fine. When running on my iPhone with the debugger, it crashes within a few seconds. Attached is a screenshot. At the top is an EXC_BAD_ACCESS in libRPAC.dylib`std::__1::__hash_table<std::__1::__hash_value_type<long, qos_info_t>, std::__1::__unordered_map_hasher<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::hash, std::__1::equal_to, true>, std::__1::__unordered_map_equal<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::equal_to, std::__1::hash, true>, std::__1::allocator<std::__1::__hash_value_type<long, qos_info_t>>>::__emplace_unique_key_args<long, std::__1::piecewise_construct_t const&, std::__1::tuple<long const&>, std::__1::tuple<>>:
This was working fine a couple of days ago. Not sure why it's popping up now. Am I correct in interpreting this as an LLDB issue? How do I fix it?
Hi everyone,
I'm trying to use VNDetectTextRectanglesRequest to detect text rectangles in an image. Here's my current code:
guard let cgImage = image.cgImage(forProposedRect: nil, context: nil, hints: nil) else {
    return
}

let textDetectionRequest = VNDetectTextRectanglesRequest { request, error in
    if let error = error {
        print("Text detection error: \(error)")
        return
    }
    guard let observations = request.results as? [VNTextObservation] else {
        print("No text rectangles detected.")
        return
    }
    print("Detected \(observations.count) text rectangles.")
    for observation in observations {
        print(observation.boundingBox)
    }
}
textDetectionRequest.revision = VNDetectTextRectanglesRequestRevision1
textDetectionRequest.reportCharacterBoxes = true

let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
do {
    try handler.perform([textDetectionRequest])
} catch {
    print("Vision request error: \(error)")
}
The request completes without error, but no text rectangles are detected — the observations array is empty (count = 0). Here's a sample image I'm testing with:
I expected VNTextObservation results, but I'm not getting any. Is there something I'm missing in how this API works? Or could it be a limitation of this request or revision?
Thanks for any help!
I encountered some issues while developing a Vision Pro program using Unity. After binding an ARAnchor to a game object, I overlapped the virtual game object with a real-world cup. However, when I moved around with the Vision Pro on, the virtual game object shifted, causing the real-world cup and the virtual object to no longer coincide. Is there a way to solve this?
Hi,
I am modifying the sample camera app that is here: https://developer.apple.com/tutorials/sample-apps/capturingphotos-camerapreview ... In the processPreviewImages, I am using the Vision APIs to generate a segmentation mask for a person/object, then compositing that person onto a different background (with some other filtering). The filtering and compositing is done via CoreImage. At the end, I convert the CIImage to a CGImage then to a SwiftUI Image. When I run it on my iPhone, it works fine, and has not crashed. When I run it on the iPhone with the debugger, it crashes within a few seconds with:
EXC_BAD_ACCESS in libRPAC.dylib`std::__1::__hash_table<std::__1::__hash_value_type<long, qos_info_t>, std::__1::__unordered_map_hasher<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::hash, std::__1::equal_to, true>, std::__1::__unordered_map_equal<long, std::__1::__hash_value_type<long, qos_info_t>, std::__1::equal_to, std::__1::hash, true>, std::__1::allocator<std::__1::__hash_value_type<long, qos_info_t>>>::__emplace_unique_key_args<long, std::__1::piecewise_construct_t const&, std::__1::tuple<long const&>, std::__1::tuple<>>:
It had previously been working fine with the debugger, so I'm not sure what has changed. Is there a difference in how the Vision APIs are executed if the debugger is attached vs. not?
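For context, the per-frame Vision + Core Image step is roughly along these lines (a sketch, not the exact project code; names are illustrative):

import Vision
import CoreVideo
import CoreImage
import CoreImage.CIFilterBuiltins

// Sketch of the described pipeline: generate a person segmentation mask with
// Vision, then composite the person onto a new background with Core Image.
func compositePerson(from pixelBuffer: CVPixelBuffer, onto background: CIImage) throws -> CIImage? {
    let request = VNGeneratePersonSegmentationRequest()
    request.qualityLevel = .balanced
    request.outputPixelFormat = kCVPixelFormatType_OneComponent8

    try VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
    guard let maskBuffer = request.results?.first?.pixelBuffer else { return nil }

    let source = CIImage(cvPixelBuffer: pixelBuffer)
    var mask = CIImage(cvPixelBuffer: maskBuffer)
    // The mask is usually smaller than the source; scale it up to match.
    mask = mask.transformed(by: CGAffineTransform(
        scaleX: source.extent.width / mask.extent.width,
        y: source.extent.height / mask.extent.height))

    let blend = CIFilter.blendWithMask()
    blend.inputImage = source
    blend.backgroundImage = background
    blend.maskImage = mask
    return blend.outputImage
}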
Hi,
One can configure the languages of a (VN)RecognizeTextRequest with either:
.automatic: language to be detected
a specific language, say Spanish
If the request is configured with .automatic and successfully detects Spanish, will the results be exactly equivalent to those of a request with Spanish explicitly set as the language?
I could not find any information about this, and this is very important for the core architecture of my app.
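Concretely, these are the two configurations being compared, sketched with the VN-prefixed API (the "es-ES" identifier is an assumption; check supportedRecognitionLanguages() for the exact value):

import Vision

// The two configurations under comparison.
func makeTextRequests() -> (automatic: VNRecognizeTextRequest, spanish: VNRecognizeTextRequest) {
    // Configuration A: let Vision detect the language automatically.
    let automaticRequest = VNRecognizeTextRequest()
    automaticRequest.recognitionLevel = .accurate
    automaticRequest.automaticallyDetectsLanguage = true

    // Configuration B: pin recognition to Spanish explicitly.
    let spanishRequest = VNRecognizeTextRequest()
    spanishRequest.recognitionLevel = .accurate
    spanishRequest.automaticallyDetectsLanguage = false
    spanishRequest.recognitionLanguages = ["es-ES"]

    return (automaticRequest, spanishRequest)
}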
Thanks!
Hi everyone,
I'm using the Vision framework’s ImageAestheticsScoresObservation class (https://developer.apple.com/documentation/vision/imageaestheticsscoresobservation).
I noticed that the returned overallScore is sometimes negative. Could someone confirm whether the expected range of the score is -1.0 to 1.0?
The documentation doesn’t explicitly state the possible score range, so I’d appreciate any clarification or insights.
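For context, the score comes from code along these lines (a sketch that assumes CalculateImageAestheticsScoresRequest returns a single observation; adjust if the shipping API returns a collection):

import Foundation
import Vision

// Minimal sketch of how the score is obtained.
func printAestheticsScore(for imageURL: URL) async throws {
    let request = CalculateImageAestheticsScoresRequest()
    let observation = try await request.perform(on: imageURL)
    print("overallScore:", observation.overallScore)   // occasionally negative in my tests
}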
Thanks in advance!
Hello,
I am experimenting with Unity to develop a mixed reality (MR) application for visionOS. I would like to understand the best approach for structuring my project:
Should I build the entire experience in Unity (both Windows and Volumes)?
Or is it better to create only certain elements (e.g., Volumes) in Unity while managing Windows separately in Xcode?
Also, how well do interactions (e.g. pinch, grab…) created in Unity integrate with Xcode?
If I use the PolySpatial plugin, does that allow me to manage all interactions entirely within Unity, or would I still need to handle/integrate part of it in Xcode?
What's worked best for you? Please let me know if you have any recommendations. Thanks!
Topic: Spatial Computing
SubTopic: General
Tags: Vision, Reality Composer Pro, visionOS, iPad and iOS apps on visionOS
Hi!
I attempted to run a sample project for detecting human pose in photos, which can be found here:
https://developer.apple.com/documentation/vision/detecting-human-body-poses-in-3d-with-vision
The project works perfectly when run on my MacBook Pro M1, but it fails on Apple Vision Pro. After selecting the photo, an endless loading screen is presented and the following output is produced in the console:
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Network path is nil: (null)
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Unable to perform the request: Error Domain=com.apple.Vision Code=9 "Async status object reported as failed but without an error" UserInfo={NSLocalizedDescription=Async status object reported as failed but without an error}.
de-activating session 70138 after timeout
It seems that VNDetectHumanBodyPose3DRequest is failing on Vision Pro for some reason. Are there any additional requirements for running the Vision framework on visionOS that I might be missing?
In the environments we have real-time reflections of movies on a screen, or reflections of the surrounding Mount Hood landscape in the background...
Could I get a metallic surface showing accurate reflections of a box placed on top of it?
I don't mean using a probe or an HDR cubemap; I mean the same kind of accurate reflections the Mount Hood water shows of the movie I'm watching in another app.
Hi!
I attempted running a sample project for detecting human body poses in 3D with the Vision framework, which can be found here: https://developer.apple.com/documentation/vision/detecting-human-body-poses-in-3d-with-vision.
It works perfectly on my MacBook Pro M1, but fails on Apple Vision Pro. After selecting a photo, an endless loading screen is displayed and the following message is produced in the console:
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Network path is nil: (null)
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Unable to perform the request: Error Domain=com.apple.Vision Code=9 "Async status object reported as failed but without an error" UserInfo={NSLocalizedDescription=Async status object reported as failed but without an error}.
de-activating session 70138 after timeout
Is human pose detection expected to work on visionOS? Is there any special configuration required that I might be missing?
The goal is to achieve precise joint tracking for clinical assessment. The doctor is wearing the AVP and observing the patient's movement.
Do you have any recommended best practices for integrating real-time joint tracking and displaying them on the patient within visionOS?
We attempted to use VNHumanBodyPose3DObservation, which theoretically should work, but we are unable to display the detected joints in an Immersive Space for real-time validation. This makes it difficult for the doctor to ensure accurate tracking. If possible, a photo or video of the range-of-motion assessment would also be needed for the patient record.
Are there alternative methods to achieve precise real-time joint tracking without requiring main camera access (com.apple.developer.arkit.main-camera-access.allow)?
Our app is downloading a zip of an .mlpackage file, which is then compiled into an .mlmodelc file using MLModel.compileModel(at:). This model is then run using a VNCoreMLRequest.
Two users - and this is after a very small rollout - are reporting issues running the VNCoreMLRequest. The error message from their logs:
Error Domain=com.apple.CoreML Code=0 "Failed to build the model execution plan using a model architecture file '/private/var/mobile/Containers/Data/Application/F93077A5-5508-4970-92A6-03A835E3291D/Documents/SKDownload/Identify-image-iOS/mobile_img_eu_v210.mlmodelc/model.mil' with error code: -5."
The URL there is to a file inside the compiled model. The error is happening when the perform function of VNImageRequestHandler is run. (i.e. the model compiled without an error.)
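For reference, the download-compile-load flow is roughly the following (illustrative names; error handling and cleanup trimmed):

import CoreML
import Vision

// Sketch of the flow: compile the downloaded .mlpackage, load it, wrap it for Vision.
func loadVisionModel(from downloadedPackageURL: URL) throws -> VNCoreMLModel {
    let compiledURL = try MLModel.compileModel(at: downloadedPackageURL)   // produces the .mlmodelc
    let mlModel = try MLModel(contentsOf: compiledURL)
    return try VNCoreMLModel(for: mlModel)
}

func classify(cgImage: CGImage, with model: VNCoreMLModel) throws {
    let request = VNCoreMLRequest(model: model)
    // This perform call is where the "Failed to build the model execution plan" error surfaces.
    try VNImageRequestHandler(cgImage: cgImage).perform([request])
}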
Has anyone else seen this issue? It's only picked up in a few web results, and none of them are directly relevant or have a fix.
I know that a Core ML error Code=0 is a generic error, but does anyone know what error code -5 is? I'm not even sure which framework it's coming from.
I’m working on a Vision Pro app using Metal and need to implement multi-pass rendering. Specifically, I want to render intermediate results to a texture, then use that texture in a second pass for post-processing before presenting the final output.
What’s the best approach in visionOS? Should I use multiple render passes in a single command buffer or separate command buffers? Any insights on efficiently handling this in RealityKit or Metal?
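To make the question concrete, this is the kind of structure I have in mind, sketched with plain Metal (pipeline states and the final drawable are assumed to exist; CompositorServices/RealityKit specifics are omitted):

import Metal

// Sketch: two render passes in a single command buffer. Pass 1 renders the scene
// into an offscreen texture; pass 2 samples that texture for post-processing.
func encodeTwoPasses(device: MTLDevice,
                     queue: MTLCommandQueue,
                     scenePipeline: MTLRenderPipelineState,
                     postPipeline: MTLRenderPipelineState,
                     finalTexture: MTLTexture,
                     width: Int, height: Int) {
    // Offscreen target that pass 1 writes and pass 2 reads.
    let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .bgra8Unorm,
                                                        width: width, height: height,
                                                        mipmapped: false)
    desc.usage = [.renderTarget, .shaderRead]
    guard let offscreen = device.makeTexture(descriptor: desc),
          let commandBuffer = queue.makeCommandBuffer() else { return }

    // Pass 1: render the scene into the offscreen texture.
    let pass1 = MTLRenderPassDescriptor()
    pass1.colorAttachments[0].texture = offscreen
    pass1.colorAttachments[0].loadAction = .clear
    pass1.colorAttachments[0].storeAction = .store
    if let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass1) {
        encoder.setRenderPipelineState(scenePipeline)
        // ... encode scene draw calls ...
        encoder.endEncoding()
    }

    // Pass 2: post-process into the final target, sampling the offscreen texture.
    let pass2 = MTLRenderPassDescriptor()
    pass2.colorAttachments[0].texture = finalTexture
    pass2.colorAttachments[0].loadAction = .dontCare
    pass2.colorAttachments[0].storeAction = .store
    if let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass2) {
        encoder.setRenderPipelineState(postPipeline)
        encoder.setFragmentTexture(offscreen, index: 0)
        // ... encode a fullscreen triangle that applies the post effect ...
        encoder.endEncoding()
    }

    commandBuffer.commit()
}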
Thanks!
I keep getting an error:
I have tried a picker for Files and the Photo Library; both give the same result.
I'm resizing the input to 360x360 while debugging, but I'm still facing this error.
The model I'm trying to implement was created with CreateMLComponents.
The process follows the WWDC 2022 banana-ripeness example; I have used an index for each .jpg.
Prediction Failed: The VNCoreMLTransform request failed
Is there a possible way to solve this, or is the error somewhere in the training of the model?
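In case it helps to narrow this down, here is a minimal sketch of how such a model can be driven through Vision with the crop-and-scale behavior made explicit, so the 360x360 input expectation is handled by Vision rather than by manual resizing ("RipenessClassifier" is a hypothetical generated model class, and the classifier output type is an assumption):

import CoreML
import Vision

// Hypothetical sketch: run a Core ML classifier through Vision with an explicit
// imageCropAndScaleOption instead of resizing the image by hand.
func classify(cgImage: CGImage) throws {
    let coreMLModel = try RipenessClassifier(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, error in
        guard let results = request.results as? [VNClassificationObservation] else { return }
        print(results.prefix(3).map { "\($0.identifier): \($0.confidence)" })
    }
    request.imageCropAndScaleOption = .centerCrop   // or .scaleFill / .scaleFit

    try VNImageRequestHandler(cgImage: cgImage).perform([request])
}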