Hi there,
I received an enterprise license file that includes the enhanced object tracking configuration for the Vision Pro. My account is part of the team that was granted this capability by Apple. Unfortunately, although I followed the guide, I cannot find the Object Tracking capability when I try to add it to my project. Other capabilities, like Main Camera on the Vision Pro, are listed, but not Object Tracking. I am using Xcode 26.1 and visionOS 26.1. What am I missing here?
Thanks in advance,
Matthias
ARKit
Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Posts under ARKit tag
Environment
visionOS 26.1, Xcode 26.1.1
Problem
When a WindowGroup opens an ImmersiveSpace and the user closes the window via X button, the async Task in .onDisappear gets cancelled before dismissImmersiveSpace() completes, leaving the ImmersiveSpace active with no way to exit.
Steps
WindowGroup opens ImmersiveSpace in .onAppear
User clicks X to close window
.onDisappear fires but async cleanup cancelled
ImmersiveSpace remains active, user trapped
Expected
ImmersiveSpace dismissed when window closes
Actual
ImmersiveSpace remains active
Code
.onAppear {
    Task {
        await openImmersiveSpace(id: "VideoCallMainCamera")
    }
}
.onDisappear {
    Task {
        await dismissImmersiveSpace() // Gets cancelled
    }
}
What I've Tried
Task in .onDisappear ❌
scenePhase monitoring ❌ (sketch below)
High priority Task ❌
.restorationBehavior(.disabled) + .defaultLaunchBehavior(.suppressed) ✅ (prevents restoration but doesn't fix immediate cleanup)
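For reference, the scenePhase variant I tried looks roughly like this (a minimal sketch; ContentView stands in for my actual window content):

import SwiftUI

struct MainWindow: View {
    @Environment(\.scenePhase) private var scenePhase
    @Environment(\.dismissImmersiveSpace) private var dismissImmersiveSpace

    var body: some View {
        ContentView() // placeholder for the real window content
            .onChange(of: scenePhase) { _, newPhase in
                if newPhase == .background {
                    // Runs, but the window is already tearing down and the Task gets cancelled.
                    Task { await dismissImmersiveSpace() }
                }
            }
    }
}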
Question
What's the recommended pattern for ensuring ImmersiveSpace cleanup when WindowGroup closes? Is there a way to block window closure until async cleanup completes, or should ImmersiveSpaces automatically dismiss with their parent window?
Hi,
We’ve been successfully using the RoomPlan API in our application for over two years. Recently, however, users have reported encountering persistent capture errors during their sessions. Specifically, the errors observed are:
CaptureError.worldTrackingFailure
CaptureError.exceedSceneSizeLimit
What we have observed:
Persistent Errors: The errors continue to occur even after initiating new capture sessions.
Normal Usage: Our implementation adheres to typical usage patterns of the RoomPlan API without exceeding any documented room size limits.
Limited Feature Usage: We are not using the world-tracking feature or the StructureBuilder functionality to stitch rooms together.
Potential State Caching: Given that these errors persist across sessions, we suspect that there might be memory or state cached between sessions that is not being cleared, particularly since we are not taking advantage of StructureBuilder.
Request:
Could you please advise if there is any internal caching or memory retention between capture sessions that might lead to these errors? Additionally, we would appreciate guidance on how to clear or manage this state when the StructureBuilder feature is not in use.
Here is a generalised version of our capture session initialization code to help diagnose the issue.
import SwiftUI
import RoomPlan

struct RoomARCaptureView: UIViewRepresentable {
    typealias Handler = (CapturedRoom, Error?) -> Void

    @Binding var stop: Bool
    @Binding var done: Bool
    let completion: Handler?

    func makeUIView(context: Self.Context) -> RoomCaptureView {
        let view = RoomCaptureView(frame: .zero)
        view.delegate = context.coordinator
        view.captureSession.run(configuration: .init())
        return view
    }

    func updateUIView(_ uiView: RoomCaptureView, context: Self.Context) {
        if stop {
            // Stop the session only once; stopping it multiple times causes issues with the final presentation.
            uiView.captureSession.stop()
            stop = false
            done = true
        }
    }

    static func dismantleUIView(_ uiView: RoomCaptureView, coordinator: Self.Coordinator) {
        uiView.captureSession.stop()
    }

    func makeCoordinator() -> ARViewCoordinator {
        ARViewCoordinator(completion)
    }

    @objc(ARViewCoordinator)
    class ARViewCoordinator: NSObject, RoomCaptureViewDelegate {
        var completion: Handler?

        // RoomCaptureViewDelegate requires NSCoding conformance; these are intentionally minimal.
        public required init?(coder: NSCoder) {
            super.init()
        }
        public func encode(with coder: NSCoder) {}

        public init(_ completion: Handler?) {
            super.init()
            self.completion = completion
        }

        public func captureView(shouldPresent roomDataForProcessing: CapturedRoomData, error: (Error)?) -> Bool {
            return true
        }

        public func captureView(didPresent processedResult: CapturedRoom, error: (Error)?) {
            completion?(processedResult, error)
        }
    }
}
Thank you for your assistance.
What is the reason the hand-tracking joints have these axes? I'm trying to create a virtual hand model, and this makes it a mess.
The simplest RealityView { content, attachments in ...
causes "Contextual closure expects 1 argument but 2 were used in closure body". I have checked every example and I cannot understand why I get this error regardless of the content. Note: I have added Attachment(id: "test") to the attachments closure and get "Attachment not in scope".
I have imported both RealityKit and SwiftUI.
I am trying the simplest use of attachments in RealityKit and get "Contextual closure type '@MainActor @Sendable (inout RealityViewCameraContent) async -> Void' expects 1 argument, but 2 were used in closure body".
I also get "cannot find 'Attachment' in scope".
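For reference, this is the shape I am trying to get to compile, a minimal sketch of the attachments-enabled initializer; whether this overload is available outside visionOS may be exactly my problem:

import SwiftUI
import RealityKit

struct AttachmentTestView: View {
    var body: some View {
        RealityView { content, attachments in
            // Add the attachment entity built in the attachments closure below.
            if let panel = attachments.entity(for: "test") {
                content.add(panel)
            }
        } attachments: {
            Attachment(id: "test") {
                Text("Hello")
            }
        }
    }
}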
I have had an app on the App Store for many years that enables users to post text into clouds in augmented reality. Yet last week, abruptly, upon installing the app on the iPhone, the screen started going totally dark and a stream of barely comprehensible logs came up, of the kind:
ARSCNCompositor <0x300ad0e00>: ARSCNCompositor (0, 0) initialization failed. Matting is not set up properly.
many times, then
RWorldTrackingTechnique <0x106235180>: Unable to update pose [PredictorFailure] for timestamp 870.392108
ARWorldTrackingTechnique <0x106235180>: Unable to predict pose [1] for timestamp 870.392108
again several times and then:
ARWorldTrackingTechnique <0x106235180>: SLAM error callback: Error Domain=Slam Error Code=7 "Non fatal error occurred due to significant drop in a IMU data" UserInfo={NSDescription=Non fatal error occurred due to significant drop in a IMU data, NSLocalizedFailureReason=SlamEngineNodeGroup Failure: IMU issue: gyro data stream verification failed [Significant data drop]. Failed on timestamp: 870.413247, Last known timestamp: 865.350198, Delta: 5.063049, System timestamp: 870.415781, Delta between system and frame: 0.002534. }
and then again the pose issues several times.
I hoped the new beta version would have solved the issue, but that was not the case. Unfortunately, I do not know whether this depends on the beta version or on some other issue, given that the app cannot be installed on the simulator on the Mac.
Hi everyone,
I’ve been analyzing the current state of Sign Language accessibility tools, and I noticed a significant gap in learning tools: we lack real-time feedback for students (e.g., "Is my hand position correct?").
Most current solutions rely on 2D video processing, which struggles with depth perception and occlusion (hand-over-hand or hand-over-face gestures), both of which are critical in Sign Language grammar.
I'd like to propose/discuss an architecture leveraging the current LiDAR + Neural Engine capabilities found in iPhone devices to solve this.
The Concept: Skeleton-based Normalization
Instead of training ML models on raw video frames (which introduces noise from lighting, skin tone, and clothing), we could use ARKit's Body Tracking to abstract the input.
Capture: Use ARKit/LiDAR to track the user's upper body and hand joints in 3D space.
Data Normalization: Extract only the vector coordinates (X, Y, Z of joints). This creates a "clean" dataset, effectively normalizing the user regardless of physical appearance.
Comparison: Feed these vectors into a CoreML model trained on "Reference Skeletons" (recorded by native signers).
Feedback Loop: The app calculates the geometric distance between the user's pose and the reference pose to provide specific correction (e.g., "Raise your elbow 10 degrees").
Why this approach?
Solves Occlusion: LiDAR handles depth much better than standard RGB cameras when hands cross the body.
Privacy: We are processing coordinates, not video streams.
Efficiency: Comparing vector sequences is computationally cheaper than video analysis, preserving battery life.
Has anyone experimented with using ARKit Body Anchors specifically for comparing complex gesture sequences against a stored "correct" database? I believe this "Skeleton First" approach is the key to scalable Sign Language education apps.
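To make the idea concrete, here is a rough sketch of the extraction-and-comparison step; the joint selection and the scoring metric are my own assumptions, not an established pipeline:

import ARKit
import simd

/// Hypothetical helper: turns an ARBodyAnchor into a joint-position vector
/// and scores it against a stored reference pose.
struct PoseScorer {
    // Upper-body joints assumed relevant for signing; adjust as needed.
    static let joints: [ARSkeleton.JointName] = [
        .head, .leftShoulder, .rightShoulder, .leftHand, .rightHand
    ]

    /// Extracts joint positions relative to the body root, so the comparison
    /// is independent of where the user stands.
    static func featureVector(from anchor: ARBodyAnchor) -> [SIMD3<Float>] {
        joints.compactMap { joint in
            guard let transform = anchor.skeleton.modelTransform(for: joint) else { return nil }
            // modelTransform(for:) is already expressed relative to the body anchor.
            return SIMD3<Float>(transform.columns.3.x,
                                transform.columns.3.y,
                                transform.columns.3.z)
        }
    }

    /// Mean per-joint distance to a reference pose; lower means closer.
    static func score(_ user: [SIMD3<Float>], against reference: [SIMD3<Float>]) -> Float {
        guard user.count == reference.count, !user.isEmpty else { return .infinity }
        let total = zip(user, reference).reduce(Float(0)) { $0 + simd_distance($1.0, $1.1) }
        return total / Float(user.count)
    }
}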
Looking forward to hearing your thoughts.
I'm capturing a room via the RoomPlan API and would like to access the depth map (sceneDepth) or smoothed depth map (smoothedSceneDepth) from my own ARSession provided to the RoomCaptureSession.
But both depth maps are empty when handling the delegate callbacks. I have not found a solution yet. Is it even possible? I have not found any documentation of what RoomCaptureSession overrides in the ARSession if I provide my own ARSession instance.
Here is an example code snippet of what I'm trying to do:
private let arSession = ARSession()
private lazy var roomPlanCaptureSession = RoomCaptureSession(arSession: arSession)

let arConfig = ARWorldTrackingConfiguration()

// Create frame semantics for the ARConfiguration used by the ARSession.
var semantics: ARWorldTrackingConfiguration.FrameSemantics = []
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    semantics.insert(.sceneDepth)
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
    semantics.insert(.smoothedSceneDepth)
}
arConfig.frameSemantics = semantics

// Set delegates.
roomPlanCaptureSession.delegate = self
arSession.delegate = self

// Check whether the device supports depth maps before running.
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
    arSession.run(arConfig)
} else {
    print(".sceneDepth is unsupported.")
}

// Run the RoomPlan scan with a default configuration.
let captureConfig = RoomCaptureSession.Configuration()
roomPlanCaptureSession.run(configuration: captureConfig)

// Trying to read sceneDepth in the ARSession delegate.
public func session(_ session: ARSession, didUpdate frame: ARFrame) {
    print("session delegate capture: sceneDepth: \(String(describing: frame.sceneDepth))")
    // Prints: session delegate capture: sceneDepth: nil
}
Also, in this video from 2023, it is said that I can pass a custom ARSession to RoomPlan:
Explore enhancements to RoomPlan - Video
Quote 3:00: Here is the init and stop function in previous RoomPlan. And here is how you pass over a custom ARSession to init function. Any custom ARSession with ARWorldTrackingConfiguration will be honored inside RoomCaptureSession.
Anyway, I welcome any input. Maybe I'm doing something wrong. :)
Since updating to iOS 26.0 (and confirmed on 26.1), ARBodyTrackingConfiguration no longer detects a valid ARBodyAnchor on devices with LiDAR (e.g., iPhone 15 Pro, iPhone 17 Pro Max).
This issue reproduces in custom projects and Apple’s official sample “Capturing Body Motion in 3D”.
The AR session runs normally, but the delegate call:
func session(_ session: ARSession, didUpdate anchors: [ARAnchor])
never yields an ARBodyAnchor with valid joint transforms.
All joints return nil when calling:
body.skeleton.modelTransform(for: jointName)
resulting in 0 valid joints per frame.
Environment
• Device: iPhone 17 Pro Max (LiDAR)
• iOS: 26.0 / 26.1
• Xcode: 16.0 (stable)
• Framework: ARKit + RealityKit
• Configuration used:
config.worldAlignment = .gravityAndHeading
config.isAutoFocusEnabled = true
config.environmentTexturing = .none
session.run(config)
Also tested: with and without frameSemantics = .bodyDetection
Expected Behavior
ARBodyAnchor should be detected and body.skeleton should contain ~89 valid joints with continuous updates.
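For reference, here is roughly the per-frame check used to count valid joints (a minimal sketch built on the default 3D skeleton definition):

import ARKit

func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
    for case let bodyAnchor as ARBodyAnchor in anchors {
        // Count joints that return a valid model transform this frame.
        let jointNames = ARSkeletonDefinition.defaultBody3D.jointNames
        let validJoints = jointNames.filter { name in
            bodyAnchor.skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: name)) != nil
        }
        print("Valid joints this frame: \(validJoints.count)") // Prints 0 on iOS 26.0 / 26.1
    }
}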
This is using Unity 6000.2.7f (latest ARKit package, etc.) and Xcode 26.1.
This targets iPad/iPhone (iOS), and I have successfully built to the iPad before, but software updates seem to have introduced these errors.
Basically, everything is updated as far as possible.
Originally I got 5 errors, but updates brought it down.
On build I now get 3 undefined symbols:
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibility51
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibility56
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibilityConcurrency
This appears to be a known bug or issue of some kind, but I'm not sure how best to get past it.
ld: warning: search path '/var/run/com.apple.security.cryptexd/mnt/com.apple.MobileAsset.MetalToolchain-v17.1.324.0.kGuqPt/Metal.xctoolchain/usr/lib/swift/iphoneos' not found
ld: warning: search path '/var/run/com.apple.security.cryptexd/mnt/com.apple.MobileAsset.MetalToolchain-v17.1.324.0.kGuqPt/Metal.xctoolchain/usr/lib/swift-5.0/iphoneos' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibility51': library 'swiftCompatibility51' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibility56': library 'swiftCompatibility56' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibilityConcurrency': library 'swiftCompatibilityConcurrency' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibilityPacks': library 'swiftCompatibilityPacks' not found
ld: warning: Could not find or use auto-linked framework 'CoreAudioTypes': framework 'CoreAudioTypes' not found
ld: warning: Could not find or use auto-linked framework 'UIUtilities': framework 'UIUtilities' not found
ld: warning: Could not parse or use implicit file '/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/SwiftUICore.framework/SwiftUICore.tbd': cannot link directly with 'SwiftUICore' because product being built is not an allowed client of it
Undefined symbols for architecture arm64:
"_swift_FORCE_LOAD$_swiftCompatibility51", referenced from:
_swift_FORCE_LOAD$swiftCompatibility51$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
"_swift_FORCE_LOAD$_swiftCompatibility56", referenced from:
_swift_FORCE_LOAD$swiftCompatibility56$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
"_swift_FORCE_LOAD$_swiftCompatibilityConcurrency", referenced from:
_swift_FORCE_LOAD$swiftCompatibilityConcurrency$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
ld: symbol(s) not found for architecture arm64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Best approach for high-quality textured room reconstruction using ARKit / RoomPlan / Object Capture?
I am developing an iOS app that allows users to scan rooms, view the scans on device, and add notes. I need to preserve the actual geometry (odd angles, chamfers, fixtures), not simplified RoomPlan boxes.
Are there any easy ways to incorporate high-quality texture mapping or PBR? Where is the documentation for scene reconstruction?
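For context, the only raw-geometry route I know of beyond RoomPlan's parametric output is ARKit's scene reconstruction mesh; here is a minimal sketch of what I mean (assuming a LiDAR device):

import ARKit

// Enable ARKit scene reconstruction and read the mesh from ARMeshAnchor directly,
// instead of relying on RoomPlan's simplified parametric boxes.
let config = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
    config.sceneReconstruction = .meshWithClassification
}

func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
    for case let meshAnchor as ARMeshAnchor in anchors {
        let geometry = meshAnchor.geometry
        print("Mesh chunk: \(geometry.vertices.count) vertices, \(geometry.faces.count) faces")
    }
}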
Hi team,
I believe I’ve found a registration issue between ARFrame.sceneDepth and ARFrame.capturedImage when using high-resolution frame capture on a 2022 iPad Pro (6th gen).
When enabling high-resolution capture:
if let highResFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
    config.videoFormat = highResFormat
}
…
arView.session.captureHighResolutionFrame { ... }
the depth map provided by ARFrame.sceneDepth no longer aligns correctly with the corresponding high-resolution capturedImage.
This misalignment results in consistently over-estimated distance measurements in my app (which relies on mapping depth to 2D pixel coordinates).
iPad Pro (6th gen): misalignment occurs only when capturing high-resolution frames.
iPhone 16 Pro: depth is correctly registered for both standard and high-resolution captures.
It appears the camera intrinsics, specifically the FOV, change between the “regular” resolution stream and the high-resolution capture on the iPad. My suspicion is that the depth data continues using the intrinsics of the lower resolution stream, resulting in an unregistered depth-to-RGB mapping.
Once I have the iPad in hand again, I will confirm whether camera.intrinsics or FOV differ between the low-res and high-res frames.
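The comparison I plan to run is roughly this (a sketch using the completion-handler variant of captureHighResolutionFrame):

func logCamera(_ label: String, _ frame: ARFrame) {
    let cam = frame.camera
    print("\(label): imageResolution=\(cam.imageResolution), intrinsics=\(cam.intrinsics)")
}

// Compare the streaming frame against a high-resolution capture.
if let streamFrame = arView.session.currentFrame {
    logCamera("stream", streamFrame)
}
arView.session.captureHighResolutionFrame { highResFrame, error in
    if let highResFrame {
        logCamera("high-res", highResFrame)
    } else if let error {
        print("High-res capture failed: \(error)")
    }
}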
Is this a known issue with high-resolution frame capture on the 2022 iPad Pro? If not, I’m happy to provide some more thorough sample code.
Thanks for your time!
I'm developing a custom gesture-based visionOS project that uses hand tracking with collision detection spheres on fingers to register user interactions through collision components. I'm experiencing a critical occlusion issue where collision detection spheres are intermittently occluded by the background/depth buffer, causing fingers to pass through the 3D model entities without registering interactions.
Detailed Description:
I have added 3D entities in an immersive scene with collision spheres attached to fingers for detecting user interactions.
Each sphere has:
CollisionComponent with sphere shape
Proper collision masks and groups configured
Real-time position updates from hand joint transforms
Each entity has:
InputTarget components to register collisions
The Issue:
When users move their fingers to the entity to interact, some collision spheres (particularly on the pinkie and ring fingers) become occluded and pass directly through the 3D model without triggering collision events.
Meanwhile, other fingers (like the index finger) continue to work correctly.
This appears to be a depth perception / z-buffer issue between the model entity and the hand-tracking collision spheres.
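For reference, this is roughly how the collision spheres are attached (a simplified sketch; the group/mask values and joint choice are illustrative, not my exact code):

import ARKit
import RealityKit

let fingerCollisionGroup = CollisionGroup(rawValue: 1 << 1)
let modelCollisionGroup = CollisionGroup(rawValue: 1 << 2)

// A small trigger sphere that only collides with the model's collision group.
func makeFingerSphere() -> Entity {
    let sphere = Entity()
    sphere.components.set(CollisionComponent(
        shapes: [.generateSphere(radius: 0.008)],
        mode: .trigger,
        filter: CollisionFilter(group: fingerCollisionGroup, mask: modelCollisionGroup)
    ))
    return sphere
}

// Per-frame update from the hand anchor (e.g. the little finger tip).
func update(sphere: Entity, from handAnchor: HandAnchor) {
    guard let joint = handAnchor.handSkeleton?.joint(.littleFingerTip), joint.isTracked else { return }
    let worldTransform = handAnchor.originFromAnchorTransform * joint.anchorFromJointTransform
    sphere.setTransformMatrix(worldTransform, relativeTo: nil)
}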
Questions:
Is there a recommended approach for maintaining consistent depth ordering between hand-tracking entities and 3D models in immersive spaces to prevent occlusion issues?
Should I be using AnchorEntities to anchor the entity to a plane or world position to establish a more stable depth reference?
Are there specific RenderingComponent or material settings that could help ensure collision entities maintain their depth priority and don't get occluded?
Could this be related to z-fighting when collision spheres and entity geometry occupy similar depth ranges? If so, what's the recommended depth bias approach?
Is there a better architectural approach for implementing interactions with custom hand gesture tracking that avoids these depth perception issues?
What Would Help:
Implementation guidance for ensuring reliable collision detection between hand-tracked entities through custom gestures and 3D models.
Best practices for depth management in immersive spaces with custom hand gesture tracking.
Sample code demonstrating stable hand-to-object interaction patterns.
Information about whether this is a known limitation or if there are specific APIs I should be leveraging
This issue is significantly impacting the reliability of our app experience, as users cannot consistently interact with all model components. Any guidance from Apple engineers or developers who have solved similar depth/occlusion challenges would be greatly appreciated.
Additional Context:
This is for a productivity-focused application where accuracy and reliability are critical.
Thank you for any assistance!
I use ARKit for motion tracking. I get the skeleton joint coordinates and use them for animation. I didn't make any changes to the code, but I updated the iOS version from 18 to 26, and modelTransform now always returns nil.
https://developer.apple.com/documentation/arkit/arskeleton3d/modeltransform(for:)
For example
bodyAnchor.skeleton.modelTransform(for: .init(rawValue: "head_joint"))
bodyAnchor is ARBodyAnchor.
I see the default skeleton on the screen, but now I can't get the coordinates out of it.
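For context, the equivalent lookup through the raw joint arrays (a sketch; I have not yet verified whether this path behaves any differently on iOS 26):

let skeleton = bodyAnchor.skeleton
if let index = skeleton.definition.jointNames.firstIndex(of: "head_joint") {
    let headTransform = skeleton.jointModelTransforms[index] // raw matrix, not optional
    print(headTransform)
}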
I'm using an example from Apple's WWDC presentation.
https://developer.apple.com/documentation/arkit/capturing-body-motion-in-3d
Are there any changes in the API, or is this just a bug?
On iOS 26.1, this throws on the 2020 iPad Pro (4th gen) but works fine on an M4 iPad Pro or iPhone 15 Pro:
guard let device = AVCaptureDevice.default(.builtInLiDARDepthCamera, for: .video, position: .back) else {
    throw ConfigurationError.lidarDeviceUnavailable
}
It's just the standard code from Apple's own sample project, so it obviously used to work:
https://developer.apple.com/documentation/AVFoundation/capturing-depth-using-the-lidar-camera
Does it fail because Apple has silently dropped support for the older LiDAR sensor used prior to the M4 iPad Pro, or is there another reason? And what about the 5th and 6th gen iPad Pro: does it still work on those?
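For what it's worth, here is a quick diagnostic sketch to list which depth-capable back cameras the OS still exposes on a given device:

import AVFoundation

let discovery = AVCaptureDevice.DiscoverySession(
    deviceTypes: [.builtInLiDARDepthCamera, .builtInDualWideCamera, .builtInDualCamera],
    mediaType: .video,
    position: .back
)
for device in discovery.devices {
    // Prints the device types the system is still willing to vend.
    print(device.localizedName, device.deviceType.rawValue)
}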
Hi everyone! I am working on an AR app and wanted to implement object occlusion because it pretty much removes drift from the object. This works great with the RealityKit sample, but I am unable to replicate that behaviour with SceneKit, because SceneKit does not offer object occlusion. Can we say SceneKit is getting deprecated, and should we rewrite the app in RealityKit (which is obviously a big task)?
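For reference, the RealityKit behaviour I am referring to is just mesh-based scene understanding with occlusion enabled; a minimal sketch (LiDAR device assumed):

import ARKit
import RealityKit

let arView = ARView(frame: .zero)
let config = ARWorldTrackingConfiguration()
if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
    config.sceneReconstruction = .mesh
    // Virtual content is hidden behind the reconstructed real-world mesh.
    arView.environment.sceneUnderstanding.options.insert(.occlusion)
}
arView.session.run(config)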
I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
While using Apple's Vision Pro, we noticed that we can continue to use the visionOS keyboard even when we no longer actually see it in passthrough.
In other words, when we focus on a field to type, visionOS displays the keyboard for us in such a way that we actually see it. Then, we noticed if we look away a little bit, either up, or down, or left, or right, in such a way that the keyboard is no longer visible by us in the passthrough, the keyboard still remains responsive to taps from our fingers at the location where it is. It seems the keyboard remains functional and responsive to taps even though we can no longer observe/see it.
We are trying to figure out how to implement similar functionality in our app whereby the user can continue to manipulate a 3d entity when the user can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow).
I assume the visionOS keyboard has this functionality thanks to the downward-facing sensors on the hardware, which allow hand tracking even when the hands can no longer be observed by the user. That is likely how we can rest our hands on our lap and still be able to interact with visionOS.
How can we implement similar functionality for 3D entities?
Is there a way to tap in, or to allow hand tracking, from those downward-facing cameras?
Is it possible to manipulate a 3D entity when it is no longer observed by the user for example when they shift their attention somewhere else in the field of vision?
How does the visionOS keyboard achieve this?
Problem Description
(1) I am using ARKit in an iOS app to provide AR capabilities. Specifically, I'm trying to use the ARSession's captureHighResolutionFrame(using:) method to capture a high-resolution frame along with its corresponding depth data:
open func captureHighResolutionFrame(using photoSettings: AVCapturePhotoSettings?) async throws -> ARFrame
(2) However, when I attempt to do so, the call fails at runtime with the following error, which I captured from the Xcode debugger:
[AVCapturePhotoOutput capturePhotoWithSettings:delegate:] settings.depthDataDeliveryEnabled must be NO if self.isDepthDataDeliveryEnabled is NO
Code Snippet Explanation
(1) ARConfig and ARSession Initialization
The following code configures the ARConfiguration and ARSession. A key part of this setup is setting the videoFormat to the one recommended for high-resolution frame capturing, as suggested by the documentation.
func start(imagesDirectory: URL, configuration: Configuration = Configuration()) {
    // ... basic setup ...
    let arConfig = ARWorldTrackingConfiguration()
    arConfig.planeDetection = [.horizontal, .vertical]

    // Enable various frame semantics for depth and segmentation.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
        arConfig.frameSemantics.insert(.smoothedSceneDepth)
    }
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
        arConfig.frameSemantics.insert(.sceneDepth)
    }
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
        arConfig.frameSemantics.insert(.personSegmentationWithDepth)
    }

    // Set the recommended video format for high-resolution captures.
    if let videoFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
        arConfig.videoFormat = videoFormat
        print("Enabled: High-Resolution Frame Capturing by selecting recommended video format.")
    }

    arSession.run(arConfig, options: [.resetTracking, .removeExistingAnchors])
    // ...
}
(2) Capturing the High-Resolution Frame
The code below is intended to manually trigger the capture of a high-resolution frame. The goal is to obtain both a high-resolution color image and its associated high-resolution depth data. To achieve this, I explicitly set the isDepthDataDeliveryEnabled property of the AVCapturePhotoSettings object to true.
func requestImageCapture() async {
    // ... guard statements ...
    print("Manual image capture requested.")
    if #available(iOS 16.0, *) { // Assuming 16.0+ for this API
        if let defaultSettings = arSession.configuration?.videoFormat.defaultPhotoSettings {
            // Create a mutable copy from the default settings, as recommended.
            let photoSettings = AVCapturePhotoSettings(from: defaultSettings)
            // Explicitly enable depth data delivery for this capture request.
            photoSettings.isDepthDataDeliveryEnabled = true
            do {
                let highResFrame = try await arSession.captureHighResolutionFrame(using: photoSettings)
                print("Successfully captured a high-resolution frame.")
                if let initialDepthData = highResFrame.capturedDepthData {
                    // Process depth data...
                } else {
                    print("High-resolution frame was captured, but it contains no depth data.")
                }
            } catch {
                // The exception is caught here.
                print("Error capturing high-resolution frame: \(error.localizedDescription)")
            }
        }
    }
    // ...
}
Issue Confirmation & Question
(1) Through debugging, I have confirmed the following behavior: If I call captureHighResolutionFrame without providing the photoSettings parameter, or if photoSettings.isDepthDataDeliveryEnabled is set to false, the method successfully returns a high-resolution ARFrame, but its capturedDepthData is nil.
(2) The error message clearly indicates that settings.depthDataDeliveryEnabled can only be true if the underlying AVCapturePhotoOutput instance's own isDepthDataDeliveryEnabled property is also true.
(3) However, within the context of ARKit and ARSession, I cannot find any public API that would allow me to explicitly access and configure the underlying AVCapturePhotoOutput instance that ARSession manages.
(4) My question is:
Is there a way to configure the ARSession's internal AVCapturePhotoOutput to enable its isDepthDataDeliveryEnabled property? Or, is simultaneously capturing a high-resolution frame and its associated depth data simply not a supported use case in the current ARKit framework?