Error when capturing a high-resolution frame with depth data enabled in ARKit

  1. Problem Description

(1) I am using ARKit in an iOS app to provide AR capabilities. Specifically, I'm trying to use the ARSession's captureHighResolutionFrame(using:) method to capture a high-resolution frame along with its corresponding depth data:

open func captureHighResolutionFrame(using photoSettings: AVCapturePhotoSettings?) async throws -> ARFrame

(2) However, when I attempt to do so, the call fails at runtime with the following error, which I captured from the Xcode debugger:

[AVCapturePhotoOutput capturePhotoWithSettings:delegate:] settings.depthDataDeliveryEnabled must be NO if self.isDepthDataDeliveryEnabled is NO
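
For context, this is roughly how the precondition named in the error is met in plain AVFoundation, outside ARKit, where the app owns the AVCapturePhotoOutput. It is only a minimal sketch: the function name is hypothetical, and it assumes the output is already attached to a configured AVCaptureSession whose camera supports depth. It is not something ARSession exposes.

```swift
import AVFoundation

/// Hypothetical helper: builds photo settings that request depth only when
/// the owning photo output has depth delivery enabled.
/// Assumes `photoOutput` was already added to a configured AVCaptureSession.
func makeDepthPhotoSettings(for photoOutput: AVCapturePhotoOutput) -> AVCapturePhotoSettings {
    // Depth delivery must first be enabled on the output itself,
    // and only if the current device/format supports it.
    if photoOutput.isDepthDataDeliverySupported {
        photoOutput.isDepthDataDeliveryEnabled = true
    }

    let settings = AVCapturePhotoSettings()
    // Mirror the output's state so the per-capture request never asks for
    // depth when the output has it disabled (the condition in the error above).
    settings.isDepthDataDeliveryEnabled = photoOutput.isDepthDataDeliveryEnabled
    return settings
}
```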

  2. Code Snippet Explanation

(1) ARConfig and ARSession Initialization

The following code configures the ARWorldTrackingConfiguration and runs the ARSession. The key step is setting videoFormat to the format recommended for high-resolution frame capturing, as the documentation suggests.

func start(imagesDirectory: URL, configuration: Configuration = Configuration()) {
    // ... basic setup ...
    
    let arConfig = ARWorldTrackingConfiguration()
    arConfig.planeDetection = [.horizontal, .vertical]
    
    // Enable various frame semantics for depth and segmentation
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
        arConfig.frameSemantics.insert(.smoothedSceneDepth)
    }
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
        arConfig.frameSemantics.insert(.sceneDepth)
    }
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
        arConfig.frameSemantics.insert(.personSegmentationWithDepth)
    }
    
    // Set the recommended video format for high-resolution captures
    if let videoFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
        arConfig.videoFormat = videoFormat
        print("Enabled: High-Resolution Frame Capturing by selecting recommended video format.")
    }
    
    arSession.run(arConfig, options: [.resetTracking, .removeExistingAnchors])
    // ...
}
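
To check which format is actually selected, the recommended format can be logged before running the session. This is a small sketch; both function names are hypothetical, and only imageResolution and framesPerSecond are read from the format.

```swift
import ARKit

/// Hypothetical helper: summarizes a video format for logging.
func describe(_ format: ARConfiguration.VideoFormat) -> String {
    let size = format.imageResolution
    return "\(Int(size.width))x\(Int(size.height)) @ \(format.framesPerSecond) fps"
}

/// Hypothetical helper: prints the format ARKit recommends for
/// high-resolution frame capturing, if the device offers one.
@available(iOS 16.0, *)
func logRecommendedHighResolutionFormat() {
    if let format = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
        print("Recommended high-resolution format: \(describe(format))")
    } else {
        print("No recommended high-resolution format on this device.")
    }
}
```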

(2) Capturing the High-Resolution Frame

The code below is intended to manually trigger the capture of a high-resolution frame. The goal is to obtain both a high-resolution color image and its associated high-resolution depth data. To achieve this, I explicitly set the isDepthDataDeliveryEnabled property of the AVCapturePhotoSettings object to true.

func requestImageCapture() async {
    // ... guard statements ...
        
    print("Manual image capture requested.")
       
    if #available(iOS 16.0, *) { // Assuming 16.0+ for this API
        if let defaultSettings = arSession.configuration?.videoFormat.defaultPhotoSettings {
            // Create a mutable copy from the default settings, as recommended
            let photoSettings = AVCapturePhotoSettings(from: defaultSettings)
            // Explicitly enable depth data delivery for this capture request
            photoSettings.isDepthDataDeliveryEnabled = true
            
            do {
                let highResFrame = try await arSession.captureHighResolutionFrame(using: photoSettings)
                print("Successfully captured a high-resolution frame.")
                if let initialDepthData = highResFrame.capturedDepthData {
                    // Process depth data...
                } else {
                    print("High-resolution frame was captured, but it contains no depth data.")
                }
            } catch {
                // The exception is caught here
                print("Error capturing high-resolution frame: \(error.localizedDescription)")
            }
        }
    }
    // ...
}
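
To see which depth sources a returned frame actually carries, a diagnostic helper along these lines could be called in the success branch above. It is a sketch only: logDepthSources is a hypothetical name, and whether a high-resolution frame populates sceneDepth or smoothedSceneDepth is exactly what it is meant to reveal, not something assumed here.

```swift
import ARKit
import AVFoundation

/// Hypothetical diagnostic helper (not part of ARKit): prints the size of
/// every depth buffer attached to a frame, or "nil" when one is missing.
@available(iOS 14.0, *)
func logDepthSources(of frame: ARFrame) {
    // Depth delivered through the capture pipeline (AVDepthData).
    if let captured = frame.capturedDepthData {
        let map = captured.depthDataMap
        print("capturedDepthData: \(CVPixelBufferGetWidth(map))x\(CVPixelBufferGetHeight(map))")
    } else {
        print("capturedDepthData: nil")
    }

    // Depth produced by the .sceneDepth frame semantic (ARDepthData).
    if let scene = frame.sceneDepth {
        let map = scene.depthMap
        print("sceneDepth: \(CVPixelBufferGetWidth(map))x\(CVPixelBufferGetHeight(map))")
    } else {
        print("sceneDepth: nil")
    }

    // Temporally smoothed variant from the .smoothedSceneDepth semantic.
    if let smoothed = frame.smoothedSceneDepth {
        let map = smoothed.depthMap
        print("smoothedSceneDepth: \(CVPixelBufferGetWidth(map))x\(CVPixelBufferGetHeight(map))")
    } else {
        print("smoothedSceneDepth: nil")
    }
}
```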

  3. Issue Confirmation & Question

(1) Through debugging, I have confirmed the following behavior: if I call captureHighResolutionFrame without passing photoSettings, or with photoSettings.isDepthDataDeliveryEnabled set to false, the method succeeds and returns a high-resolution ARFrame, but its capturedDepthData is nil.

(2) The error message clearly indicates that settings.depthDataDeliveryEnabled can only be true if the underlying AVCapturePhotoOutput instance's own isDepthDataDeliveryEnabled property is also true.

(3) However, within the context of ARKit and ARSession, I cannot find any public API that would allow me to explicitly access and configure the underlying AVCapturePhotoOutput instance that ARSession manages.

(4) My question is: is there a way to configure the ARSession's internal AVCapturePhotoOutput so that its isDepthDataDeliveryEnabled property is true? Or is capturing a high-resolution frame together with its associated depth data simply not a supported use case in the current ARKit framework?
