Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.

Posts under ARKit tag

188 Posts

Post | Replies | Boosts | Views | Activity

Error when capturing a high-resolution frame with depth data enabled in ARKit
Problem Description

(1) I am using ARKit in an iOS app to provide AR capabilities. Specifically, I'm trying to use the ARSession's captureHighResolutionFrame(using:) method to capture a high-resolution frame along with its corresponding depth data:

    open func captureHighResolutionFrame(using photoSettings: AVCapturePhotoSettings?) async throws -> ARFrame

(2) However, when I attempt to do so, the call fails at runtime with the following error, which I captured from the Xcode debugger:

    [AVCapturePhotoOutput capturePhotoWithSettings:delegate:] settings.depthDataDeliveryEnabled must be NO if self.isDepthDataDeliveryEnabled is NO

Code Snippet Explanation

(1) ARConfig and ARSession Initialization

The following code configures the ARConfiguration and ARSession. A key part of this setup is setting the videoFormat to the one recommended for high-resolution frame capturing, as suggested by the documentation.

    func start(imagesDirectory: URL, configuration: Configuration = Configuration()) {
        // ... basic setup ...
        let arConfig = ARWorldTrackingConfiguration()
        arConfig.planeDetection = [.horizontal, .vertical]

        // Enable various frame semantics for depth and segmentation
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
            arConfig.frameSemantics.insert(.smoothedSceneDepth)
        }
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
            arConfig.frameSemantics.insert(.sceneDepth)
        }
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
            arConfig.frameSemantics.insert(.personSegmentationWithDepth)
        }

        // Set the recommended video format for high-resolution captures
        if let videoFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
            arConfig.videoFormat = videoFormat
            print("Enabled: High-Resolution Frame Capturing by selecting recommended video format.")
        }

        arSession.run(arConfig, options: [.resetTracking, .removeExistingAnchors])
        // ...
    }

(2) Capturing the High-Resolution Frame

The code below is intended to manually trigger the capture of a high-resolution frame. The goal is to obtain both a high-resolution color image and its associated high-resolution depth data. To achieve this, I explicitly set the isDepthDataDeliveryEnabled property of the AVCapturePhotoSettings object to true.

    func requestImageCapture() async {
        // ... guard statements ...
        print("Manual image capture requested.")

        if #available(iOS 16.0, *) { // Assuming 16.0+ for this API
            if let defaultSettings = arSession.configuration?.videoFormat.defaultPhotoSettings {
                // Create a mutable copy from the default settings, as recommended
                let photoSettings = AVCapturePhotoSettings(from: defaultSettings)

                // Explicitly enable depth data delivery for this capture request
                photoSettings.isDepthDataDeliveryEnabled = true

                do {
                    let highResFrame = try await arSession.captureHighResolutionFrame(using: photoSettings)
                    print("Successfully captured a high-resolution frame.")
                    if let initialDepthData = highResFrame.capturedDepthData {
                        // Process depth data...
                    } else {
                        print("High-resolution frame was captured, but it contains no depth data.")
                    }
                } catch {
                    // The exception is caught here
                    print("Error capturing high-resolution frame: \(error.localizedDescription)")
                }
            }
        }
        // ...
    }

Issue Confirmation & Question

(1) Through debugging, I have confirmed the following behavior: If I call captureHighResolutionFrame without providing the photoSettings parameter, or if photoSettings.isDepthDataDeliveryEnabled is set to false, the method successfully returns a high-resolution ARFrame, but its capturedDepthData is nil.

(2) The error message clearly indicates that settings.depthDataDeliveryEnabled can only be true if the underlying AVCapturePhotoOutput instance's own isDepthDataDeliveryEnabled property is also true.

(3) However, within the context of ARKit and ARSession, I cannot find any public API that would allow me to explicitly access and configure the underlying AVCapturePhotoOutput instance that ARSession manages.

(4) My question is: Is there a way to configure the ARSession's internal AVCapturePhotoOutput to enable its isDepthDataDeliveryEnabled property? Or is simultaneously capturing a high-resolution frame and its associated depth data simply not a supported use case in the current ARKit framework?
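Until that question is answered, one possible fallback is to pair the high-resolution image with the regular LiDAR depth that the session already delivers through the sceneDepth frame semantics enabled above. The sketch below is only an illustration under stated assumptions: HighResCapture and captureWithSceneDepth are hypothetical names, the depth map comes from the video-rate frame (so it is lower resolution than the photo and may be slightly older), and it does not provide the high-resolution depth the post asks for.

```swift
import ARKit

/// Hypothetical container pairing a high-resolution image with video-rate depth.
struct HighResCapture {
    let image: CVPixelBuffer
    let depthMap: CVPixelBuffer?   // nil when no scene depth was available
}

@available(iOS 16.0, *)
func captureWithSceneDepth(from session: ARSession) async throws -> HighResCapture {
    // Grab the most recent video-rate frame first; its depth is the fallback.
    let videoFrame = session.currentFrame

    // Capture without custom photo settings, which the post reports as working.
    let highRes = try await session.captureHighResolutionFrame()

    // Prefer depth attached to the captured frame, if any; otherwise fall back
    // to the (smoothed) scene depth of the last regular frame.
    let depth = highRes.sceneDepth
        ?? videoFrame?.smoothedSceneDepth
        ?? videoFrame?.sceneDepth

    return HighResCapture(image: highRes.capturedImage, depthMap: depth?.depthMap)
}
```

Because the two buffers have different resolutions and timestamps, any per-pixel use would still need the depth map rescaled and, if the device moved, reprojected with the two frames' camera transforms.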
Replies: 0 | Boosts: 0 | Views: 147 | Activity: 1d
Persistent Entity Position
I want to let users place 2D/3D “artworks” on detected walls and have them reappear in exactly the same real‑world spot after quitting and relaunching the app (like widgets do, but for my own entities).

Environment: Xcode 26, visionOS 2.0, RealityKit + ARKitSession/WorldTrackingProvider. Entities are parented to a holder that’s aligned to a wall via plane/mesh raycasts.

What I’ve tried:
- Create a WorldAnchor at placement, save UUID + full 4×4 transform
- On next launch, re-create the WorldAnchor (or set the saved transform) and attach the entity
- Gate restore on relocalization/mesh updates and disable all raycast/search after restore

Issue: After relaunch, placement still resolves relative to current device pose, not the same wall position.

Questions:
- Is there a public API in visionOS 2.0 to persist app‑managed world anchors across sessions (room‑fixed), e.g., AnchorStore or equivalent?
- If not, what’s the recommended pattern to reliably restore wall‑anchored content?
- Are persistence features mentioned for widgets/windows available to third‑party RealityKit entities?
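A pattern that may be worth trying, sketched here under assumptions: on visionOS, world anchors added through WorldTrackingProvider are persisted by the system, so instead of re-creating a WorldAnchor from a saved transform, the app saves only the anchor's UUID and waits for that anchor to come back through anchorUpdates after relaunch. ArtworkAnchorStore, the UserDefaults key, and the makeArtwork closure are hypothetical names used for illustration.

```swift
import ARKit
import Foundation
import RealityKit

@MainActor
final class ArtworkAnchorStore {
    let session = ARKitSession()
    let worldTracking = WorldTrackingProvider()
    let root = Entity()                        // add this to the RealityView content
    private var entities: [UUID: Entity] = [:]
    private let key = "savedArtworkAnchorIDs"  // hypothetical UserDefaults key

    // Anchor IDs the app has placed; only IDs are stored, not transforms.
    private var savedIDs: Set<UUID> {
        get {
            let strings = UserDefaults.standard.stringArray(forKey: key) ?? []
            return Set(strings.compactMap { UUID(uuidString: $0) })
        }
        set { UserDefaults.standard.set(newValue.map(\.uuidString), forKey: key) }
    }

    /// Runs the provider and restores content as anchors are (re)delivered.
    func run(makeArtwork: (UUID) -> Entity) async throws {
        try await session.run([worldTracking])
        for await update in worldTracking.anchorUpdates {
            let anchor = update.anchor
            switch update.event {
            case .added, .updated:
                guard savedIDs.contains(anchor.id) else { continue }
                let entity = entities[anchor.id] ?? makeArtwork(anchor.id)
                if entity.parent == nil { root.addChild(entity) }
                entities[anchor.id] = entity
                // Hide content until the anchor has relocalized.
                entity.isEnabled = anchor.isTracked
                entity.setTransformMatrix(anchor.originFromAnchorTransform, relativeTo: nil)
            case .removed:
                entities.removeValue(forKey: anchor.id)?.removeFromParent()
            }
        }
    }

    /// Call once at placement time with the wall-aligned pose.
    func place(at originFromWall: simd_float4x4) async throws {
        let anchor = WorldAnchor(originFromAnchorTransform: originFromWall)
        savedIDs.insert(anchor.id)
        try await worldTracking.addAnchor(anchor)
    }
}
```

The design choice here is that the saved 4×4 transform is never reused across launches, since the world origin can differ per session; only the anchor delivered by the system carries a pose that is valid in the current session.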
Replies: 1 | Boosts: 0 | Views: 179 | Activity: 3d
ARKit: Keep USDZ node fixed after image tracking is lost (prevent drifting)
I’m using ARKit + SceneKit (Swift) with ARWorldTrackingConfiguration and detectionImages to place a 3D object (USDZ via SCNScene(named:)) when a reference image is detected. While the image is tracked, the object stays correctly aligned.

Goal: When the tracked image is no longer visible, I want the placed node to remain visible and fixed at its last known pose (no drifting) as I move the camera.

What works so far:
- Detect image → add node → track updates
- When the image disappears → keep showing the node at its last pose

Problem: After the image is no longer tracked, the node drifts as I move the device/camera. It looks like it’s still influenced by the (now unreliable) image anchor or accumulating small world-tracking errors.

Question: What’s the correct way in ARKit to “freeze” the node at its last known world transform once ARImageAnchor stops tracking, so it doesn’t drift?
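One approach that might help, shown as a minimal sketch rather than a confirmed fix: when the image anchor reports isTracked == false, hand the content over to a plain world-space ARAnchor created at the image node's last world transform, so the image anchor can no longer move it. ImageFreezeDelegate and the anchor name "frozenContent" are hypothetical; the sketch assumes the USDZ content is the first child of the node ARKit created for the image anchor.

```swift
import ARKit
import SceneKit

final class ImageFreezeDelegate: NSObject, ARSCNViewDelegate {
    weak var sceneView: ARSCNView?
    // Content waiting for ARKit to create the node of its new world anchor.
    private var pending: [UUID: SCNNode] = [:]

    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let imageAnchor = anchor as? ARImageAnchor,
              !imageAnchor.isTracked,
              let session = sceneView?.session,
              let content = node.childNodes.first else { return }

        // New anchor at the image node's last pose, expressed in world space.
        let frozen = ARAnchor(name: "frozenContent", transform: node.simdWorldTransform)

        // Detach the content; the guard above then fails on later updates,
        // so the hand-over happens only once.
        content.removeFromParentNode()
        pending[frozen.identifier] = content
        session.add(anchor: frozen)
    }

    func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
        // Re-attach the content under the world anchor's node, keeping its
        // local transform so the world pose matches the last tracked pose.
        guard let content = pending.removeValue(forKey: anchor.identifier) else { return }
        node.addChildNode(content)
    }
}
```

The content can disappear for a frame between the two callbacks, and residual drift from world tracking itself is not removed by this; if the image is detected again, the app would need its own logic to move the content back under the image anchor.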
Replies: 2 | Boosts: 0 | Views: 427 | Activity: 6d
ARKit Body Tracking not detecting ARBodyAnchor on iOS 26.x (FB15128723)
Since updating to iOS 26.0 (and confirmed on 26.1), ARBodyTrackingConfiguration no longer detects a valid ARBodyAnchor on devices with LiDAR (e.g., iPhone 15 Pro, iPhone 17 Pro Max). This issue reproduces in custom projects and Apple’s official sample “Capturing Body Motion in 3D”.

The AR session runs normally, but the delegate call:

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor])

never yields an ARBodyAnchor with valid joint transforms. All joints return nil when calling:

    body.skeleton.modelTransform(for: jointName)

resulting in 0 valid joints per frame.

Environment
• Device: iPhone 17 Pro Max (LiDAR)
• iOS: 26.0 / 26.1
• Xcode: 16.0 (stable)
• Framework: ARKit + RealityKit
• Configuration used:

    config.worldAlignment = .gravityAndHeading
    config.isAutoFocusEnabled = true
    config.environmentTexturing = .none
    session.run(config)

Also tested: with and without frameSemantics = .bodyDetection

Expected Behavior
ARBodyAnchor should be detected and body.skeleton should contain ~89 valid joints with continuous updates.
Replies: 0 | Boosts: 0 | Views: 62 | Activity: 1w
Is ARGeoTrackingConfiguration always more accurate than ARWorldTrackingConfiguration for world scale AR?
We are working on a world-scale AR app that uses the device's location and heading to place objects in the streets, so that they are correctly and stably anchored to specific locations. Since geo-tracking imagery is only available in certain cities and areas, we are trying to figure out how to fall back when geo-tracking becomes unavailable as the device moves away, while still retaining good AR camera accuracy. We might need to come up with an algorithm that uses the device GPS to line up the ARCamera with our objects.

Question: Is geo-tracking always at least as accurate as world tracking for a GPS-based outdoor AR experience? If so, we can simply use ARGeoTrackingConfiguration the entire time and rely on the ARView keeping itself aligned. Otherwise, we need to switch between it and ARWorldTrackingConfiguration when geo-tracking is not available and/or its accuracy is low, then roll our own algorithm to keep the camera aligned. Thanks.
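The switching option described above could be sketched roughly as follows. This is an illustration under assumptions, not a statement about which configuration is more accurate: makeWorldFallback, runBestAvailableConfiguration, and GeoStatusObserver are hypothetical names, and the custom GPS alignment mentioned in the post is left out.

```swift
import ARKit

/// World-tracking fallback whose origin is still aligned to compass heading.
func makeWorldFallback() -> ARWorldTrackingConfiguration {
    let config = ARWorldTrackingConfiguration()
    config.worldAlignment = .gravityAndHeading
    return config
}

/// Starts geo-tracking when the device and current location support it.
func runBestAvailableConfiguration(on session: ARSession) {
    guard ARGeoTrackingConfiguration.isSupported else {
        session.run(makeWorldFallback())
        return
    }
    ARGeoTrackingConfiguration.checkAvailability { available, _ in
        let config: ARConfiguration = available
            ? ARGeoTrackingConfiguration()
            : makeWorldFallback()
        DispatchQueue.main.async { session.run(config) }
    }
}

/// Watches geo-tracking status while the session runs.
final class GeoStatusObserver: NSObject, ARSessionDelegate {
    func session(_ session: ARSession, didChange geoTrackingStatus: ARGeoTrackingStatus) {
        if geoTrackingStatus.state == .notAvailable {
            // Left the covered area: continue with world tracking only.
            session.run(makeWorldFallback())
        } else if geoTrackingStatus.state == .localized {
            // Accuracy is reported per status update and can gate content placement.
            print("geo accuracy:", geoTrackingStatus.accuracy.rawValue)
        }
    }
}
```

An instance of GeoStatusObserver would be assigned as the session's delegate and kept alive by the caller; whether anchors survive the configuration switch acceptably is something to verify on device.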
Replies: 3 | Boosts: 0 | Views: 1k | Activity: 1w
ARSkeleton3D modelTransform always returns nil
I use ARKit for motion tracking. I get the skeleton joint coordinates and use them for animation. I didn't make any changes to the code, but after updating iOS from 18 to 26, modelTransform always returns nil. https://developer.apple.com/documentation/arkit/arskeleton3d/modeltransform(for:)

For example:

    bodyAnchor.skeleton.modelTransform(for: .init(rawValue: "head_joint"))

bodyAnchor is an ARBodyAnchor. I see the default skeleton on the screen, but now I can't get the coordinates out of it. I'm using an example from Apple's WWDC presentation. https://developer.apple.com/documentation/arkit/capturing-body-motion-in-3d

Are there any changes in the API, or is this just a bug?
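To narrow down whether only the name-based lookup is affected, a small diagnostic along these lines might be useful for a feedback report. It is a sketch: logJointStatus is a hypothetical helper, and it only compares three ways of reading the same skeleton without assuming what the result will be on iOS 26.

```swift
import ARKit

/// Logs how many joints each access path reports for one body anchor.
func logJointStatus(for bodyAnchor: ARBodyAnchor) {
    let skeleton = bodyAnchor.skeleton
    let names = skeleton.definition.jointNames

    // Path 1: per-index tracking flags.
    let trackedCount = names.indices.filter { skeleton.isJointTracked($0) }.count

    // Path 2: the name-based lookup that the post reports as returning nil.
    let lookupCount = names.filter {
        skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: $0)) != nil
    }.count

    // Path 3: the raw transform array, indexed like jointNames.
    let rawCount = skeleton.jointModelTransforms.count

    print("anchor tracked: \(bodyAnchor.isTracked), joints defined: \(names.count), "
        + "isJointTracked: \(trackedCount), modelTransform non-nil: \(lookupCount), "
        + "raw transforms: \(rawCount)")
}
```

Calling this from session(_:didUpdate:) for each ARBodyAnchor shows whether the raw transforms are still populated when the lookup returns nil.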
Replies: 4 | Boosts: 0 | Views: 452 | Activity: 2w
RoomPlan CaptureError.exceedSceneSizeLimit on iOS devices
When scanning multiple rooms (10+) in a single structure using ARWorldMap for coordinate space consistency, RoomCaptureSession throws CaptureError.exceedSceneSizeLimit. I follow the instructions here (https://developer.apple.com/documentation/roomplan/scanning-the-rooms-of-a-single-structure) exactly: I keep the underlying ARSession alive (by calling captureSession.stop(pause: false)) and save the results before the user moves to the next room.

Scanning 11 or so rooms causes the user to hit the exceedSceneSizeLimit error. The ARWorldMap is about 58 MB, and it is always around this size when the issue occurs. No anchors are present, and all the data seems to be tracking data. On iPad devices (where I do not see this issue) the ARWorldMap grows at a significantly slower rate.

I save the ARWorldMap after each room is scanned and confirmed by the user. If I use the ARWorldMap to initialize the ARSession (as described in the docs), the session immediately errors with "exceedSceneSizeLimit" once captureSession.run() is executed. Occasionally it allows the user to scan again, but it then breaks either mid-scan or on the following scan.

This has been working fine for the past 2 years, and users have been able to scan dozens of rooms without issue. It seems to have become a problem only lately. I would expect the ARWorldMap to be allowed to grow much larger. At this point I can cover almost more of my house with a single scan than I can with separate captureSessions.

A few observations:
- This happens on my iPhone 15 Pro Max and my iPhone 17 Pro, but not on my iPad M4 (maybe memory related?). It is possible that it would happen on the iPad too when scanning many more rooms.
- I have tried things such as resetting the ARConfig on the underlying ARSession, but this doesn't work.
- I have tried to create a new ARWorldMap and move the origin to the older map to clear out tracking data. This almost works, but it causes a mess of issues as soon as the user moves, due to the unshared coordinate space.

I believe there are three active issues regarding this: FB14454922, FB15035788, FB20642944

Could we get an update on this issue? It is a production issue and severely limits the user experience in my production application.
Replies: 0 | Boosts: 0 | Views: 58 | Activity: 2w
RealityKit - Full 3D experience
I have a question that is probably more for the Apple team: why are there no fully 3D experiences for the Vision Pro lineup? I know they have given us tools to bring Unity 3D games to iPhone, and I guess you can also build them in RealityKit. But why are 3D games currently limited to iPad and iPhone, and why can't they be brought to Vision Pro? To explain: when I say a fully 3D game, I mean games like Gorn. The Vision Pro is definitely powerful enough, but it feels limited to tabletop games and AR games. Is this something Apple is thinking about implementing?
Replies: 0 | Boosts: 0 | Views: 479 | Activity: 2w
ARKit Eye Tracking Calibration Issues - Word-Level Reading Tracking Feasibility
Hi Apple Developer Community,

I'm developing an eye-tracking application in Xcode using ARKit's ARFaceTrackingConfiguration and ARFaceAnchor.blendShapes for gaze detection. I'm experiencing several calibration and accuracy issues and would appreciate insights from the community.

Current Implementation
- Using ARFaceAnchor.blendShapes (.eyeLookUpLeft, .eyeLookDownLeft, .eyeLookInLeft, .eyeLookOutLeft, etc.)
- Implementing custom sensitivity curves and smoothing algorithms
- Applying baseline correction and coordinate mapping
- Using quadratic regression for calibration point mapping

Issues I'm Facing

1. Calibration Mismatch
- Red dot position doesn't align with where I'm actually looking
- Significant offset between intended gaze point and actual cursor position
- Calibration seems to drift or become inaccurate over time

2. Extreme Eye Movement Requirements
- Need to make exaggerated eye movements to reach screen edges/corners
- Natural eye movements don't translate to proportional cursor movement
- Difficulty reaching certain screen regions even with calibration

3. Sensitivity and Stability Issues
- Cursor jitters or jumps around when looking at center
- Too much sensitivity to micro-movements
- Inconsistent behavior between calibration and normal operation

4. I also noticed that tracking on both the calibration screen and the reading screen works better when there is some head movement, but I do not want much head movement. I want tracking to work with normal eye movement while reading an ebook.

Primary Question: Word-Level Eye Tracking Feasibility

Is word-level eye tracking (tracking gaze as users read through individual words in an ebook) technically feasible with current iPhone/iPad hardware? I understand that Apple's built-in eye tracking is primarily an accessibility feature for UI navigation. However, I'm wondering if the TrueDepth camera and ARKit's eye tracking capabilities are sufficient for:
- Tracking natural reading patterns (left-to-right, line-by-line progression)
- Detecting which specific words a user is looking at
- Maintaining accuracy for sustained reading sessions (15-30 minutes)
- Working reliably across different users and lighting conditions

Questions for the Community
- Hardware Limitations: Are iPhone/iPad TrueDepth cameras capable of the precision needed for word-level tracking, or is this beyond current hardware capabilities?
- Calibration Best Practices: What calibration strategies have worked best for accurate gaze mapping? How many calibration points are typically needed?
- Reading-Specific Challenges: Are there particular challenges when tracking reading behavior vs. general gaze tracking?
- Alternative Approaches: Are there better approaches than ARKit blend shapes for this use case?

Current Setup
- Devices: iPhone 14 Pro
- iOS Version: iOS 18.3
- ARKit Version: Latest available

Any insights, experiences, or technical guidance would be greatly appreciated. I'm particularly interested in hearing from developers who have worked on similar eye tracking applications or have experience with the limitations and capabilities of ARKit's eye tracking features. Thank you for your time and expertise!
Replies: 0 | Boosts: 0 | Views: 638 | Activity: 2w
Reoccurring World Tracking / Scene Exceeded Limit Error
Hi,

We’ve been successfully using the RoomPlan API in our application for over two years. Recently, however, users have reported encountering persistent capture errors during their sessions. Specifically, the errors observed are:
- CaptureError.worldTrackingFailure
- CaptureError.exceedSceneSizeLimit

What we have observed:
- Persistent Errors: The errors continue to occur even after initiating new capture sessions.
- Normal Usage: Our implementation adheres to typical usage patterns of the RoomPlan API without exceeding any documented room size limits.
- Limited Feature Usage: We are not utilizing the WorldTracking feature for the StructureBuilder functionality to stitch rooms together.
- Potential State Caching: Given that these errors persist across sessions, we suspect that there might be memory or state cached between sessions that is not being cleared, particularly since we are not taking advantage of StructureBuilder.

Request: Could you please advise if there is any internal caching or memory retention between capture sessions that might lead to these errors? Additionally, we would appreciate guidance on how to clear or manage this state when the StructureBuilder feature is not in use.

Here is a generalised version of our capture session initialization code to help diagnose the issue.

    struct RoomARCaptureView: UIViewRepresentable {
        typealias Handler = (CapturedRoom, Error?) -> Void

        @Binding var stop: Bool
        @Binding var done: Bool
        let completion: Handler?

        func makeUIView(context: Self.Context) -> RoomCaptureView {
            let view = RoomCaptureView(frame: .zero)
            view.delegate = context.coordinator
            view.captureSession.run(configuration: .init())
            return view
        }

        func updateUIView(_ uiView: RoomCaptureView, context: Self.Context) {
            if stop {
                // Stop the session only once, multiple times causes issues with the final presentation
                uiView.captureSession.stop()
                stop = false
                done = true
            }
        }

        static func dismantleUIView(_ uiView: RoomCaptureView, coordinator: Self.Coordinator) {
            uiView.captureSession.stop()
        }

        func makeCoordinator() -> ARViewCoordinator {
            ARViewCoordinator(completion)
        }

        @objc(ARViewCoordinator)
        class ARViewCoordinator: NSObject, RoomCaptureViewDelegate {
            var completion: Handler?

            public required init?(coder: NSCoder) {}
            public func encode(with coder: NSCoder) {}

            public init(_ completion: Handler?) {
                super.init()
                self.completion = completion
            }

            public func captureView(shouldPresent roomDataForProcessing: CapturedRoomData, error: (Error)?) -> Bool {
                return true
            }

            public func captureView(didPresent processedResult: CapturedRoom, error: (Error)?) {
                completion?(processedResult, error)
            }
        }
    }

Thank you for your assistance.
Replies: 5 | Boosts: 2 | Views: 592 | Activity: 3w
RoomPlan exceeded scene size limit error. (RoomCaptureSession.CaptureError.exceedSceneSizeLimit)
Error: RoomCaptureSession.CaptureError.exceedSceneSizeLimit

Apple Documentation Explanation: An error that indicates when the scene size grows past the framework’s limitations.

Issue: This error pops up on my iPhone 14 Pro (128 GB) after a few RoomPlan scans are done. It shows up even if the room is small. It occurs immediately after I start the RoomCaptureSession following relocalisation of the previous AR session (in world tracking configuration). I am having trouble understanding exactly why this error appears and how to debug or solve it. Does anyone have any idea how to approach this issue?
Replies: 1 | Boosts: 1 | Views: 813 | Activity: 3w
Shared/GroupImmersive Space – Query Local Device Transform
Hi, I am in the process of implementing SharePlay in our app. The shared experience opens an Immersive Space and we set systemCoordinator.configuration.supportsGroupImmersiveSpace = true. visionOS then establishes a shared coordinate space for the immersive space. From the docs: "To achieve consistent positioning of RealityKit entities across multiple devices in an immersive space during a SharePlay session".

There are cases where we want to position content in front of the user (independent of the shared session, and for each user individually). Normally we use the transform retrieved via worldTrackingProvider.queryDeviceAnchor.originFromAnchorTransform to position content in front of the user (plus some Z offset and smooth interpolation). This works fine outside SharePlay, and the device transform is where I would expect it to be. During a FaceTime call, however, deviceAnchor.originFromAnchorTransform seems to use the shared origin of the immersive space, and I end up with a transform that might be offset.

Here is a video of the issue in action: https://streamable.com/205r2p

The blue rect is placed using AnchorEntity(.head, trackingMode: .continuous). This works regardless of the call, and the entity is always placed based on the head position. The green rect is adjusted on every frame using the transform I get from worldTrackingProvider.queryDeviceAnchor. As you can see, it's offset.

Is there any way I can query this transform locally for the user during a FaceTime call? I would also like to know if it's possible to disable this automatic entity transform syncing behavior. Setting entity.synchronization = nil results in the entity not showing up at all. https://developer.apple.com/documentation/realitykit/synchronizationcomponent Is SynchronizationComponent only relevant for the legacy MultiPeerConnectivity approach? Thank you!
Replies: 2 | Boosts: 0 | Views: 247 | Activity: Oct ’25
ARKit with 422 pixel format and Apple Log colorspace
Hi, I’m trying to configure the camera feed in ARKit to use the Apple Log color space. I can change the capture device’s format to one that has Apple Log, and I see one frame in the proper log-gray colors, but then all AR tracking stops and the tracking state hangs at “initializing”. In other combinations I see the error “sensor failed to initialize” and the session restarts with the default format. I suspect that this is because normal AR capture formats are 420f, whereas the ones that have Apple Log are 422. Could someone confirm whether it’s even possible to run an ARKit session with the camera feed in a different pixel format? I’m trying this on an iPhone 15 Pro.
Replies: 0 | Boosts: 0 | Views: 176 | Activity: Sep ’25
RealityKit captureHighResolutionFrame from session is broken on iOS26?
A bit of background on what our app is doing:
- We have a RealityKit ARView session running. During this period we place objects in RealityKit.
- At some point the user can "take photo", and we use session.captureHighResolutionFrame to capture a frame.
- We then use the captured frame and frame.camera.projectPoint to project our objects back to 2D.

The issue we found is that on devices running iOS 26, the first frame received from session.captureHighResolutionFrame (the first photo the user takes) gives an incorrect CGPoint from frame.camera.projectPoint. If the user takes a second photo with the same camera position, the second frame received from session.captureHighResolutionFrame gives the correct CGPoint.

I noticed a difference between the first and subsequent frames that I believe corresponds to the issue: the yaw value of the camera (frame.camera.eulerAngles.y) on the first frame is not correct (it is inconsistent with every subsequent frame).

I also created a small example app, following the "Building an Immersive Experience with RealityKit" example. The issue exists in this app on iOS 26, while iOS 18.* has consistent values between the first and subsequent captured frames.

Note: The yaw value seems to differ more if we start the session in portrait but take the photo in landscape.

Example result for 3 captured frames:

    Frame captured with yaw: 1.4855177402496338
    Frame captured with yaw: -0.08803760260343552
    Frame captured with yaw: -0.08179682493209839

Example code:

    class CustomARView: ARView, ARSessionDelegate {
        required init(frame: CGRect) {
            super.init(frame: frame)
        }

        required init?(coder decoder: NSCoder) {
            fatalError("init(coder:) has not been implemented")
        }

        func setup() {
            let singleTap = UITapGestureRecognizer(target: self, action: #selector(handleTap))
            addGestureRecognizer(singleTap)
        }

        @objc func handleTap(_ gestureRecognizer: UIGestureRecognizer) {
            Task {
                do {
                    let frame = try await session.captureHighResolutionFrame()
                    print("Frame captured with yaw: \(Double(frame.camera.eulerAngles.y))")
                } catch { }
            }
        }
    }

    struct CustomARViewUIViewRepresentable: UIViewRepresentable {
        func makeUIView(context: Context) -> some UIView {
            let arView = CustomARView(frame: .zero)
            arView.setup()
            return arView
        }

        func updateUIView(_ uiView: UIViewType, context: Context) { }
    }

    struct ContentView: View {
        var body: some View {
            CustomARViewUIViewRepresentable()
                .frame(maxWidth: .infinity, maxHeight: .infinity)
                .ignoresSafeArea()
        }
    }
Replies: 3 | Boosts: 1 | Views: 514 | Activity: Sep ’25
Using ARKit Replay hangs forever on "Attaching to App"...
Hello, I'm trying to use Xcode's ARKit session replay functionality. I have a capture I made using Reality Composer, and when I try to use it with Xcode's replay functionality, the installation and debugging process stalls forever. I've gotten it to work once, so I know the capture file is functional, but I have never gotten it to work a second time, even though I didn't change any settings. No amount of restarting Xcode, the Mac, or the iPhone seems to help. I have also tried cleaning build folders, reinstalling the app, and clearing DerivedData. I can confirm from the Xcode logs that the app installs correctly, but it never launches. If I uncheck the "ARKit Replay Data" checkbox, the app launches and debugs nearly instantly. I have tried letting it "attach" for up to 10 minutes to no avail.
Replies: 4 | Boosts: 0 | Views: 355 | Activity: Sep ’25
RealityView doesn't free up memory after disappearing
Take the stock Xcode 26 AR App template and put the ContentView at the detail end of a NavigationStack. On launch, the app uses < 20 MB of memory. Tapping Open AR raises memory usage to ~700 MB for the AR scene. After tapping back, memory stays at ~700 MB. In the Debug Memory Graph I can still see all the RealityKit classes in memory, such as ARView, ARRenderView, and ARSessionManager. Here's the sample app to illustrate the issue. PS: To keep memory pressure on the system low, there should be a way of freeing all the memory AR uses, for apps that only occasionally show AR scenes.
Replies: 0 | Boosts: 0 | Views: 91 | Activity: Sep ’25
Metal Compositor Service & Persona (visionOS)
Hello, I'm currently trying to make a collaborative app, but it only works with RealityView. When I tried to use a CompositorLayer as shown below, the Personas disappeared.

    ImmersiveSpace(id: "ImmersiveSpace-Metal") {
        CompositorLayer(configuration: MetalLayerConfiguration()) { layerRenderer in
            SpatialRenderer_InitAndRun(layerRenderer)
        }
    }

Is there any potential solution to see Personas in a Metal view? Thanks in advance!
Replies: 2 | Boosts: 0 | Views: 723 | Activity: Sep ’25