Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.

ARKit Documentation

Posts under ARKit subtopic


ARDepthData.confidenceMap only returns low confidence on certain devices
A few users have recently reported no longer being able to capture point clouds using our app, specifically on iPhone 15 Pro devices. We recently found an in-house device that exhibits this behavior and found that the confidenceMap contains only low-confidence values, regardless of the environment being captured. Our app uses a higher confidence threshold; setting the threshold to a lower value produces noisy results as expected, so that is not a viable option. Other LiDAR-based apps have been tested on this device and the results are the same: no points, or noisy point clouds in apps that allow a lower confidence threshold. On devices that exhibit this behavior, the "Displaying a point cloud using scene depth" Apple sample app can be used to visualize the issue. First reports of this new behavior occurred as early as iOS 18.4. Looking for recommendations on which team(s) at Apple to reach out to with these findings, since the behavior manifests on only a small sample of devices.
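For reference, a minimal sketch of the kind of confidence filtering described above, assuming the usual ARDepthData layout (Float32 depth, UInt8 confidence matching ARConfidenceLevel raw values); the function name and threshold parameter are mine:

```swift
import ARKit
import CoreVideo

// Sketch: keep only depth samples whose confidence meets a minimum
// ARConfidenceLevel. Assumes the standard ARDepthData layout: Float32 depth,
// UInt8 confidence values matching ARConfidenceLevel raw values (0/1/2).
func filteredDepthValues(from depthData: ARDepthData,
                         minimumConfidence: ARConfidenceLevel = .medium) -> [Float] {
    guard let confidenceMap = depthData.confidenceMap else { return [] }
    let depthMap = depthData.depthMap

    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    CVPixelBufferLockBaseAddress(confidenceMap, .readOnly)
    defer {
        CVPixelBufferUnlockBaseAddress(depthMap, .readOnly)
        CVPixelBufferUnlockBaseAddress(confidenceMap, .readOnly)
    }

    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    guard let depthBase = CVPixelBufferGetBaseAddress(depthMap),
          let confBase = CVPixelBufferGetBaseAddress(confidenceMap) else { return [] }
    let depthBytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)
    let confBytesPerRow = CVPixelBufferGetBytesPerRow(confidenceMap)

    var result: [Float] = []
    for y in 0..<height {
        let depthRow = (depthBase + y * depthBytesPerRow).assumingMemoryBound(to: Float32.self)
        let confRow = (confBase + y * confBytesPerRow).assumingMemoryBound(to: UInt8.self)
        for x in 0..<width where confRow[x] >= UInt8(minimumConfidence.rawValue) {
            result.append(depthRow[x])
        }
    }
    return result
}
```

On an affected device this returns almost nothing at .medium or .high, which is a quick way to confirm that the confidence map itself, rather than the app's filtering, is the problem.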
1 reply · 0 boosts · 244 views · Jun ’25
ARKit camera transform orientation vector doesn't match physical device heading (despite `.gravityAndHeading`)
Hi all,

I'm working on an ARKit-based iOS app where I need to accurately determine the direction the device is facing in order to localize objects in the real world. I'm using:

let config = ARWorldTrackingConfiguration()
config.worldAlignment = .gravityAndHeading

So I would expect the world alignment to behave as described on the gravityAndHeading page. The AR session is started after verifying that CLLocationManager.headingAccuracy <= 20, and the compass appears to be calibrated.

However, I'm seeing a major inconsistency. When the rear camera is physically pointed toward true north, I would expect:

cameraTransform.columns.2.z ≈ -1 // i.e. ARKit's -Z pointing north

But instead, I'm consistently seeing:

cameraTransform.columns.2.z ≈ +0.97 // implies the camera is facing south

Meanwhile, the translation vector behaves as expected: as I physically move north, cameraTransform.columns.3.z becomes more negative, matching the world's +Z = south convention.

For example, say I have the device in landscapeRight (landscapeLeft in UIDeviceOrientation terms), the rear camera is pointing toward true north, and I start moving toward true north. I get something like this:

Camera Transform = simd_float4x4(
    [ 0.98446155, -0.030119859,  0.172998,    0.0],
    [ 0.023979114, 0.9990097,    0.037477385, 0.0],
    [-0.17395553, -0.032746706,  0.98420894,  0.0],
    [ 0.024039675, -0.037087332, -0.22780673, 0.99999994]
)

As you can see, cameraTransform.columns.2.z is positive even though the rear camera is pointing toward true north, while cameraTransform.columns.3.z is correctly negative as the device moves toward true north.

So here is my question: why is cameraTransform.columns.2.z positive when the rear camera is physically facing north?

Any clarity would be deeply appreciated. I've read the documentation and tested with different heading accuracies and AR session resets, but I keep running into this orientation mismatch. Thanks in advance!
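For reference, a minimal sketch of how the pointing direction can be read out of the transform, assuming ARKit's documented camera-space convention that +Z points out of the screen toward the user (so the viewing direction is the negated third column); the helper name is mine:

```swift
import ARKit
import simd

// Sketch: the camera's pointing direction from ARCamera.transform. In ARKit's
// camera space, +Z points out of the screen toward the user, so the rear camera
// looks along the *negated* third column of the transform.
func cameraHeadingInfo(for frame: ARFrame) -> (forward: simd_float3, headingDegrees: Float) {
    let t = frame.camera.transform
    // columns.2 is the camera-space +Z axis expressed in world coordinates.
    let backward = simd_float3(t.columns.2.x, t.columns.2.y, t.columns.2.z)
    let forward = -backward                          // where the rear camera points
    // With .gravityAndHeading, world -Z is true north and +X is east,
    // so atan2(east, north) gives a compass-style heading in degrees.
    let headingDegrees = atan2(forward.x, -forward.z) * 180 / .pi
    return (forward, headingDegrees)
}
```

Under that convention, a rear camera facing true north gives a forward vector near (0, 0, -1) and therefore columns.2.z near +1, which is consistent with the values quoted above.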
1 reply · 0 boosts · 127 views · Jun ’25
Possible contradiction between ARKit's definition of UIDeviceOrientation.landscapeRight and the actual definition of UIDeviceOrientation.landscapeRight
So it seems there is a contradiction between how ARKit defines UIDeviceOrientation.landscapeRight and the actual definition of UIDeviceOrientation.landscapeRight in the UIKit documentation.

In the ARKit documentation for ARCamera.transform, it says the following: "This transform creates a local coordinate space for the camera that is constant with respect to device orientation. In camera space, the x-axis points to the right when the device is in UIDeviceOrientation.landscapeRight orientation—that is, the x-axis always points along the long axis of the device, from the front-facing camera toward the Home button. The y-axis points upward (with respect to UIDeviceOrientation.landscapeRight orientation), and the z-axis points away from the device on the screen side."

Following the link from that same page, the definition of UIDeviceOrientation.landscapeRight is given as: "The device is in landscape mode, with the device held upright and the front-facing camera on the right side."

There appears to be a conflict between the two definitions, which has already been asked about and visualized in this StackOverflow thread. The accepted answer there says that ARKit's landscapeRight, unlike UIDeviceOrientation.landscapeRight, has the Home button on the right, as stated in the ARCamera.transform documentation. It says more details are given in this other StackOverflow thread, but that thread discusses the discrepancy between the definitions of landscapeRight in UIDeviceOrientation and UIInterfaceOrientation, not anything related to ARKit.

So I am wondering: why does ARKit's definition of landscapeRight contradict that of UIDeviceOrientation despite explicitly referencing it? Is it just a mistake by Apple developers that hasn't been resolved even after so long?
1 reply · 0 boosts · 92 views · Jun ’25
I need to troubleshoot Transform Drift in ARKit
Hi all,

I'm currently developing a real-time object reconstruction app using ARKit. The goal is to scan large objects using ARKit's depth and transform data and generate a point cloud. However, I'm facing a major challenge: transform drift / world-alignment issues.

The localToWorld transform provided by ARKit frequently seems to drift or become unstable across frames. This results in misaligned point clouds even when the device is moved slowly or kept relatively still. In some cases, a static surface scanned over a few seconds results in clearly misaligned fragments. This makes it difficult to accurately stitch a multi-frame point cloud.

I have experimented with various lighting conditions and object textures, but the issue persists in all cases. At times, the relative error between frames reaches up to 20 cm, while in other instances the error is minimal; however, the drift gradually accumulates over time, leading to an overall enlargement of the reconstructed object. I have attached images of both cases here.

Questions:
- Are there specific conditions under which ARKit's world transform is expected to drift?
- Is there a way to detect or recover from this drift at runtime?
- Any best practices for maintaining consistent tracking during scanning or measurement sessions?
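On the detect-or-recover question, a minimal sketch of gating point accumulation on ARKit's own quality signals; it will not eliminate drift, but it avoids integrating frames captured while tracking is degraded (the function name is mine):

```swift
import ARKit

// Sketch: gate point-cloud accumulation on ARKit's tracking quality. Frames
// captured while tracking is limited (excessive motion, low features,
// relocalization) are the usual source of badly misaligned fragments.
func shouldAccumulatePoints(from frame: ARFrame) -> Bool {
    switch frame.camera.trackingState {
    case .normal:
        break
    case .limited(let reason):
        print("Skipping frame, limited tracking: \(reason)")
        return false
    case .notAvailable:
        return false
    }
    // worldMappingStatus reports how well the surrounding area has been mapped.
    return frame.worldMappingStatus == .mapped || frame.worldMappingStatus == .extending
}
```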
1 reply · 0 boosts · 120 views · Jun ’25
Keyboard Tracking
Hi! I'm currently experimenting on Apple Vision Pro with hand and head anchors. Is there a way to get an anchor linked to the Apple Magic Keyboard (as detection is evidently already being done to display inputs above it)? Thanks in advance, have a good day!
1 reply · 0 boosts · 647 views · Jul ’25
visionOS plane anchor rotation and wall direction are inconsistent
I have a problem with wall plane detection using visionOS/ARKit. I am using ARKitSession's PlaneDetectionProvider to detect walls in a visionOS immersive space. I recorded the position and rotation of the first detected plane, but found that the rotation value depends on the direction the user is facing when the space starts; it deviates in different directions. That is to say, even if the plane lies on the same wall, the rotation quaternion will differ. I would like the true direction of the wall to be obtained correctly no matter which direction the user is facing when the scan starts, so that virtual content can be accurately aligned with the wall. I have tried using anchor.originFromAnchorTransform and Transform.rotation directly, but the rotation value is still affected by the user's initial orientation. In addition, I would like to know whether the user's initial orientation also affects the position information. If so, please provide a solution. Thank you!
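A possible direction, sketched under the assumption (worth verifying against PlaneAnchor.Geometry) that the anchor's local Y axis is the wall's normal: derive the content orientation from the wall normal in world space rather than from the anchor's raw rotation, so the anchor's in-plane rotation no longer matters. The function name is mine:

```swift
import ARKit
import simd

// Sketch: orient wall content from the wall's world-space normal instead of the
// anchor's raw rotation, so the anchor's in-plane rotation no longer matters.
// ASSUMPTION to verify: the plane anchor's local Y axis is the wall normal.
func wallFacingRotation(for anchor: PlaneAnchor) -> simd_quatf {
    let m = anchor.originFromAnchorTransform
    // Transform the assumed local normal (0, 1, 0) into world space.
    let n4 = m * SIMD4<Float>(0, 1, 0, 0)
    var normal = SIMD3<Float>(n4.x, n4.y, n4.z)
    normal.y = 0                               // keep only the horizontal component
    normal = simd_normalize(normal)
    // Rotation whose +Z faces away from the wall, independent of scan direction.
    return simd_quatf(from: SIMD3<Float>(0, 0, 1), to: normal)
}
```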
1 reply · 0 boosts · 510 views · Sep ’25
Current Apple Forum about ARKit and visionOS
Recently, questions about ARKit/visionOS in the Apple forums seem to be being answered by internal Apple engineers. Inexperienced, untested, makeshift features are being offered, putting average but experienced developers in a difficult position: they are unable to react to the posts or get anything useful from them. Apple needs to review the situation.
1 reply · 0 boosts · 352 views · Sep ’25
How to best manage ARKitSession in concurrent code
I have a visionOS app where I instantiate an ARKitSession and various providers (HandTrackingProvider and WorldTrackingProvider) in my appModel. That way, I can pass these providers to a Task which runs a gRPC server for sending their data to a client. When the user enters the app's immersive space, the ARKitSession runs the providers if they are not running already.

I am now trying to implement AccessoryTrackingProvider with the PSVR Sense controllers, but it does not fit my current framework because the controllers may not be connected when ARKitSession.run is called, so I need to find a new place to start the session.

My questions:
- If I already have a session running the hand and world tracking providers, can I start another session to run the accessory tracking?
- Should they all be running on the same session?
- Is there a way to stop the session and restart it when the controllers are connected? When I tried this, I got an error that says "It is not possible to re-run a stopped data provider (<ar_hand_tracking_provider_t: ". But if I instantiate a new HandTrackingProvider, then the one that was passed to the gRPC task would no longer be the one running in the new session.

Any advice on how best to manage the various providers and ARKit sessions would be greatly appreciated.
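One possible structure, sketched under the assumption that running a second ARKitSession alongside the first is permitted (which is exactly the open question): keep the long-lived session for the hand and world providers that the gRPC task holds, and spin up a separate session with a freshly created accessory provider only once the controllers connect. Type and method names below, other than the ARKit ones, are mine:

```swift
import ARKit   // visionOS ARKit

// Sketch: one long-lived session for the providers the gRPC task holds, plus a
// separate, late-started session for accessory tracking. A stopped provider can
// never be re-run, so the accessory provider is created fresh each time and the
// hand/world providers are never torn down.
final class TrackingModel {
    let mainSession = ARKitSession()
    let handTracking = HandTrackingProvider()
    let worldTracking = WorldTrackingProvider()

    private var accessorySession: ARKitSession?

    func startMainTracking() async throws {
        try await mainSession.run([handTracking, worldTracking])
    }

    // Call this only once the controllers are actually connected, passing a
    // freshly created AccessoryTrackingProvider.
    func startAccessoryTracking(with provider: AccessoryTrackingProvider) async throws {
        accessorySession?.stop()          // any previous accessory provider is now dead
        let session = ARKitSession()
        accessorySession = session
        try await session.run([provider])
    }
}
```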
1 reply · 0 boosts · 222 views · Nov ’25
visionOS 3d interactions like the native keyboard when no longer observed in passthrough
While using Apple Vision Pro, we noticed that we can continue to use the visionOS keyboard even when we can no longer actually see it in passthrough. In other words, when we focus on a field to type, visionOS displays the keyboard so that we can see it. Then we noticed that if we look away a little, up, down, left, or right, so that the keyboard is no longer visible to us in passthrough, the keyboard still remains responsive to taps from our fingers at the location where it is. It seems the keyboard remains functional and responsive to taps even though we can no longer observe or see it.

We are trying to figure out how to implement similar functionality in our app, whereby the user can continue to manipulate a 3D entity when they can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow). We assume the visionOS keyboard has this functionality thanks to the downward-facing sensors on the hardware, which allow hand tracking even when the hands can no longer be observed by the user; that is likely how we can rest our hands on our lap and still interact with visionOS.

How can we implement similar functionality for 3D entities? Is there a way to tap into, or allow hand tracking from, those downward-facing cameras? Is it possible to manipulate a 3D entity when it is no longer observed by the user, for example when they shift their attention somewhere else in the field of view? How does the visionOS keyboard achieve this?
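A minimal sketch of the hand-tracking route, assuming the app runs its own HandTrackingProvider: the fingertip pose keeps arriving whether or not the entity is in the wearer's view, so proximity can be tested directly. targetEntity and triggerDistance are placeholder names:

```swift
import ARKit
import RealityKit
import simd

// Sketch: drive interaction from hand-tracking data directly, so a 3D entity
// can keep responding even when it is outside the wearer's current view.
func monitorFingertip(handTracking: HandTrackingProvider,
                      targetEntity: Entity,
                      triggerDistance: Float = 0.02) async {
    for await update in handTracking.anchorUpdates {
        let hand = update.anchor
        guard hand.isTracked,
              let tip = hand.handSkeleton?.joint(.indexFingerTip),
              tip.isTracked else { continue }

        // Fingertip position in world (origin) space.
        let worldFromTip = hand.originFromAnchorTransform * tip.anchorFromJointTransform
        let tipPosition = SIMD3<Float>(worldFromTip.columns.3.x,
                                       worldFromTip.columns.3.y,
                                       worldFromTip.columns.3.z)

        // Compare against the entity's world-space position; a real implementation
        // would test against its bounds rather than its origin.
        let distance = simd_distance(tipPosition, targetEntity.position(relativeTo: nil))
        if distance < triggerDistance {
            // React to the "touch" here, e.g. post an event or change a material.
        }
    }
}
```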
1 reply · 0 boosts · 308 views · Nov ’25
ARKit / visionOS - handtracking with 3D objects attached on hand
I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
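Not a confirmed fix, but one system lever that does exist is the upper-limb visibility preference on the immersive space, which asks visionOS not to composite the wearer's arms and hands over rendered content; whether it fully prevents the fade described above would need testing. RemoteControlApp and RemoteControlView are placeholder names:

```swift
import SwiftUI

// Sketch (not a confirmed fix): ask the system not to draw the wearer's
// arms/hands over rendered content in this immersive space, so the attached
// remote-control model is not faded out behind the real hand.
// RemoteControlView is a placeholder for the RealityKit content view.
@main
struct RemoteControlApp: App {
    var body: some Scene {
        ImmersiveSpace(id: "RemoteControl") {
            RemoteControlView()
        }
        .upperLimbVisibility(.hidden)
    }
}
```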
1 reply · 0 boosts · 209 views · Nov ’25
Best approach for high-quality textured room reconstruction using ARKit / RoomPlan / Object Capture?
I am developing an iOS app that allows users to scan rooms, view the scans on device, and add notes. I need to preserve the actual geometry (odd angles, chamfers, fixtures), not simplified RoomPlan boxes. Are there any easy ways to incorporate high-quality texture mapping or PBR? Where is the documentation for scene reconstruction?
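For the geometry part, a minimal sketch of enabling ARKit's scene-reconstruction mesh on iOS, which produces ARMeshAnchor triangles rather than RoomPlan's simplified boxes; texturing/PBR is not produced by ARKit itself, so color would have to be baked from camera frames or captured separately (see the ARMeshAnchor and sceneReconstruction documentation in ARKit). The function name is mine:

```swift
import ARKit

// Sketch: LiDAR scene reconstruction gives real geometry as ARMeshAnchor
// meshes. Texturing is not provided; it must be added in a separate pass.
func makeSceneReconstructionConfiguration() -> ARWorldTrackingConfiguration? {
    guard ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) else {
        return nil   // requires a LiDAR-equipped device
    }
    let config = ARWorldTrackingConfiguration()
    config.sceneReconstruction = .meshWithClassification
    config.environmentTexturing = .automatic
    config.frameSemantics.insert(.sceneDepth)
    return config
}
```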
1 reply · 0 boosts · 846 views · Nov ’25
Object tracking capability not available
Hi there, I received an enterprise license file to include enhanced object tracking configuration for the Vision Pro. My account is part of the team which got the allowance from Apple to use this capability. Unfortunately, although I followed the guide, I do not find the Object Tracking capability when I try to add it to my project. There are other capabilities like Main Camera on the Vision Pro, but not for Object Tracking. I am using Xcode 26.1 and visionOS 26.1. What am I missing here? Thanks in advance, Matthias
1 reply · 0 boosts · 276 views · Dec ’25
Access UltraWideCamera when ARSession is running
ARSession provides a video stream from the wide-angle camera. If ARSession also uses the ultra-wide camera at the same time, it could provide a video stream from that camera as well; otherwise, an AVCaptureSession using the ultra-wide camera should be allowed to run alongside it. It would be very, very useful if we could access different cameras while an ARSession is running. We'd like to cooperate with you on this if possible.

Steps to reproduce: run an AVCaptureSession and then run an ARSession. The AVCaptureSession stops.
1 reply · 0 boosts · 432 views · 5d
Combining ARKit Face Tracking with High-Resolution AVCapture and Perspective Rendering on Front Camera
Hello Apple Developer Community,

We're developing an application using the front camera that requires both real-time ARKit face tracking/guidance and the capture of high-resolution still images via AVCaptureSession. Our goal is to leverage ARKit's depth and face data to render a captured image from another perspective post-capture, while maintaining high image quality.

Our approach:
- Real-time ARKit guidance: utilize ARKit (e.g., ARFaceTrackingConfiguration) for continuous face tracking, depth, and scene understanding to guide the user in real time.
- High-resolution capture transition: at the moment of capture, we plan to pause the ARKit session and switch to an AVCaptureSession to take a high-resolution image. We assume that for a front-facing image the subject's face is directly front-on and that the relative pose between the face and camera remains the same during the transition; the only variation we expect is a change in distance. Our intention is to minimize the delay between the last ARKit frame and the high-res capture to maintain temporal consistency.
- Post-processing perspective rendering: using the last ARKit face data (depth, pose, and landmarks) along with the high-resolution 2D image, we aim to render the scene from another perspective. We want to correct the perspective of the 2D image using SceneKit or RealityKit, leveraging the collected ARKit scene information to achieve a natural, high-quality rendering from a different viewpoint. The rendering should match the quality of a normally captured high-resolution image, adjusting for the difference in distance while using the stored ARKit data to correct perspective.

Our questions:
1. Session transition best practices: what are the recommended best practices for seamlessly pausing ARKit and switching to a high-resolution AVCaptureSession on the front camera? How can we minimize user movement or other issues during this brief transition, given our assumption that the face-camera pose remains largely consistent except for distance changes?
2. Data integration for perspective rendering: how can we effectively integrate stored ARKit face, depth, and pose data with the high-res image to perform accurate perspective correction or rendering from another viewpoint? Given that we assume the relative pose is constant except for distance, are there strategies or APIs that leverage this assumption to simplify the perspective transformation?
3. Perspective correction with SceneKit/RealityKit: what techniques or workflows using SceneKit or RealityKit are recommended for correcting the perspective of a captured 2D image based on ARKit scene data? How can we use these frameworks to render the high-resolution image from an alternative perspective while maintaining image quality and fidelity?
4. Pitfalls and guidelines: what common pitfalls should we be aware of when combining ARKit tracking data with high-res capture and post-processing for perspective rendering? Are there performance considerations, recommended thresholds for acceptable temporal consistency, or validation techniques to ensure the ARKit data remains applicable at the moment of high-res capture?

We appreciate any advice, sample code references, or documentation pointers that could assist us in implementing this workflow effectively. Thank you!
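One alternative worth evaluating before building the session hand-off, sketched here with no guarantee it meets the quality bar: since iOS 16, ARSession can capture a single high-resolution still without stopping the session via captureHighResolutionFrame, which removes the transition gap entirely. Whether the face-tracking configuration on the target devices exposes a suitable video format is an assumption to verify:

```swift
import ARKit

// Sketch: capture a high-resolution still from a running ARKit session
// (iOS 16+) instead of pausing ARKit and switching to AVCaptureSession.
func configureAndCapture(session: ARSession) {
    let config = ARFaceTrackingConfiguration()
    // Prefer a video format flagged as suitable for high-res frame capture, if any.
    if let hiResFormat = ARFaceTrackingConfiguration.supportedVideoFormats
        .first(where: { $0.isRecommendedForHighResolutionFrameCapturing }) {
        config.videoFormat = hiResFormat
    }
    session.run(config)

    // Later, at the moment of capture:
    session.captureHighResolutionFrame { frame, error in
        guard let frame else {
            print("High-res capture failed: \(String(describing: error))")
            return
        }
        // frame.capturedImage is the high-resolution pixel buffer; the same ARFrame
        // also carries the camera transform and any face anchors for later rendering.
        _ = frame.capturedImage
    }
}
```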
2 replies · 0 boosts · 771 views · Jan ’25
Tracking geographic locations in AR - Sample App Code
Hello, I was looking back into downloading the "Tracking geographic locations in AR" sample app from https://developer.apple.com/documentation/arkit/tracking-geographic-locations-in-ar. Unfortunately, the download link points to the .zip of the DisplayingAPointCloudUsingSceneDepth sample project instead. The exact same issue occurs when trying to download the sample code from https://developer.apple.com/documentation/ARKit/creating-a-fog-effect-using-scene-depth. Wondering if those links are deliberately broken because of possible deprecations. Thanks to any Apple engineer willing to look into that.
2 replies · 0 boosts · 452 views · Feb ’25
Entity HoverEffect Fired When inside Another Entity Collider
Hi folks, I'm new to the Vision Pro stack and still trying to learn all the nuances. Here is a problem I can't seem to find an answer to. I placed entity A (a small sphere of radius 0.02) inside entity B (a box of size 0.1). Both entities have a HoverEffectComponent, and both have their input component set to .direct. Entity A is NOT a child of entity B. When I direct-touch entity B, I notice that entity A's hover effect fires as well. This only happens if entity A's position is inside entity B. A gesture targeted only at entity A doesn't work either. I double-checked: entity A's collider sits inside entity B's collider, so my direct touch shouldn't have triggered its hover effect. Does having one collider inside another produce unpredictable behavior? Thanks in advance 🙏🙏🙏 Context: I'm trying to create an invisible bound around entity A, so that when my hand approaches the bound to grab entity A, a nice spotlight hover effect fires first on the bound before my hand reaches entity A.
2 replies · 0 boosts · 353 views · Feb ’25
Difference in ARKit plane detection from iPhone 8 to iPhone 15
I am developing an ARKit based application that requires plane detection of the tabletop at which the user is seated. Early testing was with an iPhone 8 and iPhone 8+. With those devices, ARKit rapidly detected the plane of the tabletop when it was only 8 to 10 inches away. Using iPhone 15 with the same code, it seems to require me to move the phone more like 15 to 16 inches away before detecting the plane of the table. This is an awkward motion for a user seated at a table. To validate that it was not necessarily a feature of my code, I determined that the same behavior results with Apple's sample AR Interaction application. Has anyone else experienced this, and if so, have suggestions to improve the situation?
2 replies · 0 boosts · 511 views · Feb ’25
how to convert mlmodel to reference object?
Hello, I have downloaded and run the sample object-tracking app for visionOS. Now I'm working on my own objects for tracking. I have made a model using Create ML, using images of my object. However, I cannot see how to convert the Create ML output file (xxx.mlmodel) into a reference object like the files in the sample project. Is there a tool for converting them? TIA
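Based on my understanding of the visionOS object-tracking pipeline (worth confirming against the sample project): a .referenceobject is not converted from an image-classification .mlmodel; it is produced by Create ML's Object Tracking template from a USDZ model or scan of the object, and then loaded and run roughly like this sketch (the function name is mine):

```swift
import ARKit   // visionOS ARKit

// Sketch: load a .referenceobject produced by Create ML's Object Tracking
// template and run an ObjectTrackingProvider with it.
func startObjectTracking(session: ARKitSession, referenceObjectURL: URL) async throws {
    let referenceObject = try await ReferenceObject(from: referenceObjectURL)
    let provider = ObjectTrackingProvider(referenceObjects: [referenceObject])
    try await session.run([provider])

    for await update in provider.anchorUpdates {
        // update.anchor is an ObjectAnchor carrying the tracked object's transform.
        print("Object \(update.anchor.id): event \(update.event)")
    }
}
```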
2 replies · 0 boosts · 354 views · Feb ’25