Spatial Computing

Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.

Discover visionOS

Reality Composer Pro

General

ARKit

All subtopics

Post

Replies

Boosts

Views

Activity

Hand Tracking Latency When UITextView Becomes Active in Vision Pro Immersive Space

I'm placing sphere at finger tip and updating its position as hand move. Finger joint tracking functions correctly, but I’ve observed noticeable latency in hand tracking updates whenever a UITextView becomes active. This lag happens intermittently during app usage, lasting about 5–10 seconds, after which the latency disappears and the sphere starts following the finger joints immediately. When I open the immersive space for the first time, the profiler shows a large performance spike upto 328%. After that, it stabilizes and runs smoothly. Note: I don’t observe any lag when CPU usage spikes to 300% (upon immersive view load) yet the lag still occurs even when CPU usage remains below 100%. I’m using the following code for hand tracking: private func processHandTrackingUpdates() async { for await update in handTracking.anchorUpdates { let handAnchor = update.anchor if handAnchor.isTracked { switch handAnchor.chirality { case .left: leftHandAnchor = handAnchor updateHandJoints(for: handAnchor, with: leftHandJointEntities) case .right: rightHandAnchor = handAnchor updateHandJoints(for: handAnchor, with: rightHandJointEntities) } } else { switch handAnchor.chirality { case .left: leftHandAnchor = nil hideAllJoints(in: leftHandJointEntities) case .right: rightHandAnchor = nil hideAllJoints(in: rightHandJointEntities) } } await MainActor.run { handTrackingData.processNewHandAnchors( leftHand: self.leftHandAnchor, rightHand: self.rightHandAnchor ) } } } And here’s the function I’m using to update the joint positions: private func updateHandJoints( for handAnchor: HandAnchor, with jointEntities: [HandSkeleton.JointName: Entity] ) { guard handAnchor.isTracked else { hideAllJoints(in: jointEntities) return } // Check if the little finger tip and intermediate base are both tracked. if let tipJoint = handAnchor.handSkeleton?.joint(.littleFingerTip), let intermediateBaseJoint = handAnchor.handSkeleton?.joint(.littleFingerIntermediateTip), tipJoint.isTracked, intermediateBaseJoint.isTracked, let pinkySphere = jointEntities[.littleFingerTip] { // Convert joint transforms to world space. let tipTransform = handAnchor.originFromAnchorTransform * tipJoint.anchorFromJointTransform let intermediateBaseTransform = handAnchor.originFromAnchorTransform * intermediateBaseJoint.anchorFromJointTransform // Extract positions from the transforms. let tipPosition = SIMD3<Float>(tipTransform.columns.3.x, tipTransform.columns.3.y, tipTransform.columns.3.z) let intermediateBasePosition = SIMD3<Float>(intermediateBaseTransform.columns.3.x, intermediateBaseTransform.columns.3.y, intermediateBaseTransform.columns.3.z) // Calculate the midpoint. let midpointPosition = (tipPosition + intermediateBasePosition) / 2.0 // Position the sphere at the midpoint and make it visible. pinkySphere.isEnabled = true pinkySphere.transform.translation = midpointPosition } else { // If either joint is not tracked, hide the sphere. jointEntities[.littleFingerTip]?.isEnabled = false } // Update the positions of all other hand joint spheres. for (jointName, entity) in jointEntities { if jointName == .littleFingerTip { // Already handled the pinky above. continue } guard let joint = handAnchor.handSkeleton?.joint(jointName), joint.isTracked else { entity.isEnabled = false continue } entity.isEnabled = true let jointTransform = handAnchor.originFromAnchorTransform * joint.anchorFromJointTransform entity.transform.translation = SIMD3<Float>(jointTransform.columns.3.x, jointTransform.columns.3.y, jointTransform.columns.3.z) } } I’ve attached both a profiler trace and a video recording from Vision Pro that clearly demonstrate the issue. Profiler: https://drive.google.com/file/d/1fDWyGj_fgxud2ngkGH_IVmuH_kO-z0XZ Vision Pro Recordings: https://drive.google.com/file/d/17qo3U9ivwYBsbaSm26fjaOokkJApbkz- https://drive.google.com/file/d/1LxTxgudMvWDhOqKVuhc3QaHfY_1x8iA0 Has anyone else experienced this behavior? My thought is that there might be some background calculations happening at the OS level causing this latency. Any guidance would be greatly appreciated. Thanks!

Spatial Computing General AR / VR RealityKit Reality Composer Pro visionOS

455

Sep ’25

Logitech Muse very buggy in Immersive

The samples shown in volumetric work great but moving to an immersive experience the pen physical buttons don't work when you're focusing to an entity with a collision.

Spatial Computing ARKit visionOS

971

Dec ’25

Using AVAsynchronousKeyValueLoading.load() on an AVAssetTrack gives an error

I'm seeing this error while attempting to compile my VisionOS app under Xcode 26. My existing code looks like: let (naturalSize, formatDescriptions, mediaCharacteristics) = try? await videoTrack.load(.naturalSize, .formatDescriptions, .mediaCharacteristics) This is now giving a compiler error: Type of expression is ambiguous without a type annotation I don't see that anything that was changed or deprecated in the latest version. Also loading the properties individually seems to work fine i.e.: let naturalSize = try? await videoTrack.load(.naturalSize) let formatDescriptions = try? await videoTrack.load(.formatDescriptions) let mediaCharacteristics = try? await videoTrack.load(.mediaCharacteristics)

Spatial Computing General Xcode AVFoundation

126

Jun ’25

ARKit Eye Tracking Calibration Issues - Word-Level Reading Tracking Feasibility

Hi Apple Developer Community, I'm developing an eye-tracking application using ARKit's ARFaceTrackingConfiguration and ARFaceAnchor.blendShapes for gaze detection using Xcode. I'm experiencing several calibration and accuracy issues and would appreciate insights from the community. Current Implementation Using ARFaceAnchor.blendShapes (.eyeLookUpLeft, .eyeLookDownLeft, .eyeLookInLeft, .eyeLookOutLeft, etc.) Implementing custom sensitivity curves and smoothing algorithms Applying baseline correction and coordinate mapping Using quadratic regression for calibration point mapping Issues I'm Facing 1. Calibration Mismatch Red dot position doesn't align with where I'm actually looking Significant offset between intended gaze point and actual cursor position Calibration seems to drift or become inaccurate over time 2. Extreme Eye Movement Requirements Need to make exaggerated eye movements to reach screen edges/corners Natural eye movements don't translate to proportional cursor movement Difficulty reaching certain screen regions even with calibration 3. Sensitivity and Stability Issues Cursor jitters or jumps around when looking at center Too much sensitivity to micro-movements Inconsistent behavior between calibration and normal operation 4. I also noticed that tracking on calibration screen as well as tracking on reading screen works better as expected when head movement is there, but I do not want much head movement. I want tracking with normal eye movement while reading an Ebook. Primary Question: Word-Level Eye Tracking Feasibility Is word-level eye tracking (tracking gaze as users read through individual words in an ebook) technically feasible with current iPhone/iPad hardware? I understand that Apple's built-in eye tracking is primarily an accessibility feature for UI navigation. However, I'm wondering if the TrueDepth camera and ARKit's eye tracking capabilities are sufficient for: Tracking natural reading patterns (left-to-right, line-by-line progression) Detecting which specific words a user is looking at Maintaining accuracy for sustained reading sessions (15-30 minutes) Working reliably across different users and lighting conditions Questions for the Community Hardware Limitations: Are iPhone/iPad TrueDepth cameras capable of the precision needed for word-level tracking, or is this beyond current hardware capabilities? Calibration Best Practices: What calibration strategies have worked best for accurate gaze mapping? How many calibration points are typically needed? Reading-Specific Challenges: Are there particular challenges when tracking reading behavior vs. general gaze tracking? Alternative Approaches: Are there better approaches than ARKit blend shapes for this use case? Current Setup Devices: iPhone 14 Pro iOS Version: iOS 18.3 ARKit Version: Latest available Any insights, experiences, or technical guidance would be greatly appreciated. I'm particularly interested in hearing from developers who have worked on similar eye tracking applications or have experience with the limitations and capabilities of ARKit's eye tracking features. Thank you for your time and expertise!

Spatial Computing General Swift Packages ARKit

744

Oct ’25

Setting immerstionStyle while in immersive space breaks all entities.

I have my immersive space set up like: ImmersiveSpace(id: "Theater") { ImmersiveTeleopView() .environment(appModel) .onAppear() { appModel.immersiveSpaceState = .open } .onDisappear { appModel.immersiveSpaceState = .closed } } .immersionStyle(selection: .constant(appModel.immersionStyle.style), in: .mixed, .full) Which allows me to set the immersive style while in the space (from a Picker on a SwiftUI window). The scene responds correctly but a lot of the functionality of my immersive space is gone after the change in style; in that I am no longer able to enable/disable entities (which I also have a toggles for in the SwiftUI window). I have to exit and reenter the immersive space to regain the ability to change the enabled state of my entities. My appModel.immersionStyle is inspired by the Compositor-Services demo (although I am using a RealityView) listed in https://developer.apple.com/documentation/CompositorServices/interacting-with-virtual-content-blended-with-passthrough and looks like this: public enum IStyle: String, CaseIterable, Identifiable { case mixedStyle, fullStyle public var id: Self { self } var style: ImmersionStyle { switch self { case .mixedStyle: return .mixed case .fullStyle: return .full } } } /// Maintains app-wide state @MainActor @Observable class AppModel { // Immersion Style public var immersionStyle: IStyle = .mixedStyle

Spatial Computing General visionOS

245

Oct ’25

Access a DepthMap while RoomCaptureSession

I'm capturing a room via RoomPlan API and would like to access the DepthMap(sceneDepth) or SmoothDepthMap(smoothedSceneDepth) from my own provided ARSession for RoomCaptureSession. But both depth maps are empty when handling the delegates. I have not found a solution yet. So is it even possible? Because i have not found any documentation of what RoomCaptureSession overwrites in the ARSession if I provide my own ARSession instance. Here is a example code snippet of what i'm trying to do: private let arSession = ARSession() private lazy var roomPlanCaptureSession = RoomCaptureSession(arSession: arSession) let arConfig = ARWorldTrackingConfiguration() //Create semantics for ARconfig which is used for ARSession var semantics: ARWorldTrackingConfiguration.FrameSemantics = [] if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) { semantics.insert(.sceneDepth) } if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) { semantics.insert(.smoothedSceneDepth) } arConfig.frameSemantics = semantics //set delegates roomPlanCaptureSession.delegate = self arSession.delegate = self //Check if device support for depthMap if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth){ arSession.run(arConfig) } else{ print(".sceneDepth is unsupported.") } //run roomcapture scan config let captureConfig = RoomCaptureSession.Configuration() roomPlanCaptureSession.run(configuration: captureConfig) //trying to get sceneDepth public func session(_ session: ARSession, didUpdate frame: ARFrame) { print("session delegate capture: sceneDepth: \(String(describing: frame.sceneDepth))") //prints: session delegate capture: sceneDepth: nil also in this video from 2023 it is say that i can pass custom ARSession to my RoomPlan. Explore enhancements to RoomPlan - Video Quote 3:00: Here is the init and stop function in previous RoomPlan. And here is how you pass over a custom ARSession to init function. Any custom ARSession with ARWorldTrackingConfiguration will be honored inside RoomCaptureSession. anyway I welcome any input. maybe im doing something wrong. :)

Spatial Computing ARKit ARKit RoomPlan

494

Dec ’25

VisionOS 26 - threadsPerThreadgroup limit causing crash on device (but not in simulator)

Hi all, I'm running into an issue with an app that previously worked fine on device using visionOS 2.0. After updating to visionOS 26, the same code runs fine in the simulator but crashes on the device with the following error: -[MTLDebugComputeCommandEncoder _validateThreadsPerThreadgroup:]:1330: failed assertion `(threadsPerThreadgroup.width(32) * threadsPerThreadgroup.height(32) * threadsPerThreadgroup.depth(1))(1024) must be <= 832. (kernel threadgroup size limit)` Is there any documented way to check or increase the allowed threadsPerThreadgroup size on Apple Vision Pro? Or any recommended workaround for this regression? Thanks in advance!

Spatial Computing ARKit Metal MetalKit Metal Performance Shaders visionOS

225

Jun ’25

ARFrame.sceneDepth not correctly registered with ARFrame.capturedImage for iPad Pro (6th Gen) for high resolution capture.

Hi team, I believe I’ve found a registration issue between ARFrame.sceneDepth and ARFrame.capturedImage when using high-resolution frame capture on a 2022 iPad Pro (6th gen). When enabling high-resolution capture: if let highResFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing { config.videoFormat = highResFormat } … arView.session.captureHighResolutionFrame { ... } the depth map provided by ARFrame.sceneDepth no longer aligns correctly with the corresponding high-resolution capturedImage. This misalignment results in consistently over-estimated distance measurements in my app (which relies on mapping depth to 2D pixel coordinates). iPad Pro (6th gen): misalignment occurs only when capturing high-resolution frames. iPhone 16 Pro: depth is correctly registered for both standard and high-resolution captures. It appears the camera intrinsics, specifically the FOV, change between the “regular” resolution stream and the high-resolution capture on the iPad. My suspicion is that the depth data continues using the intrinsics of the lower resolution stream, resulting in an unregistered depth-to-RGB mapping. Once I have the iPad in hand again, I will confirm whether camera.intrinsics or FOV differ between the low-res and high-res frames. Is this a known issue with high-resolution frame capture on the 2022 iPad Pro? If not, I’m happy to provide some more thorough sample code. Thanks for your time!

Spatial Computing ARKit ARKit RealityKit

260

Nov ’25

ARKit with 422 pixel format and Apple Log colorspace

Hi, I’m trying to configure camera feed in ARKit to be in Apple Log color space. I can change Capture Device’s format to one that has Apple Log and I see one frame being in proper log-gray colors but then all AR tracking stops and tracking state hangs at “initializing”. In other combinations I see error “sensor failed to initialize” and session restarts with default format. I suspect that this is because normal AR capture formats are 420f, whereas ones that have Apple Log are 422. Could someone confirm if it’s even possible to run ARKit session with camera feed in a different pixel format? I’m trying it on iphone 15 pro

Spatial Computing ARKit ARKit Camera AVFoundation

219

Sep ’25

how to transition between spatial3d to spatial3DImmersive?

Hi, When viewing a spatial photo scene on the Apple Vision Pro Photos app, you can tap on the immersive icon on the top right corner to transaction from the window presenting the image as spatial3d to an immersive photo scene with spatial3DImmersive where the window borders disappear. Could someone explain how to achieve that? I tried to do it but once I transition from spatial3d to spatial3DImmersive I can see still see a rectangle around the spatial image. Thanks.

Spatial Computing General visionOS

887

Nov ’25

Immersive environment learning material

I really love the immersive environments, but I don’t have experience with creating them. Do you have resources or tutorials you can recommend for creating these from scratch? I’ve seen the sample projects and videos, but they usually start in the middle, assuming you already have the assets created.

Spatial Computing General

Jul ’25

RealityRenderer's Perspective Camera's FOV

Hi, I have been using RealityRenderer to render scenes in MacOS as spatial videos and view it in Vision Pro and it is awesome. I understand that it uses PerspectiveCamera to render. I wanted to know what is the default FOV for this camera and how much can we push it? I want to ideally render a scene with 180 degrees of fov. Thanks

Spatial Computing ARKit RealityKit

129

May ’25

Cannot reassign worldTracking / planeDetection providers in my PlacementManager when switching environments

Environment Xcode: 16.2 VisionOS SDK 2.4 Swift 6.1 Targets: Apple Vision Pro (immersive space) Frameworks: ARKit, RealityKit, SwiftUI What I’m Trying to Do I have a view-model class PlacementManager that holds two AR providers: private var worldTracking: WorldTrackingProvider private var planeDetection: PlaneDetectionProvider I want to dynamically replace these providers in a setEnvironment(_:) method (so I can save/clear a JSON scene and restart ARKit). What’s Happening If I declare them as : private let worldTracking = WorldTrackingProvider() private let planeDetection = PlaneDetectionProvider() I get compile-errors when I later do: self.worldTracking = newWorldTracking // Cannot assign to property: 'worldTracking' is a 'let' constant If I change them to un-initialized vars: private var worldTracking: WorldTrackingProvider private var planeDetection: PlaneDetectionProvider then in my init() I get: self used in property access 'worldTracking' before all stored properties are initialized Code snipet @Observable final class PlacementManager : ObservableObject { private var worldTracking: WorldTrackingProvider private var planeDetection: PlaneDetectionProvider // … other props … @MainActor init() { // error: self.worldTracking used before init… planeAnchorHandler = PlaneAnchorHandler(rootEntity: root) persistenceManager = PersistenceManager( worldTracking: worldTracking, rootEntity: root ) // … } @MainActor func setEnvironment(env: Environnement) async { let newWorldTracking = WorldTrackingProvider() let newPlaneDetection = PlaneDetectionProvider() try await appState!.arkitSession.run( [ newWorldTracking, newPlaneDetection ] ) self.worldTracking = newWorldTracking self.planeDetection = newPlaneDetection // … } } What I’ve Tried Giving them default values at declaration (= WorldTrackingProvider()) Initializing them at the top of init() before any use Passing the new providers into arkitSession.run(...) My Question What is the recommended Swift-style pattern to declare and reassign these ARKit provider properties so that: They’re fully initialized before use in init(), and I can swap them out later in setEnvironment(...) without compiler errors? Any pointers (or links to forum threads / docs) would be greatly appreciated!

Spatial Computing General ARKit SwiftUI RealityKit visionOS

163

May ’25

ARKit / visionOS - handtracking with 3D objects attached on hand

I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?

Spatial Computing ARKit ARKit USDZ visionOS

294

Nov ’25

some meet with apple videos not appearing on developer app, but in YouTube

Hi, Few days ago, Apple had another "Meet with Apple" event regarding immersive video : Create immersive media experiences for visionOS | Meet with Apple days one and 2 and it is not appearing on my Developer app in Mac. The same happened to IETF HLS Interest Day | Meet with Apple. Any tip? Kind regards , Bruno

Spatial Computing General

303

Nov ’25

Shared/GroupImmersive Space – Query Local Device Transform

Hi, I am in the process of implementing SharePlay into our app. The shared experience opens an Immersive Space and we set systemCoordinator.configuration.supportsGroupImmersiveSpace = true Now visionOS establishes a shared coordinate space for the immersive space. From the docs: To achieve consistent positioning of RealityKit entities across multiple devices in an immersive space during a SharePlay session There are cases where we want to position content in front of the user (independent of the shared session, and for each user individually). Normally to do that we use the transform retrieved via worldTrackingProvider.queryDeviceAnchor.originFromAnchorTransform to position content in front of the user (plus some Z Offset and smooth interpolation). This works fine in non-SharePlay instances and the device transform is where I would expect it to be but during the FaceTime call deviceAnchor.originFromAnchorTransform seems to use the shared origin of the immersive space and then I end up with a transform that might be offset. Here is a video of the issue in action: https://streamable.com/205r2p The blue rect is place using AnchorEntity(.head, trackingMode: .continuous). This works regardless of the call and the entity is always placed based on the head position. The green rect is adjusted on every frame using the transform I get from worldTrackingProvider.queryDeviceAnchor. As you can see it's offset. Is there any way I can query query this transform locally for the user during a FaceTime call? Also I would like to know if it's possible to disable this automatic entity transform syncing behavior? Setting entity.synchronization = nil results in the entity not showing up at all. https://developer.apple.com/documentation/realitykit/synchronizationcomponent Is SynchronizationComponent only relevant for the legacy MultiPeerConnectivity approach? Thank you!

Spatial Computing General ARKit RealityKit Group Activities

397

Oct ’25

For a third year, no screenshot capability for immersive visionOS apps... here's a workaround?

Since only the user can take a screenshot using the Apple Vision Pro's top buttons, the only workaround available to an immersive app that needs a screenshot to document the user's creative interior design choices is ask the user to take a screenshot wait until the user taps a button indicating the screenshot has been taken then the app asks the user to select the screenshot when the app opens the PhotoPicker when the user presses Done, the screenshot is handed off to the app. One wonders why there is no Apple Api for doing this in a simple privacy protective way such as: When called, the Apple api captures the screenshot in Apple secured memory The api displays the screenshot to the user with appropriate privacy warnings and asks if the user wants to a. share this screenshot with the app, or b. cancel, c. retake the screenshot If the user approves, the app receives the screenshot

Spatial Computing General SwiftUI visionOS

133

Jun ’25

缺失AR_WORLD_MAP

SharePlay objects are not placed in the same place in the same space. I hope they can be placed in the same place. （Vision Pro）

Spatial Computing ARKit

110

Apr ’25

RoomCaptureView runs on the latest system with a serious bug causing errors

I am using RoomCaptureView for house scanning and modeling On the latest iOS 26, the following issues were encountered The program runs well on iOS 26 and below, but on iOS 26 and above, the probability of scene localization failure becomes abnormally high, and accurate indoor localization cannot be obtained. Additionally, the probability of using RoomBuilder to merge models is also high After compiling the program using xcode 26 or above, a necessary bug appeared when running it on iOS 26, RoomCaptureView is completely unable to run The error message is {RoomCaptureSession. Capture Error's' Internal error '} And the camera interface of RoomCaptureView has turned into a splash screen Another Debug error occurred:{ -[MTLDebugRenderCommandEncoder validateCommonDrawErrors:]:5970: failed assertion `Draw Errors Validation Fragment Function(realitykit::fsSurfaceShadow): incorrect type of texture (MTLTextureType2D) bound at Texture binding at index 14 (expect MTLTextureType1D) for tonemapLUT[0]. } When using programs compiled under xcode 26 and running on iOS 26, this issue will not occur

Spatial Computing General RoomPlan

230

how make an APN message

I like to compose an APN message. (using FCM) what shall I do for it?

Spatial Computing ARKit

225

Jul ’25

Hand Tracking Latency When UITextView Becomes Active in Vision Pro Immersive Space

Spatial Computing General AR / VR RealityKit Reality Composer Pro visionOS

Replies: 0
Boosts: 0
Views: 455
Activity: Sep ’25

Logitech Muse very buggy in Immersive

The samples shown in volumetric work great but moving to an immersive experience the pen physical buttons don't work when you're focusing to an entity with a collision.

Spatial Computing ARKit visionOS