Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.

Posts under Spatial Computing topic
ARKit Eye Tracking Calibration Issues - Word-Level Reading Tracking Feasibility
Hi Apple Developer Community,

I'm developing an eye-tracking application in Xcode using ARKit's ARFaceTrackingConfiguration and ARFaceAnchor.blendShapes for gaze detection. I'm experiencing several calibration and accuracy issues and would appreciate insights from the community.

Current Implementation
- Using ARFaceAnchor.blendShapes (.eyeLookUpLeft, .eyeLookDownLeft, .eyeLookInLeft, .eyeLookOutLeft, etc.)
- Implementing custom sensitivity curves and smoothing algorithms
- Applying baseline correction and coordinate mapping
- Using quadratic regression for calibration point mapping

Issues I'm Facing
1. Calibration mismatch: the red dot doesn't align with where I'm actually looking, there is a significant offset between the intended gaze point and the actual cursor position, and the calibration seems to drift or become inaccurate over time.
2. Extreme eye movement requirements: I need to make exaggerated eye movements to reach screen edges and corners, natural eye movements don't translate to proportional cursor movement, and certain screen regions are hard to reach even after calibration.
3. Sensitivity and stability issues: the cursor jitters or jumps around when looking at the center, there is too much sensitivity to micro-movements, and behavior is inconsistent between calibration and normal operation.
4. Tracking on both the calibration screen and the reading screen works noticeably better when there is head movement, but I don't want to require much head movement. I want tracking to work with normal eye movement while reading an ebook.

Primary Question: Word-Level Eye Tracking Feasibility
Is word-level eye tracking (tracking gaze as users read through individual words in an ebook) technically feasible with current iPhone/iPad hardware? I understand that Apple's built-in eye tracking is primarily an accessibility feature for UI navigation. However, I'm wondering whether the TrueDepth camera and ARKit's eye tracking capabilities are sufficient for:
- Tracking natural reading patterns (left-to-right, line-by-line progression)
- Detecting which specific word a user is looking at
- Maintaining accuracy for sustained reading sessions (15-30 minutes)
- Working reliably across different users and lighting conditions

Questions for the Community
- Hardware limitations: are iPhone/iPad TrueDepth cameras capable of the precision needed for word-level tracking, or is this beyond current hardware capabilities?
- Calibration best practices: what calibration strategies have worked best for accurate gaze mapping? How many calibration points are typically needed?
- Reading-specific challenges: are there particular challenges when tracking reading behavior vs. general gaze tracking?
- Alternative approaches: are there better approaches than ARKit blend shapes for this use case?

Current Setup
- Device: iPhone 14 Pro
- iOS version: iOS 18.3
- ARKit version: latest available

Any insights, experiences, or technical guidance would be greatly appreciated. I'm particularly interested in hearing from developers who have worked on similar eye-tracking applications or have experience with the limitations and capabilities of ARKit's eye tracking features. Thank you for your time and expertise!
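For reference, here is a heavily simplified sketch of the direction I'm taking with the blend-shape values: combine the per-eye look shapes into a gaze offset and smooth it with an exponential moving average. The gains and smoothing factor are illustrative placeholders, not tuned constants, and ARFaceAnchor.lookAtPoint or the per-eye transforms may well be a better starting point than raw blend shapes.

import ARKit

final class GazeSmoother {
    private var smoothed = SIMD2<Float>(0, 0)
    private let alpha: Float = 0.15   // lower = smoother but laggier

    /// Returns a smoothed, roughly normalized gaze offset; positive x is toward the user's right.
    func update(with anchor: ARFaceAnchor) -> SIMD2<Float> {
        func shape(_ key: ARFaceAnchor.BlendShapeLocation) -> Float {
            anchor.blendShapes[key]?.floatValue ?? 0
        }
        // Looking right raises eyeLookInLeft and eyeLookOutRight; looking left raises the opposite pair.
        let x = (shape(.eyeLookInLeft) + shape(.eyeLookOutRight))
              - (shape(.eyeLookOutLeft) + shape(.eyeLookInRight))
        // Vertical: up minus down, combined across both eyes.
        let y = (shape(.eyeLookUpLeft) + shape(.eyeLookUpRight))
              - (shape(.eyeLookDownLeft) + shape(.eyeLookDownRight))

        // Exponential moving average to suppress micro-saccade jitter.
        smoothed += alpha * (SIMD2<Float>(x, y) * 0.5 - smoothed)
        return smoothed
    }
}

The calibration step then maps this smoothed offset to screen coordinates (in my case with per-axis quadratic regression), which is where most of the drift and edge-reach problems seem to come from.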
Replies: 0 · Boosts: 0 · Views: 719 · Activity: Oct ’25
ManipulationComponent in both parent and child entities
Hello,

In my project, I have attached a ManipulationComponent to Entity A and, as expected, I'm able to interact with it using the built-in gestures. I have another Entity B, a child of A, that I would like to interact with as well, so I attempted to add a ManipulationComponent to B. However, no gestures seem to be registered on B; I can still interact with A, but B cannot be interacted with despite having ManipulationComponents on both entities. So I'm wondering if I'm just doing something wrong, if this is an issue with ManipulationComponent, or if this is a limitation of the API.

Below is the code used to add the ManipulationComponent to an entity; it was run on both A and B:

let mc = ManipulationComponent()
model.components.set(mc)

var boxShape = ShapeResource.generateBox(width: 0.25, height: 0.05, depth: 0.25)
boxShape = boxShape.offsetBy(translation: simd_float3(0, -0.05, -0.25))
ManipulationComponent.configureEntity(model, collisionShapes: [boxShape])

if var mc = model.components[ManipulationComponent.self] {
    mc.releaseBehavior = .stay
    mc.dynamics.inertia = .low
    model.components.set(mc)
}

I am using visionOS 26.0; let me know if there's any additional information needed.
Replies: 1 · Boosts: 0 · Views: 362 · Activity: Oct ’25
Setting clip shape of a RealityView
I am following this example to create a stereoscopic image: https://developer.apple.com/documentation/visionos/creating-stereoscopic-image-in-visionos

I would also like to add a corner radius to the stereoscopic RealityView. With ordinary SwiftUI views, we typically just use .clipShape(RoundedRectangle(cornerRadius: 32)):

struct StereoImage: View {
    var body: some View {
        let spacing: CGFloat = 10.0
        let padding: CGFloat = 40.0
        VStack(spacing: spacing) {
            Text("Stereoscopic Image Example")
                .font(.largeTitle)
            RealityView { content in
                let creator = StereoImageCreator()
                guard let entity = await creator.createImageEntity() else {
                    print("Failed to create the stereoscopic image entity.")
                    return
                }
                content.add(entity)
            }
            .frame(depth: .zero)
        }
        .padding(padding)
        .clipShape(RoundedRectangle(cornerRadius: 32)) // <= HERE!
    }
}

This doesn't actually clip the RealityView shown in the sample above. I am guessing this is because the box in the RealityView has a non-zero z scale, which means it isn't on the same "layer" as its SwiftUI containers, and thus isn't clipped by the modifiers applied to the containers.

How can I properly apply a clip shape to RealityViews like this? Thanks!
Replies: 3 · Boosts: 0 · Views: 486 · Activity: Feb ’25
Setting immersionStyle while in an immersive space breaks all entities
I have my immersive space set up like this:

ImmersiveSpace(id: "Theater") {
    ImmersiveTeleopView()
        .environment(appModel)
        .onAppear() {
            appModel.immersiveSpaceState = .open
        }
        .onDisappear {
            appModel.immersiveSpaceState = .closed
        }
}
.immersionStyle(selection: .constant(appModel.immersionStyle.style), in: .mixed, .full)

This allows me to set the immersion style while in the space (from a Picker on a SwiftUI window). The scene responds correctly, but a lot of the functionality of my immersive space is gone after the change in style: I am no longer able to enable/disable entities (which I also have toggles for in the SwiftUI window). I have to exit and reenter the immersive space to regain the ability to change the enabled state of my entities.

My appModel.immersionStyle is inspired by the Compositor Services demo (although I am using a RealityView) listed in https://developer.apple.com/documentation/CompositorServices/interacting-with-virtual-content-blended-with-passthrough and looks like this:

public enum IStyle: String, CaseIterable, Identifiable {
    case mixedStyle, fullStyle
    public var id: Self { self }
    var style: ImmersionStyle {
        switch self {
        case .mixedStyle:
            return .mixed
        case .fullStyle:
            return .full
        }
    }
}

/// Maintains app-wide state
@MainActor
@Observable
class AppModel {
    // Immersion Style
    public var immersionStyle: IStyle = .mixedStyle
}
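For completeness, here is a minimal sketch of the picker side that drives the selection, assuming AppModel is injected with .environment(appModel); the view name is a placeholder:

import SwiftUI

struct ImmersionStylePicker: View {
    @Environment(AppModel.self) private var appModel

    var body: some View {
        // @Bindable is needed to derive a binding from an @Observable model.
        @Bindable var appModel = appModel
        Picker("Immersion Style", selection: $appModel.immersionStyle) {
            ForEach(IStyle.allCases) { style in
                Text(style.rawValue).tag(style)
            }
        }
        .pickerStyle(.segmented)
    }
}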
Replies: 1 · Boosts: 0 · Views: 226 · Activity: Oct ’25
Unity PolySpatial – Live handheld camera feed of graspable objects not rendering on Vision Pro
I am developing a Unity application for the Apple Vision Pro using PolySpatial and RealityKit integration. The goal is to create a graspable object (for example, a handheld cube) that includes a secondary camera. When the user grabs and moves the object, the secondary camera should render its view to a RenderTexture, which is displayed on a quad attached to the object, simulating a live camera screen.

In the Unity Editor, this setup works correctly. The RenderTexture updates in real time, and the quad displays the camera's view as expected. However, when building and running the application on the Vision Pro, the quad only displays the clear background color of the secondary camera. No scene content appears. The graspable interaction itself works fine: the object can be grabbed and moved as intended.

Steps I have taken:
- Created a new layer (CameraFeed) and assigned the relevant objects to it.
- Set the secondary camera's culling mask to render only the CameraFeed layer.
- Assigned the RenderTexture as the camera's target texture.
- Applied the RenderTexture to an Unlit/Texture material on a quad.
- Confirmed the camera is active and correctly positioned relative to the object.

From my research, it appears that once objects are managed by RealityKit through PolySpatial (for example, made graspable), they are no longer rendered through Unity's normal camera pipeline. Only the main XR camera (managed by RealityKit) seems able to see these objects. Secondary Unity cameras cannot render RealityKit-synced content to a RenderTexture. If this is correct, it seems there is currently no way to implement a true live secondary camera feed showing graspable objects on Vision Pro using Unity PolySpatial.

My questions are:
- Is there any official way to enable multiple-camera rendering of RealityKit-managed objects through PolySpatial?
- Are there known workarounds to simulate a live camera feed that still allows objects to be grabbed?
- Has anyone found alternative design patterns or methods for this kind of interaction?

Environment: Unity 6.0, PolySpatial 2.2.4, Apple visionOS XR 2.2.4

Any insight or suggestions would be greatly appreciated. Thank you.
Replies: 0 · Boosts: 0 · Views: 112 · Activity: Apr ’25
A Question About Running Two ARViews Together
First, I scan the first room using the RoomPlan API. Because I need to scan a second room, I stop the capture with captureSession.stop(pauseARSession: false), so I believe the underlying ARSession keeps running at that point.

Second, before scanning the other room, I want to run another ARView in order to detect some objects in the first room that RoomPlan did not detect. However, at this point the second ARView (RoomPlan already contains an ARView internally, I think) always shows a black screen and does not work normally.

This is the problem I want to resolve. Please help me get the second ARView working.
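For illustration, a minimal sketch of the direction that seems most promising, assuming RoomPlan's custom-ARSession support in iOS 17+ (RoomCaptureSession(arSession:)): own the ARSession yourself, keep it running when RoomPlan stops, and reconfigure that same session for the detection phase instead of creating a second, competing session. The "MyObjects" resource group name is a placeholder, and wiring this session into whatever view renders the second phase is omitted.

import ARKit
import RoomPlan

// Create the ARSession yourself so both phases can share it.
let sharedSession = ARSession()
let captureSession = RoomCaptureSession(arSession: sharedSession)

// ... run the RoomPlan scan of the first room ...

// Stop RoomPlan but keep the underlying ARSession alive.
captureSession.stop(pauseARSession: false)

// Reconfigure the same session for object detection rather than starting a new one.
let config = ARWorldTrackingConfiguration()
config.detectionObjects = ARReferenceObject.referenceObjects(inGroupNamed: "MyObjects", bundle: nil) ?? []
sharedSession.run(config)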
Replies: 0 · Boosts: 0 · Views: 119 · Activity: Mar ’25
Metal Compositor Services & Personas (visionOS)
Hello, I'm currently trying to make a collaborative app, but it only works in a RealityView. When I try to use a CompositorLayer as below, the Personas disappear.

ImmersiveSpace(id: "ImmersiveSpace-Metal") {
    CompositorLayer(configuration: MetalLayerConfiguration()) { layerRenderer in
        SpatialRenderer_InitAndRun(layerRenderer)
    }
}

Is there any potential solution to see Personas in a Metal view? Thanks in advance!
Replies: 2 · Boosts: 0 · Views: 752 · Activity: Sep ’25
I am developing an immersive video app for visionOS, but I have an issue with the app and video player windows
In our visionOS app, we have two types of windows:
- Main app window: the default window that launches when the app starts. It displays the video listings and other primary content.
- Immersive space: opens only when a user starts streaming or playing a video.

Issue: when entering the immersive space, the main app window remains visible in front of it unless manually closed. To avoid this, I currently close the main window when transitioning to the immersive space and reopen it when exiting. However, this causes the app to restart instead of resuming from its previous state.

Desired behavior: I want the main app window to retain its state and seamlessly resume from where it was before entering immersive mode, rather than restarting.

Attempts and challenges:
- Tried managing opacity, visibility, and state preservation, but none worked as expected.
- Couldn't find a way to push the main window to the background while bringing the immersive space to the foreground.

I'm looking for a solution that keeps the main window's state intact while transitioning between immersive and normal modes.
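One pattern worth sketching, as a general SwiftUI approach rather than a visionOS-specific fix: keep the listing's state in an app-lifetime @Observable model instead of in the window's views, so that dismissing and reopening the window rebuilds the UI from the same state rather than from scratch. The type and property names below are placeholders.

import SwiftUI

@MainActor @Observable
final class VideoLibraryModel {
    var selectedVideoID: String?    // whatever "where the user was" means for the listing
    var scrollPositionID: String?
}

@main
struct VideoApp: App {
    @State private var library = VideoLibraryModel()   // lives for the whole app, not per window

    var body: some Scene {
        WindowGroup(id: "main") {
            VideoListView()            // placeholder listing view; reads/writes `library`
                .environment(library)
        }

        ImmersiveSpace(id: "player") {
            ImmersivePlayerView()      // placeholder player view
                .environment(library)
        }
    }
}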
Replies: 1 · Boosts: 0 · Views: 134 · Activity: Mar ’25
[WWDC25] For GuessTogether, can you initiate a FaceTime call via the custom SharePlay button?
Hello,

In the GuessTogether source code, it seems like the code assumes that you're already in a FaceTime call before pressing the custom SharePlay button (labeled "Play Guess Together"). If I'm not already on a FaceTime call, my Apple Vision Pro and the visionOS simulator both do nothing after throwing warnings. Is this intended behavior? If so, how do I make it so that pressing the button can also initiate a FaceTime call? Is this allowed? Thank you!
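If it helps frame the question: as far as I understand it, the GroupActivities way to kick things off without an existing call is to activate the activity directly, which should present the system UI for starting a FaceTime call or nearby session. A minimal sketch, where the activity type name is a placeholder for whatever GuessTogether defines:

import GroupActivities

func startGuessTogether() async {
    let activity = GuessTogetherActivity()   // placeholder for the sample's GroupActivity type

    switch await activity.prepareForActivation() {
    case .activationPreferred:
        // Presents the system sheet; can start a FaceTime call if none is active.
        _ = try? await activity.activate()
    case .activationDisabled, .cancelled:
        break
    @unknown default:
        break
    }
}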
Replies: 3 · Boosts: 0 · Views: 124 · Activity: Sep ’25
[VisionPro] Placing virtual entities around my arm
Hi everyone,

I'm developing an MR Vision Pro app where I'd like to anchor virtual objects (such as UI elements) around the user's arm. However, I've noticed that Vision Pro seems to mask out the area where the user's real arm is, hiding virtual content in that region so that you see your real arm. Is there a way to render virtual elements on the user's arm, so that it looks like the object is placed directly on the arm despite the real-world passthrough? I was hoping there might be a way to adjust the depth or behavior of this masked-out region. Any insights or workarounds would be greatly appreciated! Thanks :)
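For reference, the closest lever I'm aware of is hiding the system's upper-limb cutout entirely, which lets virtual content draw over the arm region at the cost of not seeing the real hands and arms through it. A minimal sketch (the content view name is a placeholder):

// In the App's Scene body:
ImmersiveSpace(id: "ArmUI") {
    ArmAnchoredContentView()       // placeholder for the RealityView anchoring content to the arm
}
.upperLimbVisibility(.hidden)      // stop passthrough hands/arms from punching through virtual content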
Replies: 1 · Boosts: 0 · Views: 97 · Activity: Mar ’25
Header Blur Effect on visionOS SwiftUI
Hi, I'm looking to build something similar to the header blur in the App Store and Apple TV app settings. Does anyone know the best way to achieve this, so that when there is nothing behind the header it looks the same as the rest of the view background, but when content goes underneath it gets a blur effect? I've seen .scrollEdgeEffect on iOS 26; is there something similar for visionOS? Thanks!
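In case it's useful as a baseline, here's a rough approximation rather than the exact App Store treatment: pin the header over the ScrollView and give it a material background, so it blends with the window when nothing is behind it and blurs content that scrolls beneath it.

import SwiftUI

struct BlurHeaderExample: View {
    var body: some View {
        ScrollView {
            LazyVStack(alignment: .leading, spacing: 12) {
                ForEach(0..<40, id: \.self) { index in
                    Text("Row \(index)")
                        .frame(maxWidth: .infinity, alignment: .leading)
                }
            }
            .padding(.top, 80)    // keep the first rows clear of the pinned header
            .padding(.horizontal)
        }
        .overlay(alignment: .top) {
            Text("Settings")
                .font(.title2)
                .frame(maxWidth: .infinity)
                .padding()
                .background(.ultraThinMaterial)   // blurs whatever scrolls underneath
        }
    }
}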
Replies: 0 · Boosts: 0 · Views: 124 · Activity: Sep ’25
visionOS: Unable to programmatically close child WindowGroup when parent window closes
Hi, I'm struggling with visionOS window management and need help with closing child windows programmatically.

App structure
My app has a main/sub window hierarchy:
- AWindow (home/main)
- BWindow (main feature window)
- CWindow (tool window, child of BWindow)

Navigation flow:
- AWindow → BWindow (switch, 1 window on screen)
- BWindow → CWindow (opens child, 2 windows on screen)

I want BWindow and CWindow to be separate movable windows (not a sheet or popover) so users can position them independently in space.

The problem
CWindow doesn't close when BWindow is closed by tapping the X button below the app (next to the window bar):
- User clicks X on BWindow → BWindow closes but CWindow remains.
- CWindow becomes orphaned on screen.
- I can close CWindow programmatically when switching BWindow back to AWindow.

App launch issue:
- After closing both windows, CWindow is remembered as the last window.
- Reopening the app shows only CWindow instead of BWindow.
- The user gets stuck in CWindow with no way back to BWindow.

I've tried the dismissWindow environment action in cleanup, but it's not working:

// In BWindow.swift
.onDisappear {
    if windowManager.isWindowOpen("cWindow") {
        dismissWindow(id: "cWindow")
    }
}

My app structure code now:

// MyNameApp.swift
@main
struct MyNameApp: App {
    var body: some Scene {
        WindowGroup(id: "aWindow") { AWindow() }
        WindowGroup(id: "bWindow") { BWindow() }
        WindowGroup(id: "cWindow") { CWindow() }
    }
}

// WindowStateManager.swift
class WindowStateManager: ObservableObject {
    static let shared = WindowStateManager()
    @Published private var openWindows: Set<String> = []
    @Published private var windowDependencies: [String: String] = [:]

    private init() {}

    func markWindowAsOpen(_ id: String) {
        markWindowAsOpen(id, parent: nil)
    }

    func markWindowAsClosed(_ id: String) {
        openWindows.remove(id)
        windowDependencies[id] = nil
    }

    func isWindowOpen(_ id: String) -> Bool {
        let isOpen = openWindows.contains(id)
        return isOpen
    }

    func markWindowAsOpen(_ id: String, parent: String? = nil) {
        openWindows.insert(id)
        if let parentId = parent {
            windowDependencies[id] = parentId
        }
    }

    func getParentWindow(of childId: String) -> String? {
        let parent = windowDependencies[childId]
        return parent
    }

    func getChildWindows(of parentId: String) -> [String] {
        let children = windowDependencies.compactMap { key, value in
            value == parentId ? key : nil
        }
        return children
    }

    func setNextWindowParent(_ parentId: String) {
        UserDefaults.standard.set(parentId, forKey: "nextWindowParent")
    }

    func getAndClearNextWindowParent() -> String? {
        let parent = UserDefaults.standard.string(forKey: "nextWindowParent")
        UserDefaults.standard.removeObject(forKey: "nextWindowParent")
        return parent
    }

    func forceCloseChildWindows(of parentId: String) {
        let children = getChildWindows(of: parentId)
        for child in children {
            markWindowAsClosed(child)
            NotificationCenter.default.post(
                name: Notification.Name("ForceCloseWindow"),
                object: nil,
                userInfo: ["windowId": child]
            )
            forceCloseChildWindows(of: child)
        }
    }

    func hasMainWindowOpen() -> Bool {
        let mainWindows = ["main", "bWindow"]
        return mainWindows.contains { isWindowOpen($0) }
    }

    func cleanupOrphanWindows() {
        for (child, parent) in windowDependencies {
            if isWindowOpen(child) && !isWindowOpen(parent) {
                NotificationCenter.default.post(
                    name: Notification.Name("ForceCloseWindow"),
                    object: nil,
                    userInfo: ["windowId": child]
                )
                markWindowAsClosed(child)
            }
        }
    }
}

// BWindow.swift
struct BWindow: View {
    @Environment(\.openWindow) private var openWindow
    @Environment(\.dismissWindow) private var dismissWindow
    @Environment(\.scenePhase) private var scenePhase
    @ObservedObject private var windowManager = WindowStateManager.shared

    var body: some View {
        VStack {
            Button("Open C Window") {
                windowManager.setNextWindowParent("bWindow")
                openWindow(id: "cWindow")
            }
        }
        .onAppear {
            windowManager.markWindowAsOpen("bWindow")
        }
        .onDisappear {
            windowManager.markWindowAsClosed("bWindow")
            windowManager.forceCloseChildWindows(of: "bWindow")
        }
        .onChange(of: scenePhase) { oldValue, newValue in
            if newValue == .background || newValue == .inactive {
                windowManager.forceCloseChildWindows(of: "bWindow")
            }
        }
    }
}

// CWindow.swift
import SwiftUI

struct CWindow: View {
    @Environment(\.dismissWindow) private var dismissWindow
    @Environment(\.scenePhase) private var scenePhase
    @ObservedObject private var windowManager = WindowStateManager.shared
    @State private var shouldClose = false

    var body: some View {
        VStack {
            // Content
        }
        .onDisappear {
            windowManager.markWindowAsClosed("cWindow")
            NotificationCenter.default.removeObserver(
                self,
                name: Notification.Name("ForceCloseWindow"),
                object: nil
            )
        }
        .onChange(of: scenePhase) { oldValue, newValue in
            if newValue == .background { }
        }
        .onAppear {
            let parent = windowManager.getAndClearNextWindowParent()
            windowManager.markWindowAsOpen("cWindow", parent: parent)
            NotificationCenter.default.addObserver(
                forName: Notification.Name("ForceCloseWindow"),
                object: nil,
                queue: .main) { notification in
                    if let windowId = notification.userInfo?["windowId"] as? String,
                       windowId == "cWindow" {
                        shouldClose = true
                    }
                }
        }
        .onChange(of: shouldClose) { _, newValue in
            if newValue {
                dismissWindow()
            }
        }
    }
}

The logs show everything executes correctly, but CWindow remains visible on screen.

Questions
- Why doesn't dismissWindow(id:) work in cleanup scenarios?
- Is there a proper way to create parent-child window relationships in visionOS?
- How can I ensure the main windows open on app launch instead of the tool window?
- What's the recommended pattern for dependent windows in visionOS?

Environment: Xcode 16.2, visionOS 2.0, SwiftUI
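On the launch/restoration part specifically, one thing worth sketching (hedged, since I haven't verified it against this exact setup): the scene-level restoration and launch-behavior modifiers introduced alongside visionOS 2 should keep the tool window from being restored or used as the launch window.

// In MyNameApp.swift: keep cWindow out of state restoration and out of default launch.
WindowGroup(id: "cWindow") {
    CWindow()
}
.restorationBehavior(.disabled)        // don't bring cWindow back on relaunch
.defaultLaunchBehavior(.suppressed)    // never use cWindow as the window shown at launch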
Replies: 2 · Boosts: 0 · Views: 351 · Activity: Aug ’25
DragGesture that pivots with the user in visionOS
Apple published a set of examples for using system gestures to interact with RealityKit entities. I've been using DragGesture a lot in my apps and noticed an issue when using it in an immersive space. When dragging an entity, if I turn my body to face another direction, the dragged entity does not stay relative to my hand. This can lead to situations where the entity is pulled very close to me, pushed far away, or even ends up behind me.

In those examples, there are two versions of how they use drag:
- handleFixedDrag: similar to what I'm doing now. It uses the value from value.gestureValue.translation3D as the basis for the drag.
- handlePivotDrag: this version aims to solve the problem I described above by using value.inputDevicePose3D as the basis of the gesture.

I've tried the example from handlePivotDrag, but it has one limitation. Using this version, I can move the entity around me as if it were on the inside of an arc or sphere. However, I can no longer move the entity further or closer. It stays within a similar (though not exact) distance relative to me while I drag.

Is there a way to combine these concepts? Ideally, I would like to use a gesture that behaves the same way that visionOS windows do. When we drag windows, we can move them around relative to ourselves, pull them closer, push them further, all while avoiding the issues described above.

Example from handleFixedDrag:

mutating private func handleFixedDrag(value: EntityTargetValue<DragGesture.Value>) {
    let state = EntityGestureState.shared
    guard let entity = state.targetedEntity else { fatalError("Gesture contained no entity") }

    if !state.isDragging {
        state.isDragging = true
        state.dragStartPosition = entity.scenePosition
    }

    let translation3D = value.convert(value.gestureValue.translation3D, from: .local, to: .scene)
    let offset = SIMD3<Float>(x: Float(translation3D.x),
                              y: Float(translation3D.y),
                              z: Float(translation3D.z))

    entity.scenePosition = state.dragStartPosition + offset

    if let initialOrientation = state.initialOrientation {
        state.targetedEntity?.setOrientation(initialOrientation, relativeTo: nil)
    }
}

Example from handlePivotDrag:

mutating private func handlePivotDrag(value: EntityTargetValue<DragGesture.Value>) {
    let state = EntityGestureState.shared
    guard let entity = state.targetedEntity else { fatalError("Gesture contained no entity") }

    // The transform that the pivot will be moved to.
    var targetPivotTransform = Transform()

    // Set the target pivot transform depending on the input source.
    if let inputDevicePose = value.inputDevicePose3D {
        // If there is an input device pose, use it for positioning and rotating the pivot.
        targetPivotTransform.scale = .one
        targetPivotTransform.translation = value.convert(inputDevicePose.position, from: .local, to: .scene)
        targetPivotTransform.rotation = value.convert(AffineTransform3D(rotation: inputDevicePose.rotation),
                                                      from: .local, to: .scene).rotation
    } else {
        // If there is not an input device pose, use the location of the drag for positioning the pivot.
        targetPivotTransform.translation = value.convert(value.location3D, from: .local, to: .scene)
    }

    if !state.isDragging {
        // If this drag just started, create the pivot entity.
        let pivotEntity = Entity()
        guard let parent = entity.parent else { fatalError("Non-root entity is missing a parent.") }

        // Add the pivot entity into the scene.
        parent.addChild(pivotEntity)

        // Move the pivot entity to the target transform.
        pivotEntity.move(to: targetPivotTransform, relativeTo: nil)

        // Add the targeted entity as a child of the pivot without changing the targeted entity's world transform.
        pivotEntity.addChild(entity, preservingWorldTransform: true)

        // Store the pivot entity.
        state.pivotEntity = pivotEntity

        // Indicate that a drag has started.
        state.isDragging = true
    } else {
        // If this drag is ongoing, move the pivot entity to the target transform.
        // The animation duration smooths the noise in the target transform across frames.
        state.pivotEntity?.move(to: targetPivotTransform, relativeTo: nil, duration: 0.2)
    }

    if preserveOrientationOnPivotDrag, let initialOrientation = state.initialOrientation {
        state.targetedEntity?.setOrientation(initialOrientation, relativeTo: nil)
    }
}
Replies: 1 · Boosts: 0 · Views: 518 · Activity: Feb ’25
Using Vision Pro to detect distance to real-life objects
Is it possible to detect the distance from the Vision Pro to real-life objects and people? I tried using scene.raycast to perform a raycast forward from the center of the viewport, but it doesn't seem to react to real-life objects, only entities. I see it mentioned here: https://developer.apple.com/forums/thread/776807?answerId=829576022#829576022 that a raycast combined with scene reconstruction should allow me to measure that distance, as long as the object is non-moving. How could I accomplish that?
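For what it's worth, here is a minimal sketch of the pattern that thread seems to point at, assuming a RealityView-based immersive space: run SceneReconstructionProvider, give each mesh anchor a static collision shape, and then raycast against those entities. Anchor updates/removals, authorization, and error handling are omitted.

import ARKit
import RealityKit

let session = ARKitSession()
let sceneReconstruction = SceneReconstructionProvider()
let meshRoot = Entity()   // add this to your RealityView content

func runSceneReconstruction() async throws {
    try await session.run([sceneReconstruction])

    for await update in sceneReconstruction.anchorUpdates where update.event == .added {
        let meshAnchor = update.anchor

        // Build a collision shape from the reconstructed mesh so raycasts can hit it.
        guard let shape = try? await ShapeResource.generateStaticMesh(from: meshAnchor) else { continue }

        let collider = Entity()
        collider.setTransformMatrix(meshAnchor.originFromAnchorTransform, relativeTo: nil)
        collider.components.set(CollisionComponent(shapes: [shape], isStatic: true))
        meshRoot.addChild(collider)
    }
}

// Later, a forward raycast from the device pose will hit the reconstructed world, e.g.:
// let hits = scene.raycast(origin: devicePosition, direction: deviceForward, length: 5, query: .nearest)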
Replies: 2 · Boosts: 0 · Views: 173 · Activity: Apr ’25