Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.

All subtopics
Posts under Spatial Computing topic

Post

Replies

Boosts

Views

Activity

Does Apple Spatial Audio Format documentation exist
The WWDC25 video and notes titled “Learn About Apple Immersive Video Technologies” introduced the Apple Spatial Audio Format (ASAF) and codec (APAC). However, despite references throughout on using immersive video, there is scant information on ASAF/APAC (including no code examples and no framework references), and I’ve found no documentation in Apple’s APIs/Frameworks about its implementation and use months on. I want to leverage ambisonic audio in my app. I don’t want to write a custom AU if APAC will be opened up to developers. If you read the notes below along with the iPhone 17 advertising (“Video is captured with Spatial Audio for immersive listening”), it sounds like this is very much a live feature in iOS26. Anyone know the state of play? I’m across how the PHASE engine works, which is unrelated to what I’m asking about here. Original quote from video referenced above: “ASAF enables truly externalized audio experiences by ensuring acoustic cues are used to render the audio. It’s composed of new metadata coupled with linear PCM, and a powerful new spatial renderer that’s built into Apple platforms. It produces high resolution Spatial Audio through numerous point sources and high resolution sound scenes, or higher order ambisonics.” ”ASAF is carried inside of broadcast Wave files with linear PCM signals and metadata. You typically use ASAF in production, and to stream ASAF audio, you will need to encode that audio as an mp4 APAC file.” ”APAC efficiently distributes ASAF, and APAC is required for any Apple immersive video experience. APAC playback is available on all Apple platforms except watchOS, and supports Channels, Objects, Higher Order Ambisonics, Dialogue, Binaural audio, interactive elements, as well as provisioning for extendable metadata.”
4
0
629
Sep ’25
Slow Auto Focus on iPhone 16 Pro with ARkit camera
I have recently started testing ARKit on an iPhone 16 Pro and I have noticed that the AutoFocus reaction on this device is much slower than other devices. For example, if I point the camera to a close object AutoFocus takes 4-5 seconds to stabilize, the focal length is adjusted very very slowly. In some cases (although this is rare) AutoFocus seems almost stuck and requires a bit of device movement to trigger. This is quite problematic when using some ARKit features like Image and Object detection as the detection algorithms struggle with out-of-focus images. This problem is limited to ARKit. AutoFocus is significantly more responsive when the standard AVFoundation Camera API is used. This behavior is easy to reproduce with any of the ARKit samples like https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/tracking_and_visualizing_planes Is anybody else experiencing this problem?
4
1
810
Jan ’25
Metal (Compositor Services) or RealityKit on visionOS
I am develop visionOS app. I am now very interested in Metal and Compositor Services, but I have not explored them in depth. I know that Metal has a higher degree of control freedom. I am wondering if using Compositor Services will have fewer functions than RealityKit in AR technology (such as scene reconstruction and understanding, hover effect, etc.).
4
0
203
Jun ’25
Issue with tracking multiple images using ARKit on VisionOS
We are using the ARKit image tracking feature on visionOS 2.0 with three pre-registered images. The image tracking works, but only one image is actively tracked at a time. When more than one target image is visible to the camera, it has difficulty detecting and tracking the other images. Is this the expected behavior in visionOS, or is there something we need to do to resolve this issue?
4
1
480
Mar ’25
PSVR2 controller button quirks
I have an open Feedback conversation with Apple on this topic, but I am curious if others have run into this, or want to try out my sample code in their set up. there are two API’s for reading controller buttons, axis, and D pads: GCPhysicalInputProfile and GCControllerLiveInput. There are inconsistencies in behaviour between the two of them. Apple recommends we use GCControllerLiveInput, however, there are some capabilities on these controllers that are only accessible through GCPhysicalInputProfile, as I’ll discuss below. PSVR2 R2/L2 buttons, a.k.a. triggers, have force input analogue values. These can only be accessed on GCPhysicalInputProfile PSVR2 thumbstick direction values are read through “axes” on GCPhysicalInputProfile, but only “dpads” on GCControllerLiveInput on both GCPhysicalInputProfile and GCControllerLiveInput, All pressed events of all buttons are fired properly using generic aliases ( Trigger, Grip ,Menu, Right Thumbstick, Left Thumbstick, Right Button A & B (Circle & Cross), Left Button A&B (Triangle and Square) ). Apple reserves the system button as the equivalent of a home button for the OS. on GCPhysicalInputProfile, touch events are fired when the button is also pressed, but not for only touches. on GCControllerLiveInput , Touch events only works for the following buttons: Left Thumbstick, Right Thumbstick, Right Button A (Circle), and Right Button B (Cross). But Right Button B touch event isn’t labelled correctly, it fires as the Right Button A event. I observed this inside ALVR which uses a polling based approach to event processing: https://github.com/alvr-org/alvr-visionos/blob/17b5968f9d894944b53e97134b39dfce0993302a/ALVRClient/WorldTracker.swift#L301 To simplify to see this on a very simple app, I used the Apple example TrackingAccessories application: https://developer.apple.com/documentation/ARKit/tracking-accessories-in-volumetric-windows I’ve attached the code that replaces the AccessoryTrackingModel class. I added code that prints out what is touched/pressed, see the trackAllConnectedSpatialControllers method: https://github.com/svrc/TrackingAccessories
4
0
571
1w
Access Main Camera not working in VisionOS 26.1
I downloaded the official sample project “Accessing the Main Camera”, but I found that it’s not able to retrieve the camera feed on visionOS 26.1. After checking the debug logs, it seems the issue is caused by the system being unable to find the expected format. I tested on a device running visionOS 2, and the camera feed worked correctly — but only when using the sample code from the visionOS 2 version, not the current one. I also noticed that some of the APIs have changed between versions. Has anyone managed to successfully access the camera feed on visionOS 26.1?
4
0
696
5d
Animations exported from Blender does not shoe in Reality Composer Pro
I made an animation in Blender using geometry nodes that I exported to USDC file (then I used Reality Converter to convert to USDZ) and I can see the animation when viewing from the finder but does not play after importing to RCP. Any idea how I can play the animation? Or can the animation be accessed through Xcode? Thanks!
4
0
1k
Apr ’25
What causes »ARSessionDelegate is retaining X ARFrames« console warning?
Hi, since iOS 15 I've repeatedly noticed the console warning »ARSessionDelegate is retaining X ARFrames. This can lead to future camera frames being dropped« even for rather simple projects using RealityKit and ARKit. Could someone from the ARKit team please elaborate what causes this warning and what can be done to avoid it? If I remember correctly I didn't even assign an ARSessionDelegate. Thank you!
4
1
3.3k
Feb ’25
Xcode 26 - extremely long time to open immersive space
The issue reproducible with empty project. When you run it and tap "Open immersive space" it takes a couple of minutes to respond. The issue only reproducible on real device with debugger attached. Reproducible other developers too (not specific to my environment). Issue doesn't exists in Xcode 16. Afer initial long delay subsequent opens works fine. Console logs: nw_socket_copy_info [C1:2] getsockopt TCP_INFO failed [102: Operation not supported on socket] nw_socket_copy_info getsockopt TCP_INFO failed [102: Operation not supported on socket] Failed to set dependencies on asset 9303749952624825765 because NetworkAssetManager does not have an asset entity for that id. void * _Nullable NSMapGet(NSMapTable * _Nonnull, const void * _Nullable): map table argument is NULL PSO compilation completed for driver shader copyFromBufferToTexture so=0 sbpr=256 sbpi=16384 ss=(64, 64, 1) p=70 sc=1 ds=0 dl=0 do=(0, 0, 0) in 1997 XPC connection interrupted <<<< FigAudioSession(AV) >>>> audioSessionAVAudioSession_CopyMXSessionProperty signalled err=-19224 (kFigAudioSessionError_UnsupportedOperation) (getMXSessionProperty unsupported) at FigAudioSession_AVAudioSession.m:606 Failed to load item AXCodeItem<0x14706f250> [Rank:6000] SpringBoardUIServices [AXBundle name:/System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle/SpringBoardUIServices] [Platforms and Targets:{ iOS = SpringBoardUIServices; } Framework] [Excluded: (null)]. error: Error Domain=AXLoading Code=0 "URL does not exist: file:///System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle" UserInfo={NSLocalizedDescription=URL does not exist: file:///System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle} Failed to load item AXCodeItem<0x14706f250> [Rank:6000] SpringBoardUIServices [AXBundle name:/System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle/SpringBoardUIServices] [Platforms and Targets:{ iOS = SpringBoardUIServices; } Framework] [Excluded: (null)]. error: Error Domain=AXLoading Code=0 "URL does not exist: file:///System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle" UserInfo={NSLocalizedDescription=URL does not exist: file:///System/Library/AccessibilityBundles/SpringBoardUIServices.axbundle} [b30780-MRUIFeedbackTypeButtonWithBackgroundTouchDown] Playback timed out before completion (after 3111 ms) Failed to set dependencies on asset 7089614247973236977 because NetworkAssetManager does not have an asset entity for that id.
4
2
379
Oct ’25
VisionOS ARKit CameraFrame Sample Parameters Extrinsics
the following documentation tells me that the CameraFrame.Sample.Parameters.extrinsics is of type simd_float4x4, great! https://developer.apple.com/documentation/arkit/cameraframe/sample/parameters/4443449-extrinsics I have read in the answer of another post that this extrinsics represents the pose of the physical camera relative to the device anchor. Did I understand correctly that the device anchor is where the scene is rendered from onto the user's display? What is the coordinate system in which this offset is defined, which axis is left, which one is up, which one is forward? The last column of the extrinsics seems to define a translation of approximately 2 cm along the x axis, -2cm along the y axis and -5 cm along the z axis. I tried to measure the physical distance between the main left and right cameras in order to find out if it's rather 2cm or 5 cm from the "middle", it looks more like 5, so I assume that the z axis is looking towards the right (from the user's perspective). Is that so? For x and y, I assume that the physical camera is approximately 2 cm to the front of the user and 2cm to the bottom, which of x and y is horizontal, which on vertical? How is the camera image indexed, is it row-major and is the origin on the top left? I am looking forward to learning about all the details on these extrinsics in order to make use of it.
4
0
819
Jan ’25
Blender to Reality Composer Pro 2.0 to SwiftUI + RealityKit visionOS Best Practices
Hi, I'm very new to 3D and am currently porting a SwiftUI iOS app to visionOS 2.0. I saw WWDC24 feature Blender in multiple spatial videos, and have begun integrating Blender models and animations into my VisionOS app (I would also like to integrate skeletons and programmatic rigging, more on that later). I'm wondering if there are “Best Practices” for this workflow - from Blender to USD to RCP 2.0 to visionOS 2 in Xcode. I’ve cobbled together the following that has some obvious holes: I’ve been able to find some pre-rigged and pre-animated models online that can serve as a great starting point. As a reference, here is a free model from SketchFab - a simple rigged skeleton with 6 built in animations: https://sketchfab.com/3d-models/skeleton-character-low-poly-8856e0138f424d68a8e0b40e185951f6 When exporting to USD from Blender, I haven’t been able to export more than one animation per USD file. Is there a workflow to export multiple animations in a single USDC file, or is this just not possible? As a temporary workaround, here is a python script I’ve been using to loop through all Blender animations, and export a model for each animation: import bpy import os # Set the directory where you want to save the USD files output_directory = “/path/to/export” # Ensure the directory exists if not os.path.exists(output_directory): os.makedirs(output_directory) # Function to export current scene as USD def export_scene_as_usd(output_path, start_frame, end_frame): bpy.context.scene.frame_start = start_frame bpy.context.scene.frame_end = end_frame # Export the scene as a USD file bpy.ops.wm.usd_export( filepath=output_path, export_animation=True ) # Save the current scene name original_scene = bpy.context.scene.name # Iterate through each action and export it as a USD file for action in bpy.data.actions: # Create a new scene for each action bpy.context.window.scene = bpy.data.scenes[original_scene].copy() new_scene = bpy.context.scene # Link the action to all relevant objects for obj in new_scene.objects: if obj.animation_data is not None: obj.animation_data.action = action # Determine the frame range for the action start_frame, end_frame = action.frame_range # Export the scene as a USD file output_path = os.path.join(output_directory, f"{action.name}.usdc") export_scene_as_usd(output_path, int(start_frame), int(end_frame)) # Delete the temporary scene to free memory bpy.data.scenes.remove(new_scene) print("Export completed.") I have also been able to successfully export rigging armatures as a single Skeleton - each “bone” showing getting imported into Reality Composer Pro 2.0 when exporting/importing manually. I would like to have all of these animations available in a single scene to be used in a RealityView in visionOS - so I have placed all animation models in a RCP scene and created named Timeline Action animations for each, showing the correct model and hiding the rest when triggering specific animations. I apply materials/textures to each so they appear the same, using Shader Graph. Then in SwiftUI I use notifications (as shown here - https://forums.developer.apple.com/forums/thread/756978) to trigger each RCP Timeline Action animation from code. Two questions: Is there a better way than to have multiple models of the same skeleton - each with a different animation - in a scene to be able to trigger multiple animations? Or would this require recreating Blender animations using skeleton rigging and keyframes from within RCP Timelines? If I want to programmatically create custom animations and move parts of the skeleton/armatures - do I need to do this by defining custom components in RCP, using IKRig and define movement of each of the “bones” in Xcode? I’m looking for any tips/tricks/workflow from experienced engineers or 3D artists that can create a more efficient/optimized workflow using Blender, USD, RCP 2 and visionOS 2 with SwiftUI. Thanks so much, I appreciate any help! I am very excited about all the new tools that keep evolving to make spatial apps really fun to build!
4
2
1.1k
Apr ’25
When using ARKit, why can’t you get the front-facing and back-facing camera feeds at once?
I’d like to use ARKit world tracking and display both the back camera feed and the front camera feeds, using the front feed as as a PIP. This would work great for an internet streaming use case. However, it’s impossible. As soon as ARKit is told to use one mode, the camera for the other side freezes/doesn’t work. This page also says you have to pick one camera to show: https://developer.apple.com/documentation/arkit/arkit_in_ios/choosing_which_camera_feed_to_augment?language=objc A question to the developers: why is this limitation in-place? Are there any work-arounds for the use case of ARKit world tracking + displaying the back camera feed + displaying the front camera feed as an overlay? It’s possible to do this with plain camera initialization without ARKit. (There’s an official example.) With ARKit, it no longer works. It’s strange that I cannot access the front feed via one of the other frameworks, but I guess that ARKit blocks that.
3
0
1.1k
Dec ’24
RealityView and Persistent World Data?
I was watching the Developer videos, and there was mention that RealityView handles persistent world data differently and also automatically for us. I am having an issue finding the material I need to get up to speed on that. In ARKit, I was able to place a model with the world data and recall that .map data. It even stored a reference image for the scene to help match the world data. I'm looking for the information on how to implement and work with those same features with RealityView, as it seems to be better/automatically integrated? I need help being pointed in the right direction. Sample code would be amazing.
3
0
554
Feb ’25
[WWDC25] For GuessTogether, can you initiate a FaceTime call via the custom SharePlay button?
Hello, For GuessTogether source code, it seems like the code assumes that you're already in a FaceTime call before pressing the custom SharePlay button (labeled "Play Guess Together"). If not already on a FaceTime call, my Apple Vision Pro and the visionOS simulator both do nothing after throwing warnings. Is this intended behavior? If so, how do I make it so that pressing the button can also initiate FaceTime calls? Is this allowed? Thank you!
3
0
111
Sep ’25
RotateGesture3D auto constrained to axis
Hi, On visionOS to manage entity rotation we can rely on RotateGesture3D. We can even with the constrainedToAxis parameter authorize only rotation on an x, y or z axis or even make combinations. What I want to know is if it is possible to constrain the rotation on axis automatically. Let me explain, the functionality that I would like to implement is to constrain the rotation on an axis only once the user has started his gesture. The initial gesture the user makes should let us know which axis they want to rotate on. This would be equivalent to activating a constraint automatically on one of the axes, as if we were defining the gesture on one of the axes. RotateGesture3D(constrainedToAxis: .x) RotateGesture3D(constrainedToAxis: .y) RotateGesture3D(constrainedToAxis: .z) Is it possible to do this? If so, what would be the best way to do it? A code example would be greatly appreciated. Regards Tof
3
0
411
Feb ’25