The simplest realityView (content, attachments in ...
causes Contextual closure expects 1 argument but 2 were used in closure body. I have checked every example and i cannot understand why i get this error regardless of any content. Note: i have added Attachment(id: "test") to the attachment closure and get Attachment not is scope.
imported both realityKit and SwiftUI.
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Posts under ARKit tag
140 Posts
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I am trying the simplest use of attachment in realityKit and get Contextual closure type @MainActor, @Sendable (inout RealityViewCameraContent) async -> void expects 1 argument, but 2 were used in closure body.
Also i get cannot find Attachment in scope
I have an app on the App Store for many years enabling users to post text into clouds in augmented reality. Yet last week abruptly upon installing the app on the iPhone the screen started going totally dark and a list of little comprehensible logs came up of the kind:
ARSCNCompositor <0x300ad0e00>: ARSCNCompositor (0, 0) initialization failed. Matting is not set up properly.
many times, then
RWorldTrackingTechnique <0x106235180>: Unable to update pose [PredictorFailure] for timestamp 870.392108
ARWorldTrackingTechnique <0x106235180>: Unable to predict pose [1] for timestamp 870.392108
again several times and then:
ARWorldTrackingTechnique <0x106235180>: SLAM error callback: Error Domain=Slam Error Code=7 "Non fatal error occurred due to significant drop in a IMU data" UserInfo={NSDescription=Non fatal error occurred due to significant drop in a IMU data, NSLocalizedFailureReason=SlamEngineNodeGroup Failure: IMU issue: gyro data stream verification failed [Significant data drop]. Failed on timestamp: 870.413247, Last known timestamp: 865.350198, Delta: 5.063049, System timestamp: 870.415781, Delta between system and frame: 0.002534. }
and then again the pose issues several times.
I hoped the new beta version would have solved the issue, but it was not the case. Unfortunately I do not know if that depends on the beta version or some other issue, given the app may be not installed on the Mac simulator.
Hi everyone,
I’ve been analyzing the current state of Sign Language accessibility tools, and I noticed a significant gap in learning tools: we lack real-time feedback for students (e.g., "Is my hand position correct?").
Most current solutions rely on 2D video processing, which struggles with depth perception and occlusion (hand-over-hand or hand-over-face gestures), which are critical in Sign Language grammar.
I'd like to propose/discuss an architecture leveraging the current LiDAR + Neural Engine capabilities found in iPhone devices to solve this.
The Concept: Skeleton-based Normalization
Instead of training ML models on raw video frames (which introduces noise from lighting, skin tone, and clothing), we could use ARKit's Body Tracking to abstract the input.
Capture: Use ARKit/LiDAR to track the user's upper body and hand joints in 3D space.
Data Normalization: Extract only the vector coordinates (X, Y, Z of joints). This creates a "clean" dataset, effectively normalizing the user regardless of physical appearance.
Comparison: Feed these vectors into a CoreML model trained on "Reference Skeletons" (recorded by native signers).
Feedback Loop: The app calculates the geometric distance between the user's pose and the reference pose to provide specific correction (e.g., "Raise your elbow 10 degrees").
Why this approach?
Solves Occlusion: LiDAR handles depth much better than standard RGB cameras when hands cross the body.
Privacy: We are processing coordinates, not video streams.
Efficiency: Comparing vector sequences is computationally cheaper than video analysis, preserving battery life.
Has anyone experimented with using ARKit Body Anchors specifically for comparing complex gesture sequences against a stored "correct" database? I believe this "Skeleton First" approach is the key to scalable Sign Language education apps.
Looking forward to hearing your thoughts.
I'm capturing a room via RoomPlan API and would like to access the DepthMap(sceneDepth) or SmoothDepthMap(smoothedSceneDepth) from my own provided ARSession for RoomCaptureSession.
But both depth maps are empty when handling the delegates. I have not found a solution yet. So is it even possible? Because i have not found any documentation of what RoomCaptureSession overwrites in the ARSession if I provide my own ARSession instance.
Here is a example code snippet of what i'm trying to do:
private let arSession = ARSession()
private lazy var roomPlanCaptureSession = RoomCaptureSession(arSession: arSession)
let arConfig = ARWorldTrackingConfiguration()
//Create semantics for ARconfig which is used for ARSession
var semantics: ARWorldTrackingConfiguration.FrameSemantics = []
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
semantics.insert(.sceneDepth)
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
semantics.insert(.smoothedSceneDepth)
}
arConfig.frameSemantics = semantics
//set delegates
roomPlanCaptureSession.delegate = self
arSession.delegate = self
//Check if device support for depthMap
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth){
arSession.run(arConfig)
}
else{
print(".sceneDepth is unsupported.")
}
//run roomcapture scan config
let captureConfig = RoomCaptureSession.Configuration()
roomPlanCaptureSession.run(configuration: captureConfig)
//trying to get sceneDepth
public func session(_ session: ARSession, didUpdate frame: ARFrame) {
print("session delegate capture: sceneDepth: \(String(describing: frame.sceneDepth))")
//prints: session delegate capture: sceneDepth: nil
also in this video from 2023 it is say that i can pass custom ARSession to my RoomPlan.
Explore enhancements to RoomPlan - Video
Quote 3:00: Here is the init and stop function in previous RoomPlan. And here is how you pass over a custom ARSession to init function. Any custom ARSession with ARWorldTrackingConfiguration will be honored inside RoomCaptureSession.
anyway I welcome any input. maybe im doing something wrong. :)
This is using:
**Unity 6000.2.7f - latest ARKit etc
Xcode 26.1 **
This is going to iPad/iPhone/iOS and I have succesfully built to the iPad before, but updates to the software seem to have added these errors in.
Basically all updated as much as possible.
Originally I got 5 errors, but updates brought it down.
On build I now get 3 undefined symbols:
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibility51
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibility56
Undefined symbol: _swift_FORCE_LOAD$_swiftCompatibilityConcurrency
This appears to be a bug, or issue, that is in some way known, but I'm not sure how best to get past it.
ld: warning: search path '/var/run/com.apple.security.cryptexd/mnt/com.apple.MobileAsset.MetalToolchain-v17.1.324.0.kGuqPt/Metal.xctoolchain/usr/lib/swift/iphoneos' not found
ld: warning: search path '/var/run/com.apple.security.cryptexd/mnt/com.apple.MobileAsset.MetalToolchain-v17.1.324.0.kGuqPt/Metal.xctoolchain/usr/lib/swift-5.0/iphoneos' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibility51': library 'swiftCompatibility51' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibility56': library 'swiftCompatibility56' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibilityConcurrency': library 'swiftCompatibilityConcurrency' not found
ld: warning: Could not find or use auto-linked library 'swiftCompatibilityPacks': library 'swiftCompatibilityPacks' not found
ld: warning: Could not find or use auto-linked framework 'CoreAudioTypes': framework 'CoreAudioTypes' not found
ld: warning: Could not find or use auto-linked framework 'UIUtilities': framework 'UIUtilities' not found
ld: warning: Could not parse or use implicit file '/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/System/Library/Frameworks/SwiftUICore.framework/SwiftUICore.tbd': cannot link directly with 'SwiftUICore' because product being built is not an allowed client of it
Undefined symbols for architecture arm64:
"_swift_FORCE_LOAD$_swiftCompatibility51", referenced from:
_swift_FORCE_LOAD$swiftCompatibility51$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
"_swift_FORCE_LOAD$_swiftCompatibility56", referenced from:
_swift_FORCE_LOAD$swiftCompatibility56$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
"_swift_FORCE_LOAD$_swiftCompatibilityConcurrency", referenced from:
_swift_FORCE_LOAD$swiftCompatibilityConcurrency$_UnityARKit in libUnityARKit.a[49](RoomCaptureSessionWrapper. o)
ld: symbol(s) not found for architecture arm64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
Best approach for high-quality textured room reconstruction using ARKit / RoomPlan / Object Capture?
I am developing an IOS App that allow users to scan rooms, view the scans on device, and add notes. I need to preserve actual geometry (odd angles, chamfers, fixtures), not simplified RoomPlan boxes.
Are there any easy ways to incorporate high quality texture mapping or PBR? Where is the documentation for scene reconstruction?
Hi team,
I believe I’ve found a registration issue between ARFrame.sceneDepth and ARFrame.capturedImage when using high-resolution frame capture on a 2022 iPad Pro (6th gen).
When enabling high-resolution capture:
if let highResFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
config.videoFormat = highResFormat
}
…
arView.session.captureHighResolutionFrame { ... }
the depth map provided by ARFrame.sceneDepth no longer aligns correctly with the corresponding high-resolution capturedImage.
This misalignment results in consistently over-estimated distance measurements in my app (which relies on mapping depth to 2D pixel coordinates).
iPad Pro (6th gen): misalignment occurs only when capturing high-resolution frames.
iPhone 16 Pro: depth is correctly registered for both standard and high-resolution captures.
It appears the camera intrinsics, specifically the FOV, change between the “regular” resolution stream and the high-resolution capture on the iPad. My suspicion is that the depth data continues using the intrinsics of the lower resolution stream, resulting in an unregistered depth-to-RGB mapping.
Once I have the iPad in hand again, I will confirm whether camera.intrinsics or FOV differ between the low-res and high-res frames.
Is this a known issue with high-resolution frame capture on the 2022 iPad Pro? If not, I’m happy to provide some more thorough sample code.
Thanks for your time!
I'm developing a custom gesture-based visionOS project that uses hand tracking with collision detection spheres on fingers to register user interactions through collision components. I'm experiencing a critical occlusion issue where collision detection spheres are intermittently occluded by the background/depth buffer, causing fingers to pass through the 3D model entities without registering interactions.
Detailed Description:
I have added 3D entities in an immersive scene with collision spheres attached to fingers for detecting user interactions.
Each sphere has:
CollisionComponent with sphere shape
Proper collision masks and groups configured
Real-time position updates from hand joint transforms
Each entity has:
InputTarget components to register collisions
The Issue:
When users move their fingers to the entity to interact, some collision spheres (particularly on the pinkie and ring fingers) become occluded and pass directly through the 3D model without triggering collision events.
Meanwhile, other fingers (like the index finger) continue to work correctly.
This appears to be a depth perception/z-buffer issue between the model entity and the hand tracking collision spheres
Questions:
Is there a recommended approach for maintaining consistent depth ordering between hand-tracking entities and 3D models in immersive spaces to prevent occlusion issues?
Should I be using AnchorEntities to anchor the entity to a plane or world position to establish a more stable depth reference?
Are there specific RenderingComponent or material settings that could help ensure collision entities maintain their depth priority and don't get occluded?
Could this be related to z-fighting when collision spheres and entity geometry occupy similar depth ranges? If so, what's the recommended depth bias approach?
Is there a better architectural approach for implementing interactions with custom hand gesture tracking that avoids these depth perception issues?
What Would Help:
Implementation guidance for ensuring reliable collision detection between hand-tracked entities through custom gestures and 3D models.
Best practices for depth management in immersive spaces with custom hand gesture tracking.
Sample code demonstrating stable hand-to-object interaction patterns.
Information about whether this is a known limitation or if there are specific APIs I should be leveraging
This issue is significantly impacting the reliability of our app experience, as users cannot consistently interact with all model components. Any guidance from Apple engineers or developers who have solved similar depth/occlusion challenges would be greatly appreciated.
Additional Context:
This is for a productivity-focused application where accuracy and reliability are critical.
Thank you for any assistance!
Topic:
Spatial Computing
SubTopic:
Reality Composer Pro
Tags:
ARKit
Reality Composer
AR / VR
visionOS
I use ARKit for motion tracking. I get the skeleton joint coordinates and use them for animation. I didn't make any changes to the code, but I updated the iOS version from 18 to 26, and modelTransform now always returns nil.
https://developer.apple.com/documentation/arkit/arskeleton3d/modeltransform(for:)
For example
bodyAnchor.skeleton.modelTransform(for: .init(rawValue: "head_joint"))
bodyAnchor is ARBodyAnchor.
I see the default skeleton on the screen, but now I can't get the coordinates out of it.
I'm using an example from Apple's WWDC presentation.
https://developer.apple.com/documentation/arkit/capturing-body-motion-in-3d
Are there any changes in the API? Or just bug?
On iOS 26.1, this throws on the 2020 iPad Pro (4th gen) but works fine on an M4 iPad Pro or iPhone 15 Pro:
guard let device = AVCaptureDevice.default(.builtInLiDARDepthCamera, for: .video, position: .back) else {
throw ConfigurationError.lidarDeviceUnavailable
}
It's just the standard code from Apple's own sample code so obviously used to work:
https://developer.apple.com/documentation/AVFoundation/capturing-depth-using-the-lidar-camera
Does it fail because Apple have silently dumped support for the older LiDAR sensor used prior to the M4 iPad Pro, or is there another reason? What about the 5th and 6th gen iPad Pro, does it still work on those?
Hi everyone! I am working on AR app and wanted to implement object occlusion because it removes drift pretty much from the object. This working great with RealityKit sample But I am unable to replicate such behaviour it with scenekit. Because scenekit does not offer object occlusion. Can we say scenekit is getting depricated, and we should re-write app in RealityKit (which is obviously a big task)?
I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
While using apple's vision pro, we noticed that we can continue to use the visionOS keyboard when we no longer actually see it in passthrough.
In other words, when we focus on a field to type, visionOS displays the keyboard for us in such a way that we actually see it. Then, we noticed if we look away a little bit, either up, or down, or left, or right, in such a way that the keyboard is no longer visible by us in the passthrough, the keyboard still remains responsive to taps from our fingers at the location where it is. It seems the keyboard remains functional and responsive to taps even though we can no longer observe/see it.
We are trying to figure out how to implement similar functionality in our app whereby the user can continue to manipulate a 3d entity when the user can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow).
I assume the visionOS keyboard has this functionality thanks to the downward facing sensors on the hardware that allow hand tracking even though the hands can no longer be observed by the user. That is likely how we can rest our hands on our lap is still be able to interact with visionOS.
How can we implement a similar functionality for 3D entities?
Is there a way to tap in, or to allow hand tracking, from those toward facing cameras?
Is it possible to manipulate a 3D entity when it is no longer observed by the user for example when they shift their attention somewhere else in the field of vision?
How does the visionOS keyboard achieve this?
Problem Description
(1) I am using ARKit in an iOS app to provide AR capabilities. Specifically, I'm trying to use the ARSession's captureHighResolutionFrame(using:) method to capture a high-resolution frame along with its corresponding depth data:
open func captureHighResolutionFrame(using photoSettings: AVCapturePhotoSettings?) async throws -> ARFrame
(2) However, when I attempt to do so, the call fails at runtime with the following error, which I captured from the Xcode debugger:
[AVCapturePhotoOutput capturePhotoWithSettings:delegate:] settings.depthDataDeliveryEnabled must be NO if self.isDepthDataDeliveryEnabled is NO
Code Snippet Explanation
(1) ARConfig and ARSession Initialization
The following code configures the ARConfiguration and ARSession. A key part of this setup is setting the videoFormat to the one recommended for high-resolution frame capturing, as suggested by the documentation.
func start(imagesDirectory: URL, configuration: Configuration = Configuration()) {
// ... basic setup ...
let arConfig = ARWorldTrackingConfiguration()
arConfig.planeDetection = [.horizontal, .vertical]
// Enable various frame semantics for depth and segmentation
if ARWorldTrackingConfiguration.supportsFrameSemantics(.smoothedSceneDepth) {
arConfig.frameSemantics.insert(.smoothedSceneDepth)
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
arConfig.frameSemantics.insert(.sceneDepth)
}
if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
arConfig.frameSemantics.insert(.personSegmentationWithDepth)
}
// Set the recommended video format for high-resolution captures
if let videoFormat = ARWorldTrackingConfiguration.recommendedVideoFormatForHighResolutionFrameCapturing {
arConfig.videoFormat = videoFormat
print("Enabled: High-Resolution Frame Capturing by selecting recommended video format.")
}
arSession.run(arConfig, options: [.resetTracking, .removeExistingAnchors])
// ...
}
(2) Capturing the High-Resolution Frame
The code below is intended to manually trigger the capture of a high-resolution frame. The goal is to obtain both a high-resolution color image and its associated high-resolution depth data. To achieve this, I explicitly set the isDepthDataDeliveryEnabled property of the AVCapturePhotoSettings object to true.
func requestImageCapture() async {
// ... guard statements ...
print("Manual image capture requested.")
if #available(iOS 16.0, *) { // Assuming 16.0+ for this API
if let defaultSettings = arSession.configuration?.videoFormat.defaultPhotoSettings {
// Create a mutable copy from the default settings, as recommended
let photoSettings = AVCapturePhotoSettings(from: defaultSettings)
// Explicitly enable depth data delivery for this capture request
photoSettings.isDepthDataDeliveryEnabled = true
do {
let highResFrame = try await arSession.captureHighResolutionFrame(using: photoSettings)
print("Successfully captured a high-resolution frame.")
if let initialDepthData = highResFrame.capturedDepthData {
// Process depth data...
} else {
print("High-resolution frame was captured, but it contains no depth data.")
}
} catch {
// The exception is caught here
print("Error capturing high-resolution frame: \(error.localizedDescription)")
}
}
}
// ...
}
Issue Confirmation & Question
(1) Through debugging, I have confirmed the following behavior: If I call captureHighResolutionFrame without providing the photoSettings parameter, or if photoSettings.isDepthDataDeliveryEnabled is set to false, the method successfully returns a high-resolution ARFrame, but its capturedDepthData is nil.
(2) The error message clearly indicates that settings.depthDataDeliveryEnabled can only be true if the underlying AVCapturePhotoOutput instance's own isDepthDataDeliveryEnabled property is also true.
(3) However, within the context of ARKit and ARSession, I cannot find any public API that would allow me to explicitly access and configure the underlying AVCapturePhotoOutput instance that ARSession manages.
(4) My question is:
Is there a way to configure the ARSession's internal AVCapturePhotoOutput to enable its isDepthDataDeliveryEnabled property? Or, is simultaneously capturing a high-resolution frame and its associated depth data simply not a supported use case in the current ARKit framework?
I have a visionOS app where I instantiate ARKitSession and various providers (HandTrackingProvider and WorldTrackingProvider) in my appModel. That way, I can pass these providers to a Task which runs a gRPC server for sending the data from these providers to a client. When the users enters the immersive space of the app, the ARKitSession will run the providers if they are not running already.
I am now trying to implement the AccessoryTrackingProvider with the PSVR sense controllers but it does not fit with my current framework because the controllers may not be connected when the ARKitSession.run function is called. So I need to find a new place to start the session.
My question is, if I already have a session which is running the hand and world tracking providers, can I start another session to run the accessory tracking? Should they all be running on the same session?
Is there a way to stop the session and restart it when the controllers are connected? When I tried this, I get an error that says "It is not possible to re-run a stopped data provider (<ar_hand_tracking_provider_t: " but if I instantiate a new HandTrackingProvider, then the one that got passed to the gRPC task would no longer be the one running in the new session.
Any advice on how best to manage the various providers and ARKit sessions would be greatly appreciated.
I want to let users place 2D/3D “artworks” on detected walls and have them reappear in exactly the same real‑world spot after quitting and relaunching the app (like widgets do, but for my own entities).Environment: Xcode 26, visionOS 2.0, RealityKit + ARKitSession/WorldTrackingProvider Entities are parented to a holder that’s aligned to a wall via plane/mesh raycasts.
What I’ve tried:
Create a WorldAnchor at placement, save UUID + full 4×4 transform On next launch, re-create the WorldAnchor (or set the saved transform) and attach the entity Gate restore on relocalization/mesh updates and disable all raycast/search after restore Issue: After relaunch, placement still resolves relative to current device pose, not the same wall position.
Questions:
Is there a public API in visionOS 2.0 to persist app‑managed world anchors across sessions (room‑fixed), e.g., AnchorStore or equivalent?
If not, what’s the recommended pattern to reliably restore wall‑anchored content?
Are persistence features mentioned for widgets/windows available to third‑party RealityKit entities?
0
I’m using ARKit + SceneKit (Swift) with ARWorldTrackingConfiguration and detectionImages to place a 3D object (USDZ via SCNScene(named:)) when a reference image is detected. While the image is tracked, the object stays correctly aligned.
Goal: When the tracked image is no longer visible, I want the placed node to remain visible and fixed at its last known pose (no drifting) as I move the camera.
What works so far: Detect image → add node → track updates When the image disappears → keep showing the node at its last pose
Problem: After the image is no longer tracked, the node drifts as I move the device/camera. It looks like it’s still influenced by the (now unreliable) image anchor or accumulating small world-tracking errors.
Question: What’s the correct way in ARKit to “freeze” the node at its last known world transform once ARImageAnchor stops tracking, so it doesn’t drift?
We enabled ARKit replay data and chose a .mov file from reality composer. However, the app does not launch after build and run in Xcode. It stays stuck saying "Attaching to app on iPad".
Does anyone have this problem?
System Information:
macOS Version 14.6.1 (Build 23G93)
We are working on a world scale AR app that leverages the device location and heading to place objects in the streets, so that they are correctly and stably anchored to certain locations.
Since the geo-tracking imagery is only available in certain cities and areas, we are trying to figure out how to fallback when geo-tracking is not available as the device move away, to still retain good AR camera accuracy. We might need to come up with some algorithm using the device GPS, to line up the ARCamera with our objects.
Question: Does geo-tracking always provide greater than or equal to the accuracy of world tracking, for a GPS outdoor AR experience?
If so, we can simply use the ARGeoTrackingConfiguration for the entire time, and rely on the ARView keeping itself aligned. Otherwise, we need to switch between it and ARWorldTrackingConfiguration when geo-tracking is not available and/or its accuracy is low, then roll our own algorithm to keep the camera aligned.
Thanks.