I use ARKit's hand tracking to attach a 3D model of a remote control to the left hand. The user is supposed to press buttons on the remote control. In the Vision Pro settings, I have removed the left hand from Hands & Eye Tracking. Only the right hand is used. The problem now is that the left hand appears and the 3D model of the remote control fades out. I want the remote control to be completely visible. The user should feel like they really have the remote control in their hand. Can I prevent the fading out?
ARKit
Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
While using Apple Vision Pro, we noticed that we can keep using the visionOS keyboard even when we can no longer see it in passthrough.
In other words, when we focus on a field to type, visionOS displays the keyboard where we can see it. If we then look away a little (up, down, left, or right) so that the keyboard is no longer visible to us in passthrough, it still responds to taps from our fingers at its location. The keyboard seems to remain functional and responsive to taps even though we can no longer see it.
We are trying to figure out how to implement similar functionality in our app whereby the user can continue to manipulate a 3d entity when the user can no longer actually observe it in passthrough (like the visionOS keyboard appears to allow).
I assume the visionOS keyboard has this functionality thanks to the downward-facing sensors on the hardware, which allow hand tracking even when the user can no longer see their hands. That is likely how we can rest our hands on our lap and still be able to interact with visionOS.
How can we implement similar functionality for 3D entities?
Is there a way to tap into, or to allow hand tracking from, those downward-facing cameras?
Is it possible to manipulate a 3D entity when it is no longer observed by the user for example when they shift their attention somewhere else in the field of vision?
How does the visionOS keyboard achieve this?
I have a visionOS app where I instantiate an ARKitSession and several providers (HandTrackingProvider and WorldTrackingProvider) in my appModel. That way, I can pass these providers to a Task that runs a gRPC server, which sends the data from these providers to a client. When the user enters the app's immersive space, the ARKitSession runs the providers if they are not already running.
I am now trying to add the AccessoryTrackingProvider with the PSVR2 Sense controllers, but it does not fit my current framework because the controllers may not be connected when ARKitSession.run is called. So I need to find a new place to start the session.
My question is, if I already have a session which is running the hand and world tracking providers, can I start another session to run the accessory tracking? Should they all be running on the same session?
Is there a way to stop the session and restart it when the controllers are connected? When I tried this, I got an error that says "It is not possible to re-run a stopped data provider (<ar_hand_tracking_provider_t: ". But if I instantiate a new HandTrackingProvider, the one that was passed to the gRPC task is no longer the one running in the new session.
Any advice on how best to manage the various providers and ARKit sessions would be greatly appreciated.
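One possible way around the last point (the gRPC task holding a provider instance that is no longer the running one) is to give the task a small holder object instead of the provider itself, and swap the provider inside the holder whenever a new one is created. The following is only a minimal sketch in plain Swift with no ARKit calls. `ProviderBox` and `FakeProvider` are hypothetical names invented for illustration, and the sketch does not answer whether a second ARKitSession can run alongside the first.

```swift
import Foundation

/// Hypothetical holder: consumers keep a reference to the box, not to a provider.
final class ProviderBox<Provider> {
    private let lock = NSLock()
    private var provider: Provider?

    /// Swap in a freshly created provider (for example after a session restart).
    func replace(with newProvider: Provider) {
        lock.lock()
        defer { lock.unlock() }
        provider = newProvider
    }

    /// The provider that is current at the time of the call, if any.
    var current: Provider? {
        lock.lock()
        defer { lock.unlock() }
        return provider
    }
}

/// Stand-in for a real data provider; it exists only to demonstrate the box.
struct FakeProvider: Equatable {
    let generation: Int
}

// The long-lived consumer would capture `handBox` once and read `current` on each use.
let handBox = ProviderBox<FakeProvider>()
handBox.replace(with: FakeProvider(generation: 1))
let beforeRestart = handBox.current?.generation
handBox.replace(with: FakeProvider(generation: 2))
let afterRestart = handBox.current?.generation
print(beforeRestart ?? -1, afterRestart ?? -1)
```

With this arrangement the consumer never needs to be restarted when a provider is replaced; it simply sees the new instance the next time it reads from the box.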
Using the example code posted here:
https://developer.apple.com/documentation/visionOS/tracking-images-in-3d-space
I can register multiple ReferenceImages with an ImageTrackingProvider, but only one updates at a time. To get real-time updates, I can only have one ImageAnchor in my field of view at a time.
Is it possible to track multiple ImageAnchors at the same time in the same field of view, that is, to have several ImageAnchors tracked and their entities updated to the anchors' transforms in the same frame on Apple Vision Pro?
Topic:
Spatial Computing
SubTopic:
ARKit
I’m using ARKit + SceneKit (Swift) with ARWorldTrackingConfiguration and detectionImages to place a 3D object (USDZ via SCNScene(named:)) when a reference image is detected. While the image is tracked, the object stays correctly aligned.
Goal: When the tracked image is no longer visible, I want the placed node to remain visible and fixed at its last known pose (no drifting) as I move the camera.
What works so far:
Detect image → add node → track updates
When the image disappears → keep showing the node at its last pose
Problem: After the image is no longer tracked, the node drifts as I move the device/camera. It looks like it’s still influenced by the (now unreliable) image anchor or accumulating small world-tracking errors.
Question: What’s the correct way in ARKit to “freeze” the node at its last known world transform once ARImageAnchor stops tracking, so it doesn’t drift?
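The "freeze" behaviour asked about here can be thought of as a small latch: follow the anchor while it reports as tracked, and stop reading its pose once it does not. The sketch below shows only that logic in plain Swift. `Pose` and `PoseLatch` are hypothetical types made up for this illustration; in an ARKit app the `isTracked` input would presumably come from `ARImageAnchor.isTracked`, and the pose would be the node's world transform. It is an assumption that detaching the content from the image anchor in this way removes the anchor's influence, and it cannot remove drift that comes from world tracking itself.

```swift
/// Hypothetical pose type; in an ARKit app this would be a world transform matrix.
struct Pose: Equatable {
    var x: Double
    var y: Double
    var z: Double
}

/// Follows the anchor while it is tracked and keeps the last tracked pose afterwards.
struct PoseLatch {
    private(set) var lastGoodPose: Pose?
    private(set) var isFrozen = false

    /// Feed every anchor update; returns the pose the content should use.
    mutating func update(anchorPose: Pose, isTracked: Bool) -> Pose? {
        if isTracked {
            // The anchor is reliable: follow it and remember the pose.
            lastGoodPose = anchorPose
            isFrozen = false
        } else {
            // The anchor is unreliable: ignore its pose and keep the last good one.
            isFrozen = true
        }
        return lastGoodPose
    }
}

var latch = PoseLatch()
let whileTracked = latch.update(anchorPose: Pose(x: 1, y: 0, z: -2), isTracked: true)
let afterLoss = latch.update(anchorPose: Pose(x: 9, y: 9, z: 9), isTracked: false)
```

The second update carries a very different pose, but because the anchor is reported as not tracked, the latch keeps returning the pose from the last tracked update.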
Hi everyone,
We’re developing a Unity project for Apple Vision Pro that connects PSVR2 Sense controllers for advanced interaction and input.
We’ve encountered a major limitation: when the controller is not held close to the designated hand (e.g., resting on a table or held by the non-designated hand), the Sense controller enters a low-power or reduced-update mode. This results in noticeably reduced tracking update frequency and responsiveness until the controller is held again.
For certain use cases, this behavior is undesirable. In our case, it prevents continuous real-time tracking of the controller even when it’s stationary or being tracked externally.
Request:
Please consider exposing an API flag or developer option in ARKit to disable, or optionally delay, the low-power mode when the app requires full-rate updates regardless of proximity or hand pose detection.
Hi, I have a hand model in FBX that I'm exporting to USD in Blender. I get a skinned mesh, and while I can track the whole hand, how do I track each joint, assign it, and animate the skinned mesh itself? All my attempts suggest this is not possible in RealityKit as of now. Is that true?
We are working on a world scale AR app that leverages the device location and heading to place objects in the streets, so that they are correctly and stably anchored to certain locations.
Since geo-tracking imagery is only available in certain cities and areas, we are trying to figure out how to fall back when geo-tracking is not available as the device moves away, while still retaining good AR camera accuracy. We might need to come up with an algorithm that uses the device GPS to line up the ARCamera with our objects.
Question: Does geo-tracking always provide accuracy greater than or equal to that of world tracking for a GPS-based outdoor AR experience?
If so, we can simply use the ARGeoTrackingConfiguration for the entire time, and rely on the ARView keeping itself aligned. Otherwise, we need to switch between it and ARWorldTrackingConfiguration when geo-tracking is not available and/or its accuracy is low, then roll our own algorithm to keep the camera aligned.
Thanks.
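The switching logic described in this post can be written down as a small decision function. This is a minimal sketch under stated assumptions, not a statement about how accurate either configuration is: `GeoState` and `GeoAccuracy` are hypothetical enums that mirror the kind of state and accuracy values an app would observe, and `TrackingMode`, `chooseTrackingMode`, and the `minimumAccuracy` threshold are names invented for illustration.

```swift
/// Hypothetical mirrors of the geo-tracking state and accuracy an app would observe.
enum GeoState { case notAvailable, initializing, localizing, localized }

enum GeoAccuracy: Int, Comparable {
    case undetermined = 0, low, medium, high

    static func < (lhs: GeoAccuracy, rhs: GeoAccuracy) -> Bool {
        lhs.rawValue < rhs.rawValue
    }
}

/// The two strategies discussed in the post.
enum TrackingMode: Equatable {
    case geoTracking
    case worldTrackingWithGPSAlignment
}

/// Use geo-tracking only when it is available, localized, and accurate enough;
/// otherwise fall back to world tracking plus a custom GPS alignment step.
func chooseTrackingMode(isAvailableHere: Bool,
                        state: GeoState,
                        accuracy: GeoAccuracy,
                        minimumAccuracy: GeoAccuracy = .medium) -> TrackingMode {
    guard isAvailableHere, state == .localized, accuracy >= minimumAccuracy else {
        return .worldTrackingWithGPSAlignment
    }
    return .geoTracking
}

let inCoveredCity = chooseTrackingMode(isAvailableHere: true, state: .localized, accuracy: .high)
let outsideCoverage = chooseTrackingMode(isAvailableHere: false, state: .notAvailable, accuracy: .undetermined)
let poorAccuracy = chooseTrackingMode(isAvailableHere: true, state: .localized, accuracy: .low)
```

In practice an app would probably add some hysteresis so that a brief drop in accuracy, or a short localizing phase, does not cause it to flip between configurations repeatedly.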
I have read in the Apple documentation and on forums that accessing the camera and capturing images on Vision Pro requires both an Entitlement and an Enterprise.license file. I already have the Entitlement, but I don’t yet have the Enterprise.license. Is the Enterprise.license strictly required to gain camera access for capturing images? How can I obtain this file, and does it require an Enterprise account? Currently, my developer account is a regular $99 Developer account, not an Enterprise account.
When scanning multiple rooms (10+) in a single structure using ARWorldMap for coordinate space consistency, RoomCaptureSession throws CaptureError.exceedSceneSizeLimit. The instructions here (https://developer.apple.com/documentation/roomplan/scanning-the-rooms-of-a-single-structure) provide exactly what I am doing to keep the underlying ARSession alive (by calling captureSession.stop(pause: false)) and save the results before a user moves to the next room. Scanning 11 or so rooms will cause the user to hit the exceedSceneSizeLimit error. The ARWorldMap is about 58 MB and always is around this size when hitting this issue. No anchors are present and all the data seems to be from tracking data.
On iPad devices (where I do not see this issue) the ARWorldMap grows in size at a significantly slower rate.
I save the ARWorldMap after each room is scanned and confirmed by the user. If I use the ARWorldMap to initialize the ARSession (as described in the docs), the session immediately errors with "exceedSceneSizeLimit" once captureSession.run() is executed. Occasionally it allows me or the user to scan again, but it then breaks either mid-scan or on the following scan.
This has been working fine for the past 2 years and users have been able to scan dozens of rooms without issue. It seems only lately that it has been a problem.
I would expect the ARWorldMap to be allowed for much bigger sizes. At this point I can just about scan more area of my house with a single scan than I can when I use different captureSessions.
A few observations:
This happens on my iPhone 15 Pro Max, my iPhone 17 Pro, but not my iPad M4 (maybe memory related?). It is possible if scanning many more rooms it would happen on the iPad too.
I have tried things such as resetting the ARConfig on the underlying ARSession to reset some, but this doesn't work.
I have tried to create a new ARWorldMap and move the origin to the older map to clear out tracking data. This almost works but causes a mess of issues when a user moves at all due to the unshared coordinate space.
I believe there are three active issues regarding this: FB14454922, FB15035788, FB20642944
Could we get an update for this issue? It is a production issue and severely limits my user experience in my production application.
I'm working on a project that uses ImageTrackingProvider through ARKit on Vision Pro, and I want to detect multiple images (about 5) and show info for them at the same time.
However, I found that it seems only one image can be detected by the device at a time.
The maximumNumberOfTrackedImages API that does this seems to be available only on iOS, not on visionOS.
Does anyone know a possible way to detect multiple images at the same time on Vision Pro?
Hi,
I’m trying to configure camera feed in ARKit to be in Apple Log color space.
I can change the capture device’s format to one that has Apple Log, and I see one frame in proper log-gray colors, but then all AR tracking stops and the tracking state hangs at “initializing”. In other combinations I see the error “sensor failed to initialize” and the session restarts with the default format.
I suspect that this is because normal AR capture formats are 420f, whereas the ones that have Apple Log are 422.
Could someone confirm whether it’s even possible to run an ARKit session with the camera feed in a different pixel format?
I’m trying this on an iPhone 15 Pro.
I am working on a project that requires access to the main camera on the Vision Pro. My main account holder applied for the necessary enterprise entitlement, and we were approved and received the Enterprise.license file by email. I have added the Enterprise.license file to my project and manually added the com.apple.developer.arkit.main-camera-access.allow entitlement to the entitlement file, setting it to true, since it was not available in the list when I tried to use the + Capability button in the Signing & Capabilities tab.
I am getting an error: Provisioning profile "iOS Team Provisioning Profile: " doesn't include the com.apple.developer.arkit.main-camera-access.allow entitlement. I have checked the provisioning profile settings online, and there is no manual option for adding the main camera access entitlement, and it does not seem to be getting the approval from the license.
Basically, just take the Xcode 26 AR App template and put the ContentView as the detail end of a NavigationStack.
On opening the app, it uses < 20MB of memory. Tapping on Open AR takes the memory usage up to ~700MB for the AR scene. After tapping back, the memory stays at ~700MB.
Checking with Debug memory graph I can still see all the RealityKit classes in the memory, like ARView, ARRenderView, ARSessionManager.
Here's the sample app to illustrate the issue.
PS: To keep memory pressure on the system low, there should be a way of freeing all the memory the AR uses for apps that only occasionally show AR scenes.
When I run my app from Xcode on a device running iOS 26, the RoomPlan capture is corrupted and the recording is green and purple. This issue does not occur when I use an older version of iOS or when I run the app via TestFlight or the App Store.
Hello Community,
I'm encountering an issue with the latest iOS 17 update, specifically related to RoomPlan version-2. In iOS 16, when using RoomPlan version-1, we were able to display stairs in our app. However, after upgrading to iOS 17 and implementing RoomPlan version-2, the stairs are no longer visible.
Despite thorough investigation, I couldn't find any option within the code to show or hide stairs, or any other objects for that matter. It seems like a specific issue with the update rather than a coding error on our part.
Has anyone else encountered a similar problem? If so, I would greatly appreciate any insights or solutions you might have. It's crucial for our app functionality to have stairs displayed accurately, and we're currently at a loss on how to address this issue.
Thank you in advance for any assistance you can provide.
Best regards
Recently, questions about ARKit/visionOS seem to be being asked in the Apple forum by internal Apple engineers. Inexperienced and untested makeshift features are being offered, putting average but experienced developers in a difficult position. They are unable to react and get something useful from the posts. Apple needs to review the situation.
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView with camera background and 3D scene content the way the user sees it
I tried UIHostingController and UIGraphicsImageRenderer like
extension View {
    func snapshot() -> UIImage {
        let controller = UIHostingController(rootView: self)
        let view = controller.view
        let targetSize = controller.view.intrinsicContentSize
        view?.bounds = CGRect(origin: .zero, size: targetSize)
        view?.backgroundColor = .clear
        let renderer = UIGraphicsImageRenderer(size: targetSize)
        return renderer.image { _ in
            view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
        }
    }
}
but that leads to the app freezing and sending an infinite loop of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
        return renderer.image { ctx in
            view.layer.render(in: ctx.cgContext)
        }
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
I thought the ARCoachingOverlayView was a nice touch that made each app's ARKit coaching recognizable, and I used it in my ARView/ARSCNView-based apps.
Now with RealityView, is there any replacement planned?
Or should we just use UIViewRepresentable and wrap ARCoachingOverlayView?
I have a problem with the wall plane detection using visionOS/ARKit:
I am using ARKitSession's PlaneDetectionProvider to detect wall planes in a visionOS immersive space. I recorded the position and rotation of the first detected plane, but found that the rotation value depends on the direction the user is facing when the space starts, so it deviates from one session to the next. That is to say, even if the plane is located on the same wall, the rotation quaternion will be different.
I hope that no matter from which direction the user enters the scan, the real direction of the wall can be correctly obtained so that the virtual content can be accurately aligned with the wall.
I have tried to use anchor.originFromAnchorTransform or Transform.rotation directly, but the rotation value is still affected by the user's initial orientation.
In addition, I would like to know whether the user's initial orientation will affect the location information. If so, please provide a solution.
Thank you!