0
I’m using ARKit + SceneKit (Swift) with ARWorldTrackingConfiguration and detectionImages to place a 3D object (USDZ via SCNScene(named:)) when a reference image is detected. While the image is tracked, the object stays correctly aligned.
Goal: When the tracked image is no longer visible, I want the placed node to remain visible and fixed at its last known pose (no drifting) as I move the camera.
What works so far: Detect image → add node → track updates When the image disappears → keep showing the node at its last pose
Problem: After the image is no longer tracked, the node drifts as I move the device/camera. It looks like it’s still influenced by the (now unreliable) image anchor or accumulating small world-tracking errors.
Question: What’s the correct way in ARKit to “freeze” the node at its last known world transform once ARImageAnchor stops tracking, so it doesn’t drift?
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Created
Hi everyone,
We’re developing a Unity project for Apple Vision Pro that connects PSVR2 Sense controllers for advanced interaction and input.
We’ve encountered a major limitation:
when the controller is not held close to the designated hand (e.g., resting on a table or held by the non designated hand), the Sense controller enters a low-power or reduced-update mode. This results in noticeably reduced tracking update frequency and responsiveness until the controller is held again.
For certain use cases, this behavior is undesirable. In our case, it prevents continuous real-time tracking of the controller even when it’s stationary or being tracked externally.
Request:
Please consider exposing an API flag or developer option in ARKit to disable and optionally delay the low-power mode when the app requires full-rate updates regardless of proximity or hand pose detection.
I use ARKit for motion tracking. I get the skeleton joint coordinates and use them for animation. I didn't make any changes to the code, but I updated the iOS version from 18 to 26, and modelTransform now always returns nil.
https://developer.apple.com/documentation/arkit/arskeleton3d/modeltransform(for:)
For example
bodyAnchor.skeleton.modelTransform(for: .init(rawValue: "head_joint"))
bodyAnchor is ARBodyAnchor.
I see the default skeleton on the screen, but now I can't get the coordinates out of it.
I'm using an example from Apple's WWDC presentation.
https://developer.apple.com/documentation/arkit/capturing-body-motion-in-3d
Are there any changes in the API? Or just bug?
I'm working on a project that uses imageTrackingProvider through ARKit on VisionPro, and I want to detect multiple images(about 5) and show info at the same time.
However, I found that it seems only 1 image could be detected by device at one time.
And the api of maximumNumberOfTrackedImages doing this seems not available for visionOS but only iOS.
Anyone knows possible ways to detect multiple images at the same time on VisionPro?
Topic:
Spatial Computing
SubTopic:
ARKit
I have read in the apple documentation and on forums that in order to access the camera and capture images on VisionPro, both an Entitlement and an Enterprise.license are required. I already have the Entitlement, but I don’t yet have the Enterprise.license. I would like to ask: is the Enterprise.license strictly required to gain camera access for capturing images? How can I obtain this file, and does it require an Enterprise account? Currently, my developer account is a regular Developer 99$, not an Enterprise account.
Topic:
Spatial Computing
SubTopic:
ARKit
https://developer.apple.com/documentation/arkit/arpointcloud
https://developer.apple.com/documentation/arkit/arframe/rawfeaturepoints
The point cloud (collection of points/features) main intention is a debug visualization to what the underlying tracking algorithm processes and is not designed for additional algorithms on top of that. But, we are utilizing information contained in the points/features collected by ARKit.
Currently, the range of rawfeaturepoints is limited to about 10 meter from the device.
We see a great chance if the range is unlock. The global localization will be more robust and accurate.
ARPointCloud - Apple ARKit - FindSurface
YouTube SIdQRiLj2jY
Hi,
I’m trying to configure camera feed in ARKit to be in Apple Log color space.
I can change Capture Device’s format to one that has Apple Log and I see one frame being in proper log-gray colors but then all AR tracking stops and tracking state hangs at “initializing”. In other combinations I see error “sensor failed to initialize” and session restarts with default format.
I suspect that this is because normal AR capture formats are 420f, whereas ones that have Apple Log are 422.
Could someone confirm if it’s even possible to run ARKit session with camera feed in a different pixel format?
I’m trying it on iphone 15 pro
Basically, take just the Xcode 26 AR App template, where we put the ContentView as the detail end of a NavigationStack.
Opening app, the app uses < 20MB of memory. Tapping on Open AR the memory usage goes up to ~700MB for the AR Scene. Tapping back, the memory stays up at ~700MB.
Checking with Debug memory graph I can still see all the RealityKit classes in the memory, like ARView, ARRenderView, ARSessionManager.
Here's the sample app to illustrate the issue.
PS: To keep memory pressure on the system low, there should be a way of freeing all the memory the AR uses for apps that only occasionally show AR scenes.
When I run my app from Xcode on a device running iOS 26, the roomplan capture is corrupted and the recording is green and purple. This issue does not occur when I use an older version of iOS or when I run the app via testFlight or the App Store.
Recently, questions about ARKit/visionOS seem to be being asked in the Apple forum by internal Apple engineers. Inexperienced and untested makeshift features are being offered, putting average but experienced developers in a difficult position. They are unable to react and get something useful from the posts. Apple needs to review the situation.
Topic:
Spatial Computing
SubTopic:
ARKit
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView with camera background and 3D scene content the way the user sees it
I tried UIHostingController and UIGraphicsImageRenderer like
extension View {
func snapshot() -> UIImage {
let controller = UIHostingController(rootView: self)
let view = controller.view
let targetSize = controller.view.intrinsicContentSize
view?.bounds = CGRect(origin: .zero, size: targetSize)
view?.backgroundColor = .clear
let renderer = UIGraphicsImageRenderer(size: targetSize)
return renderer.image { _ in
view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
}
}
}
but that leads to the app freezing and sending an infinite loop of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
return renderer.image { ctx in
view.layer.render(in: ctx.cgContext)
}
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
I thought the ARCoachingOverlayView was a nice touch, so each apps ARKit coaching was recognizable and I used it in my ARView/ARSCNView based apps.
Now with RealityView, is there any replacement planned?
Or should we just use UIViewRepresentable and wrap ARCoachingOverlayView?
I have a problem with the wall plane detection using visionOS/ARKit:
I am using ARKitSession's PlaneDetectionProvider detection.wall in the space of visionOS. I recorded the position and rotation information of the first detected plane, but found that the rotation value will be facing when the user starts the space. There is a deviation in different directions. That is to say, even if the plane is located on the same wall, the rotation quaternion will be different.
I hope that no matter from which direction the user enters the scan, the real direction of the wall can be correctly obtained so that the virtual content can be accurately aligned with the wall.
I have tried to use anchor.originFromAnchorTransform or Transform.rotation directly, but the rotation value is still affected by the user's initial orientation.
In addition, I would like to know whether the user's initial orientation will affect the location information. If so, please provide a solution.
Thank you!
Hi, I called it "perspective problem", but I'm not quite sure what it is. I have a tag that I track with builtin camera. I calculate its pose, then use extrinsics and device anchor to calculate where to place entity with model.
When I place an entity that overlaps with physical object and start to look at it from different angles, the virtual object begins to move. Initially I thought that it's something wrong with calculations, or some image distortion closer to camera edges is affecting tag detection. To check, I calculated the position only once and displayed entity there, the physical tracked object is not moving. Now, when I move my head, so the object is more to the left, or right in my field of view, the virtual object becomes misaligned to the left, or right. It feels like a parallax effect, but distance from me to entity and to physical object are exactly the same.
Is that expected, because of some passthrough correction magic? And if so, can I somehow correct it back, so the entity always overlaps with object? I'm currently on v26 beta 5.
I also don't quite understand the camera extrinsics, because it seems that I need to flip it around X by 180 degrees to make it work in deviceAnchor * extrinsics.inverse * tag (shouldn't it be in same coordinates as all other RealityKit things?).
The AR based app I am working on right now is experiencing an issue. Sometimes, the AR session fails with a call to my ARSessionObserver's session(_ session: ARSession, didFailWithError error: Error)
with the following error:
Error Domain=com.apple.arkit.error
Code=102 "Required sensor failed."
NSLocalizedFailureReason="A sensor failed to deliver the required input.,"
NSLocalizedRecoverySuggestion="Make sure that the application has the required privacy settings."
The underlying error seems to point to the CoreMotion framework:
Domain=CMErrorDomain
Code=102 "(null)
Some people seem to have experienced this issue and solved it by making sure that the Compass Calibration switch is ON in Settings > Privacy > Location Services > System Services.
For context, the ARWorldTrackingConfiguration.worldAlignment is set to .gravity
The thing is it is already ON when I experience this issue.
I also noticed that this issue happens way more often on the iPhone 16e than in any other device.
Has anyone had similar experiences? I am looking for a way to prevent this error from happening (ideally) or handling in a way that does not affect the user. Any help is appreciated
In a simple test, I'm observing ~30% higher CPU usage with the ARWorldTrackingConfiguration compared to the ARBodyTrackingConfiguration when both configurations have AREnvironmentTexturing enabled.
In Instruments, I observe Recon3D consuming ~5.5 seconds of CPU time with the ARWorldTrackingConfiguration vs <0.3 second with the ARBodyTrackingConfiguration in two separate 30 seconds samples.
This is on an iPhone 12 Pro equipped with lidar.
Is there a reason why two separate configurations, both having the same features enabled would have a different CPU overhead?
I like to compose an APN message. (using FCM)
what shall I do for it?
Topic:
Spatial Computing
SubTopic:
ARKit
Visionos26 Enterprise api has the new feature: Shared Coordinate Space, participants exchange their coordinate data by SharedCoordinateSpaceProvider through their own network, when shared coordinate space established with nearby participants, the event: connectedParticipantIdentifiers(participants: [UUID]) will be received.
But the Event.participantIdentifier still be an invalid default value(00000000-0000-0000-FFFF-FFFFFFFF) in this time, I wonder when or how I can get a valid event.participantIdentifier, or is there some other way to get the local participantIdentifier?
Or If it's a bug, please fix it in later beta release version, thank you.
Hi!
I'm currently experimenting on Apple Vision Pro with hand and head anchors. Is there a way to get an anchor linked to the apple magic keyboard (as the detection is already done to display inputs at the top)?
Thanks in advance,
Have a good day!
Hello, I am trying to develop an app that broadcasts what the user sees via Apple Vision Pro. I am a graduate student studying at the university.
And I have two problems,
If I want to use passthrough in screen capture (in VisionOS), do I have to join Apple Developer Enterprise Program to get Enterprise API?
and Can I buy Apple Developer Enterprise Program (Enterprise API) with my university account?
Have any of you been able to do this?
Thank you