Here is the sample project from apple of Object Tracking.
https://developer.apple.com/documentation/visionOS/exploring_object_tracking_with_arkit
can we improve it tracking accuracy and tracking when object is moving little faster, so the 3d object that draw still follow it and make it more accurate.
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I'm developing an AR application for the iPad pro where the primary purpose is to overlay 3D design data on top of production parts. For alignment, we are using Vuforia (model targets) which work really well locally. The further the device is moved from the point of original alignment, we are seeing quite a bit of overlay error (drift?).
My primary questions are:
Are there any best practices to stabilize frame-to-frame tracking when using model targets? We are noticing drift as soon as the device starts moving (the drift appears to occur specifically in the direction the device is moving). After about 15 feet of movement, we are observing about 3-6" of overlay error
These use cases can be over 100 feet long. In order to reset drift, we understand we'll need multiple alignment points (model targets) along the way. Is there a standard/best practice for this? Ex: have a new alignment point every x-feet?
We are using plane anchors to set our alignment. Typically we attach it to the nearest plane; however, the anchor point can be very far away (the origin of the model, which often is not near where the virtual content is). Could this be the issue? The anchor is far from the plane that we attach it too. Would moving the anchor closer to the plane we attach it too improve stability? After a few steps, the plane we originally attach too will be out of FoV anyway.
Thanks in advance!
Topic:
Spatial Computing
SubTopic:
ARKit
Hi Apple Team and Developers,
First of all, I’d like to express my appreciation for the incredible results achieved using PhotogrammetrySession. I’ve been developing a portrait scanning app using Object Capture, and in many tests—especially with human models—I’ve found the reconstructed body surfaces are remarkably smooth and clean, often outperforming tools like Metashape and RealityCapture in terms of aesthetic results.
However, I’ve encountered some challenges when working with complex areas like long hair overlapping the face. For instance, with female models where strands of hair partially occlude the face, the resulting mesh tends to merge the hair and facial geometry. This leads to distorted or “melted” facial features, likely due to ambiguity in the geometry estimation phase.
Feature Suggestion:
Would it be possible to allow developers to supply two versions of the input images:
• One version (original) for texture generation
• A pre-processed version (e.g., contrast-enhanced or CLAHE filtered) to guide mesh reconstruction only
This would give us the flexibility to enhance edge features or shadow detail without affecting the final texture appearance. In other photogrammetry pipelines, applying image enhancement selectively before dense reconstruction improves geometry quality in low-contrast areas.
Question:
Is there any plan to support this kind of two-path workflow in future versions of PhotogrammetrySession? Or perhaps expose more intermediate stages or tunable parameters to developers?
Also, any hints on what we can expect from WWDC 2025 regarding improvements to Object Capture or related vision/3D technologies?
Thanks again for this powerful API. Looking forward to hearing insights from the team and other developers.
Warm regards,
KitCheng
Hi Apple Team,
I’m working on a human portrait scanning application using PhotogrammetrySession, and I’ve been very impressed by the results. Thank you for building such a powerful and accessible photogrammetry solution into macOS!
I do, however, have a question regarding mesh detail limitations on different Mac hardware configurations.
When using PhotogrammetrySession.Request.Detail.custom and trying to set maximumPolygonCount = 1000000, I see the following log message:
Clamped max poly count: 1000000 to device limit. 250000 is used.
This is on an M1 Max with 32 GB RAM.
I’m aware that PhotogrammetrySession.limits can report values like maximumInputImageDimension and maximumNumberOfInputImages, but I haven’t found documentation on how the maximumPolygonCount is determined, and what hardware specs influence it.
Is it tied more to:
• GPU performance (e.g. neural/graphics cores)?
• CPU architecture?
• Memory size or bandwidth?
• Or is it fixed per SoC generation?
I’d love to understand what kind of hardware upgrades (e.g. moving to M4 Pro or increasing RAM) could allow me to increase mesh complexity and generate more detailed models.
Any insights would be greatly appreciated—and if this is covered in upcoming WWDC sessions or documentation, I’d be happy to tune in.
Thanks in advance!
KitCheng
I am allowing users to go through and capture different rooms, and add a custom label to that room. Is there a way to store data about this in the captured room so that it persists into the final merge? As it is now, My users mark all their merges with custom labels, but after merging there is no way to remember which room is which in the merging process so they have to go through and manually add the labels back. For larger floor plans this is not ideal.
I need to loop my videoMaterial and I don't know how to make it happen in my code.
I have included an image of my videoMaterial code.
Any help making this happen with be greatly appreciated.
Thank you,
Christopher
Topic:
Spatial Computing
SubTopic:
ARKit
Using the example code posted here:
https://developer.apple.com/documentation/visionOS/tracking-images-in-3d-space
I can register multiple ReferenceImage s with a ImageTrackingProvider, but only one updates at a time - to have realtime updating, I can only have one ImageAnchor in my field of view at a time.
Is it possible to track multiple imageAnchors at the same time in the same field of view? As in having several ImageAnchor's tracked and entities updated to the transforms of the anchor in the same frame/moment from the Apple Vision Pro?
Topic:
Spatial Computing
SubTopic:
ARKit
A few users have recently reported no longer being able to capture point clouds using our app, specifically on iPhone 15 Pro devices. We recently found an in-house device that exhibits this behavior and found that the confidenceMap contains only low confidence values, regardless of the environment being captured. Our app uses a higher confidence threshold; setting the threshold to a lower value produces noisy results as expected, so that is a non-viable option.
Other LiDAR based apps have been tested with this device and the results are the same. No points, or noisy point clouds in apps that allow a lower confidence threshold setting. On devices that exhibit this behavior the "Displaying a point cloud using scene depth" Apple sample app can be used to visualize the issue.
First reports of this new behavior occurred as early as iOS 18.4.
Looking for recommendations on which team(s) at Apple to reach out to with these findings since the behavior manifests on only a small sample of devices.
I have two RealityView: ParentView and When click the button in ParentView, ChildView will be shown as full screen cover, but the camera feed in ChildView will not be shown, only black screen.
If I show ChildView directly, it works with camera feed.
Please help me on this issue? Thanks.
import RealityKit
import SwiftUI
struct ParentView: View{
@State private var showIt = false
var body: some View{
ZStack{
RealityView{content in
content.camera = .virtual
let box = ModelEntity(mesh: MeshResource.generateSphere(radius: 0.2),materials: [createSimpleMaterial(color: .red)])
content.add(box)
}
Button("Click here"){
showIt = true
}
}
.fullScreenCover(isPresented: $showIt){
ChildView()
.overlay(
Button("Close"){
showIt = false
}.padding(20),
alignment: .bottomLeading
)
}
.ignoresSafeArea(.all)
}
}
import ARKit
import RealityKit
import SwiftUI
struct ChildView: View{
var body: some View{
RealityView{content in
content.camera = .spatialTracking
}
}
}
Hello!
I'm a developer of Ping Pong for Apple Vision Pro (Ping Pong Club). I'm experiencing an issues with collisions between ball and paddle. Sometimes the ball just unexpectedly flying away like it has been hit to hard.
I believe this is happening because hand anchors are jittering (ping pong paddle has hand anchor component) and at the time of collision unnecessary micro motion produced.
I've build a demo app to demonstrate that: https://drive.google.com/file/d/1gBh-DsXuVZdvDrrD7gJcbJ_hg7tvVKUV/view?usp=share_link
Video: https://drive.google.com/file/d/1poPbsJdBz87edqrWIfBt1eSVzfm5CTek/view?usp=sharing
Is there a way to add some threshold to prevent jittering?
Topic:
Spatial Computing
SubTopic:
ARKit
I work on motion capture systems for VTubing. I can't seem to find any information on gaining access to the Face Tracking features on iOS while developing for Vision OS.
I would love to bring VStreamer Live to Vision OS
Topic:
Spatial Computing
SubTopic:
ARKit
Hello everyone, I'm a new developer and I'm still learning the foundations of Swift and SwiftUI while building my first app. Today I wanted to ask you how to implement AR Quixck Views inside my app. I wanna be able to dynamically preview AR objects in a dedicated view, however, I don't seem to have understood where and how to locate AR objects inside my project. I tried including them in the Assets folder of the project, or in the Recources folder, or within the main folder of my project alongside the MyAppApp.swift file. None of the methods I used seemed have worked in that none of the objects was ever located. I made sure to specify the path to the files every time, but somehow the location isn't recognized. I also tried giving no path so that the app would search for the files in their default location (which I apparently haven't grasped yet), but still my attempt failed. I don't have the code sample on me at the moment, but I will write a followup comment on this post to show you what I wrote in case anyone was interested in debugging my code. Meanwhile, if anyone would be so kind to point me at the support article or to comment below the sample code they used in their app, I would very much appreciate it, so that I can start debugging. Thank you for reading this, I appreciate you.
Hi all,
I'm currently developing a real-time object reconstruction app using ARKit. The goal is to scan large objects using ARKit’s depth and transform data, and generate a point cloud.
However, I’m facing a major challenge - Transform Drift / World Alignment Issues
The localToWorld transform provided by ARKit frequently seems to drift or become unstable across frames.
This results in misaligned point clouds even when the device is moved slowly or kept relatively still.
In some cases, a static surface scanned over a few seconds results in clearly misaligned fragments.
This makes it difficult to accurately stitch a multi-frame point cloud. I have experimented with various lighting conditions and object textures, but the issue persists in all cases. At times, the relative error between frames reaches up to 20 cm, while in other instances the error is minimal; however, the drift gradually accumulates over time, leading to an overall enlargement of the reconstructed object. I have attached images of both cases here.
Questions:
Are there specific conditions under which ARKit’s world transform is expected to drift?
Is there a way to detect or recover from this drift during runtime?
Any best practices for maintaining consistent tracking during scanning or measurement sessions?
My ARViewContainer code is not working. I don't know how to debug the issue and I don't know or see where my results is going to. I need help to resolve this issue. please help debug. See code below:
So it seems to be that there is a contradiction between how ARKit defines UIDeviceOrientation.landscapeRight, and the actual definition of UIDeviceOrientation.landscapeRight in the UIKit documentation.
In the ARKit documentation for ARCamera.transform, it says the following:
This transform creates a local coordinate space for the camera that is constant with respect to device orientation. In camera space, the x-axis points to the right when the device is in UIDeviceOrientation.landscapeRight orientation—that is, the x-axis always points along the long axis of the device, from the front-facing camera toward the Home button. The y-axis points upward (with respect to UIDeviceOrientation.landscapeRight orientation), and the z-axis points away from the device on the screen side.
Going through the same link, we see the definition of UIDeviceOrientation.landscapeRight given as:
The device is in landscape mode, with the device held upright and the front-facing camera on the right side.
There seems to be a conflict in the two definitions, that has already been asked and visualized in this StackOverflow thread
The resolution of that answer says that ARKit landscapeRight, unlike what is given in UIDeviceOrientation.landscapeRight, has home button on the right, as stated in the ARCamera.transform documentation.
It says that more details are given in this StackOverflow thread, but this thread talks about the discrepancy between the definitions of landscapeRight in UIDeviceOrientation and UIInterfaceOrientation, and not anything related to ARKit.
So I am wondering, why does ARKit definition of landscapeRight contradict with that of UIDeviceOrientation despite explicitly mentioning it? Is it just a mistake by Apple developers that hasn't been resolved even after so long?
With iOS26 unveiled, has anyone noticed or found any changes related to RoomPlan?
I can't find anything myself, which is disappointing.
Has anyone found any improvements or changes?
Hi all,
I'm running into an issue with an app that previously worked fine on device using visionOS 2.0. After updating to visionOS 26, the same code runs fine in the simulator but crashes on the device with the following error:
-[MTLDebugComputeCommandEncoder _validateThreadsPerThreadgroup:]:1330:
failed assertion `(threadsPerThreadgroup.width(32) * threadsPerThreadgroup.height(32) * threadsPerThreadgroup.depth(1))(1024) must be <= 832. (kernel threadgroup size limit)`
Is there any documented way to check or increase the allowed threadsPerThreadgroup size on Apple Vision Pro? Or any recommended workaround for this regression?
Thanks in advance!
Hi all,
I'm working on an ARKit-based iOS app where I need to accurately determine the direction the device is facing to localize objects in the real world. I'm using:
let config = ARWorldTrackingConfiguration()
config.worldAlignment = .gravityAndHeading
Thus, I would expect the world alignment to behave as given in the gravityAndHeading page.
The AR session is started after verifying that CLLocationManager.headingAccuracy <= 20, and the compass appears to be calibrated.
However, I'm seeing a major inconsistency:
When the rear camera is physically pointed toward true North, I would expect:
cameraTransform.columns.2.z ≈ -1 // (i.e. ARKit's -Z pointing North)
But instead, I'm consistently seeing:
cameraTransform.columns.2.z ≈ +0.97 // Implies camera is facing South
Meanwhile, the translation vector behaves as expected:
As I physically move North, cameraTransform.columns.3.z becomes more negative, matching the world’s +Z = South assumption.
For example, let's say I have the device in landscapeRight (or landscapeLeft for UIDeviceOrientation). Let's say the device rear camera is pointing towards True North, and I start moving towards True North. I get something like this:
Camera Transform = simd_float4x4(
[
[0.98446155, -0.030119859, 0.172998, 0.0],
[0.023979114, 0.9990097, 0.037477385, 0.0],
[-0.17395553, -0.032746706, 0.98420894, 0.0],
[0.024039675, -0.037087332, -0.22780673, 0.99999994]
])
As you can see, the cameraTransform.columns.2.z is positive despite the rear camera pointing towards True North, while cameraTransform.columns.3.z is correctly positive as the device is moving towards True North.
So here is my question:
Why is cameraTransform.columns.2.z positive when the rear camera is physically facing North?
Any clarity would be deeply appreciated. I've read the documentation and tested with different heading accuracies and AR session resets, but I keep running into this orientation mismatch.
Thanks in advance!
Hello
RemoteDeviceIdentifier returns nil and therefore it crashes the HoverEffect sample project.
I have vision26 beta 2 on both devices
what the correct method of running this code sample ?
Hi 26 beta guys,
I have apps using ARKit.
In iPadOS 26 beta, ARKit stops working after switching to other apps.
how to:
Enable WindowMode in iPadOS 26
Launch my app and start ARSession
Switch to another app (preference app, etc.)
Switch back to my app
AR stops updating camerafeed.
I debug printed ARSessionDelegate, and found that
after sessionWasInterrupted was called, sessionInterruptionEnded was never called.
sessionInterruptionEnded is called if WindowMode disabled.
Is this just a bug for 26 beta?
I suspect there is similar problem with non-AR camera.
Any idea?
Topic:
Spatial Computing
SubTopic:
ARKit