Hi, I called it "perspective problem", but I'm not quite sure what it is. I have a tag that I track with builtin camera. I calculate its pose, then use extrinsics and device anchor to calculate where to place entity with model.
When I place an entity that overlaps with physical object and start to look at it from different angles, the virtual object begins to move. Initially I thought that it's something wrong with calculations, or some image distortion closer to camera edges is affecting tag detection. To check, I calculated the position only once and displayed entity there, the physical tracked object is not moving. Now, when I move my head, so the object is more to the left, or right in my field of view, the virtual object becomes misaligned to the left, or right. It feels like a parallax effect, but distance from me to entity and to physical object are exactly the same.
Is that expected, because of some passthrough correction magic? And if so, can I somehow correct it back, so the entity always overlaps with object? I'm currently on v26 beta 5.
I also don't quite understand the camera extrinsics, because it seems that I need to flip it around X by 180 degrees to make it work in deviceAnchor * extrinsics.inverse * tag (shouldn't it be in same coordinates as all other RealityKit things?).
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Is there any way to render a RealityView to an Image/UIImage like we used to be able to do using SCNView.snapshot() ?
ImageRenderer doesn't work because it renders a SwiftUI view hierarchy, and I need the currently presented RealityView with camera background and 3D scene content the way the user sees it
I tried UIHostingController and UIGraphicsImageRenderer like
extension View {
func snapshot() -> UIImage {
let controller = UIHostingController(rootView: self)
let view = controller.view
let targetSize = controller.view.intrinsicContentSize
view?.bounds = CGRect(origin: .zero, size: targetSize)
view?.backgroundColor = .clear
let renderer = UIGraphicsImageRenderer(size: targetSize)
return renderer.image { _ in
view?.drawHierarchy(in: view!.bounds, afterScreenUpdates: true)
}
}
}
but that leads to the app freezing and sending an infinite loop of
[CAMetalLayer nextDrawable] returning nil because allocation failed.
Same thing happens when I try
return renderer.image { ctx in
view.layer.render(in: ctx.cgContext)
}
Now that SceneKit is deprecated, I didn't want to start a new app using deprecated APIs.
My development team admin requested the Enterprise API for camera access on the vision pro. We got that granted, got a license for usage, and got instructions for integrating it with next steps.
We did the following:
Even when I try to download and run the sample project for "Accessing the Main Camera", and follow all the exact instructions mentioned here: https://developer.apple.com/documentation/visionos/accessing-the-main-camera
I am just unable to receive camera frames.
I added the capabilities, created a new provisioning profile with this access, added the entitlements to info.plist and entitlements, replaced the dummy license file with the one we were sent, and also have a matching bundle identifier and development certificate, but it is still not showing camera access for some reason.
"Main Camera Access" shows up in our Signing & Capabilities tab, and we also added the NSMainCameraDescription in the Info.plist and allow access while opening the app. None of this works. Not on my app, and not on the sample app that I just downloaded and tried to run on the Vision Pro after replacing the dummy license file.
I have a VideoMaterial inside a RealityView and want to attach this to a DockingRegion inside an immersive environment.
It appears that adding the VideoMaterial entity as a child of the docking region somewhat works, but there are no lighting effects (specular, diffuse) from the playing video.
So essentially, how can you add a VideoMaterial to a DockingRegion and achieve the same reflections/behavior as using AVPlayerViewController.
The latter is not an option as I need custom controls.
I created an app where the user:
first scans a room using RoomCaptureView (RoomPlan)
then taps on physical elements (objects, walls...) using an ARView to record some 3d positions
I can handle taps in an ARView using a UITapGestureRecognizer and the ARView raycast(from:, allowing:, alignment:) method. This works fine, so I thought I could do the same using the ARView used by RoomCaptureView., so the user can scan a room and record some 3d positions at the same time. Sadly, this approach does not work, as the raycast method always returns nil.
What I actually need is mapping a tap on screen to a real-world position during RoomCaptureSession.
Does anyone know how to do this?
Hi everyone,
I’m having trouble with image anchoring when working on a project in Reality Composer and Reality Composer Pro. Here’s the issue:
1. What I’m Trying to Achieve:
I want to create an AR scene where an object anchors to an image I provide. I don't want to create an app for this but just use the USDZ File the Scene creates. The USDZ File then should be viewable via the various integrations of AR Quick Look across the Apple Ecosystem. The image anchoring works perfectly when I preview the scene inside Reality Composer using AR mode.
2. The Problem:
When I export the project (tried both USDZ and Reality formats) and open it on my iPhone using the Files app (which uses AR Quick Look), the image anchoring no longer works. The object doesn’t anchor to the provided image as expected. It just anchors to the first plane it recognizes and not the image.
3. What I’ve Tried:
Exporting the scene in USDZ format.
Exporting the scene in Reality format.
Both formats result in the same issue: no image anchoring outside of the Reality Composer environment.
Trying different images but all resulting in same manor that the image anchoring is not working
Tried different iOS Version but resulting in the same issue
4. Current Setup:
Reality Composer Pro version: 2.0
iPhone model: iPhone 13 Pro
iOS version: 18.1.
5. What I Need Help With:
Is there a way to ensure image anchoring works in exported files when opened via AR Quick Look?
Do I need to configure something specific during the export process?
Are there limitations in AR Quick Look that prevent image anchoring from functioning correctly?
Do i need to create an app to make this work?
I’d appreciate any advice or insights from the community. If anyone has experience with similar issues or knows of a workaround, please let me know!
Thanks in advance, Mav
Topic:
Spatial Computing
SubTopic:
ARKit
Tags:
ARKit
AR Quick Look
Reality Composer
Reality Composer Pro
Hello All,
We're going to do a scene now, kind of like a time travel door. When the user selects the scene, the user passes through the door to show the current scene. The changes in the middle need to be more natural. It's even better if you can walk through an immersive space...
There is very little information now. How can I start doing this? Is there any information I can refer to
thanks
I have been digging through the docs and the developer videos, and I have noticed a mention to RealityView having som potential limitations with anchors and world tracking. However, I haven’t been able to locate my answers.
Does anyone know (or point me to) if RealityView supports everything ARView does, and if not what are the difference?
I was fooling around with RealityView today with a simple plane anchor, and the stability of that anchor didn’t seem to be as steady as I recall ARView being In the past on iPhone.
I’m trying to determine if I should be rolling over into RealityView or stay with ARView on this little educational project. I would imagine the answer is to go RealityView, but I want to make sure I’m not setting myself up for failure based on any current limitations For anchors and world data.
Topic:
Spatial Computing
SubTopic:
ARKit
In the WWDC session titled "Deep dive into volumes and immersive spaces", the developers discussed adding a Spatial Tracking Session and an Anchor Entity to detect the floor. They then glossed over some important details. They added a spatial tap gesture to let the user place content relative to the floor anchor, but they left a lot of information.
.gesture(
SpatialTapGesture(
coordinateSpace: .immersiveSpace
)
.targetedToAnyEntity()
.onEnded { value in
handleTapOnFloor(value: value)
}
)
My understanding is that an entity has to have input and collision components for gestures like this to work. How can we add a collision to an AnchorEntity when we don't know its size or shape?
I've been trying for days to understand what is happening here and I just don't get it. It is even more frustrating that the example project that Apple released does not contain any of these features.
I would like to be able
Detect the floor plane
Get the position/transform of the floor plane
Add a collider to the floor plane
Enable collisions and physics on the floor plane
Enable gestures on the floor plane
It seems to me that the Anchor Entity is placed as an entirely arbitrary position. It has absolutely no relationship to the rectangle with the floor label that I can see in the Xcode visualization. It is just a point, not a plane or rect that I can use.
I've tried manually calculating the collision shape after the anchor is detected, but nothing that I have tried works. I can't tap on the floor with gestures. I can't drop entities onto the floor. I can't seem to do ANYTHING at all with this floor anchor other than place entity at the totally arbitrary location somewhere on the floor.
Is there anyway at all with Spatial Tracking Session and Anchor Entity to get the actual plane that was detected?
struct FloorExample: View {
@State var trackingSession: SpatialTrackingSession = SpatialTrackingSession()
@State var subject: Entity?
@State var floor: AnchorEntity?
var body: some View {
RealityView { content, attachments in
let session = SpatialTrackingSession()
let configuration = SpatialTrackingSession.Configuration(tracking: [.plane])
_ = await session.run(configuration)
self.trackingSession = session
let floorAnchor = AnchorEntity(.plane(.horizontal, classification: .floor, minimumBounds: SIMD2(x: 0.1, y: 0.1)))
floorAnchor.anchoring.physicsSimulation = .none
floorAnchor.name = "FloorAnchorEntity"
floorAnchor.components.set(InputTargetComponent())
floorAnchor.components.set(CollisionComponent(shapes: .init()))
content.add(floorAnchor)
self.floor = floorAnchor
// This is just here to let me see where visinoOS decided to "place" the floor anchor.
let floorPlaced = ModelEntity(
mesh: .generateSphere(radius: 0.1),
materials: [SimpleMaterial(color: .black, isMetallic: false)])
floorAnchor.addChild(floorPlaced)
if let scene = try? await Entity(named: "AnchorLabsFloor", in: realityKitContentBundle) {
content.add(scene)
if let subject = scene.findEntity(named: "StepSphereRed") {
self.subject = subject
}
// I can see when the anchor is added
_ = content.subscribe(to: SceneEvents.AnchoredStateChanged.self) { event in
event.anchor.generateCollisionShapes(recursive: true) // this doesn't seem to work
print("**anchor changed** \(event)")
print("**anchor** \(event.anchor)")
}
// place the reset button near the user
if let panel = attachments.entity(for: "Panel") {
panel.position = [0, 1, -0.5]
content.add(panel)
}
}
} update: { content, attachments in
} attachments: {
Attachment(id: "Panel", {
Button(action: {
print("**button pressed**")
if let subject = self.subject {
subject.position = [-0.5, 1.5, -1.5]
// Remove the physics body and assign a new one - hack to remove momentum
if let physics = subject.components[PhysicsBodyComponent.self] {
subject.components.remove(PhysicsBodyComponent.self)
subject.components.set(physics)
}
}
}, label: {
Text("Reset Sphere")
})
})
}
}
}
Subject: Combining ARKit Face Tracking with High-Resolution AVCapture and Perspective Rendering on Front Camera
Message:
Hello Apple Developer Community,
We’re developing an application using the front camera that requires both real-time ARKit face tracking/guidance and the capture of high-resolution still images via AVCaptureSession. Our goal is to leverage ARKit’s depth and face data to render a captured image from another perspective post-capture, maintaining high image quality.
Our Approach:
Real-Time ARKit Guidance:
Utilize ARKit (e.g., ARFaceTrackingConfiguration) for continuous face tracking, depth, and scene understanding to guide the user in real time.
High-Resolution Capture Transition:
At the moment of capture, we plan to pause the ARKit session and switch to an AVCaptureSession to take a high-resolution image.
We assume that for a front-facing image, the subject’s face is directly front-on, and the relative pose between the face and camera remains the same during the transition. The only variation we expect is a change in distance.
Our intention is to minimize the delay between the last ARKit frame and the high-res capture to maintain temporal consistency, assuming that aside from distance, the face-camera relative pose remains unchanged.
Post-Processing Perspective Rendering:
Using the last ARKit face data (depth, pose, and landmarks) along with the high-resolution 2D image, we aim to render the scene from another perspective.
We want to correct the perspective of the 2D image using SceneKit or RealityKit, leveraging the collected ARKit scene information to achieve a natural, high-quality rendering from a different viewpoint.
The rendering should match the quality of a normally captured high-resolution image, adjusting for the difference in distance while using the stored ARKit data to correct perspective.
Our Questions:
Session Transition Best Practices:
What are the recommended best practices to seamlessly pause ARKit and switch to a high-resolution AVCapture session on the front camera
How can we minimize user movement or other issues during this brief transition, given our assumption that the face-camera pose remains largely consistent except for distance changes?
Data Integration for Perspective Rendering:
How can we effectively integrate stored ARKit face, depth, and pose data with the high-res image to perform accurate perspective correction or rendering from another viewpoint?
Given that we assume the relative pose is constant except for distance, are there strategies or APIs to leverage this assumption for simplifying the perspective transformation?
Perspective Correction with SceneKit/RealityKit:
What techniques or workflows using SceneKit or RealityKit are recommended for correcting the perspective of a captured 2D image based on ARKit scene data?
How can we use these frameworks to render the high-resolution image from an alternative perspective, while maintaining image quality and fidelity?
4. Pitfalls and Guidelines:
What common pitfalls should we be aware of when combining ARKit tracking data with high-res capture and post-processing for perspective rendering?
Are there performance considerations, recommended thresholds for acceptable temporal consistency, or validation techniques to ensure the ARKit data remains applicable at the moment of high-res capture?
We appreciate any advice, sample code references, or documentation pointers that could assist us in implementing this workflow effectively.
Thank you!
Hello,
I was looking back into downloading the Tracking geographic locations in AR sample app from https://developer.apple.com/documentation/arkit/tracking-geographic-locations-in-ar
Unfortunately the Download links to the .zip of the DisplayingAPointCloudUsingSceneDepth sample project.
The exact same issue occurs when trying to download the sample code from https://developer.apple.com/documentation/ARKit/creating-a-fog-effect-using-scene-depth
Wondering if those links are deliberately broken because of possible deprecations.
Thanks to any Apple Engineer willing to look into that.
Hi folks, I’m new to Vision Pro stack, still trying to learn all the nuances. Here is a problem I can’t seem to find an answer.
I placed entity A( a small .02 radius sphere) inside entity B( size:.1 box). Both entities have HoverEffectComponent, and both inputcomponent is set to .direct. Entity A is NOT a child of Entity B. When I direct touch Entity B, I noticed that Entity A’s hover effect is fired as well. This only happens if Entity A‘s position is inside Entity B. The gesture that is only targeted at Entity A doesn’t work either. I double checked Entity A collider which sits inside entity B collider, my direct touch shouldn’t have trigger its hove effect. Having one collider inside another seems to produce unpredictable behavior? Thanks in advance 🙏🙏🙏
Context: I’m trying to create an invisible bound around Entity A, so when my hand approaches the bound to grab Entity A, a nice spotlight hover effect would fire first on the bound before hand reaching entity A.
I am developing an ARKit based application that requires plane detection of the tabletop at which the user is seated. Early testing was with an iPhone 8 and iPhone 8+. With those devices, ARKit rapidly detected the plane of the tabletop when it was only 8 to 10 inches away. Using iPhone 15 with the same code, it seems to require me to move the phone more like 15 to 16 inches away before detecting the plane of the table. This is an awkward motion for a user seated at a table. To validate that it was not necessarily a feature of my code, I determined that the same behavior results with Apple's sample AR Interaction application. Has anyone else experienced this, and if so, have suggestions to improve the situation?
Hello,
I have downloaded and run the sample object tracking app for visionos.
Now I'm working on my own objects for tracking. I have made a model using Create ML using images of my object.
However, I cannot see how to convert the Create ML output file (xxx.mlmodel) into a reference object like the files in the sample project.
is there a tool for converting them?
TIA
Topic:
Spatial Computing
SubTopic:
ARKit
As I understand it there are two ways I can track a hand, or a joint, in RealityKit:
either, create an AnchorEntity, for example AnchorEntity(.hand(.left, location: .palm))
or, set up an ARSession with a HandTrackingProvider ( a lot more code which I haven't repeated here).
Assuming this is correct, when would I want to use one over the other?
We applied for the visionOS enterprise permission license, which can help us improve object tracking capabilities on Vision Pro. However, we are unsure how to use it in Unity, specifically how to implement object tracking in Unity and increase the tracking speed.
In ARKit for visionOS, I can track the user's head with a HeadAnchor, but it will not give the location. However, I can get the device's transform by calling queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) on a WorldTrackingProvider.
Why the difference? - if I know the device's transform, I effectively know the head's transform.
I am using Entity of RealityKit to display virtual content, however I find that sometimes the real object in front of the virtual content can not occulude the virtual content.
For example, I place an Entity in a room, but when I walk into another room, I can still see the Entity through the wall.
I wonder how should I fix the problem. Thank you!
I am currently creating an app where two people share an instance of an immersive space so that they are able to point to certain things in the immersive space. Right now, other people are hidden behind the immersive space, and even with people awareness enabled for everything, people are still too difficult to see. I've found this documentation (https://developer.apple.com/documentation/arkit/occluding-virtual-content-with-people) which describes what I want to do, but it is only listed as working on iOS an iPadOS. Is there anything similar to this that will work on VisionOS?
Is it possible to detect distance from the vision pro to real live objects and people? I tried using scene.raycast to perform a raycast forward from the center of the viewport, but it doesn't seem to react to real life objects, only entities.
I see mentioned here: https://developer.apple.com/forums/thread/776807?answerId=829576022#829576022, that a raycast with scene reconstruction should allow me to measure that distance, as long as the object is non-moving. How could I accomplish that?