Hi everyone, I'm developing an MR Vision Pro app where I'd like to anchor virtual objects (such as UI elements) around the user's arm. However, I've noticed that Vision Pro masks out the area where the user's real arm is, hiding virtual content in that region so that you see your real arm.
Is there a way to render virtual elements on the user's arm—so that it looks like the object is placed directly on the arm despite the real-world passthrough? I was hoping there might be a way to adjust the depth or behavior of this masked-out region. Any insights or workarounds would be greatly appreciated! Thanks :)
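The closest control I've found so far is the scene-level upperLimbVisibility modifier, which (as I understand it) disables the hand/arm passthrough cutout entirely so virtual content draws over the limbs; I haven't found any per-region depth adjustment. A minimal sketch, where the "ArmUI" id and app name are placeholders of mine:

import SwiftUI
import RealityKit

@main
struct ArmUIApp: App {                      // "ArmUIApp" / "ArmUI" are placeholder names
    var body: some Scene {
        ImmersiveSpace(id: "ArmUI") {
            RealityView { content in
                // Add entities anchored near the hand/arm here
                // (e.g. via an AnchorEntity targeting the hand).
            }
        }
        // Hides the passthrough cutout for the user's hands and arms,
        // so virtual content renders on top of them (all-or-nothing).
        .upperLimbVisibility(.hidden)
    }
}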
AR / VR
Discuss augmented reality and virtual reality app capabilities.
Posts under AR / VR tag
Hi!
I attempted to run a sample project for detecting human pose in photos, which can be found here:
https://developer.apple.com/documentation/vision/detecting-human-body-poses-in-3d-with-vision
The project works perfectly when run on my MacBook Pro M1, but it fails on Apple Vision Pro. After selecting the photo, an endless loading screen is presented and the following output is produced in the console:
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Network path is nil: (null)
Failed to initialize 2D Detection Algorithm.
Failed to initialize 2D Pose Estimation Algorithm.
Failed to initialize algorithm modules
Unable to perform the request: Error Domain=com.apple.Vision Code=9 "Async status object reported as failed but without an error" UserInfo={NSLocalizedDescription=Async status object reported as failed but without an error}.
de-activating session 70138 after timeout
It seems that VNDetectHumanBodyPose3DRequest is failing on Vision Pro for some reason. Are there any additional requirements for running the Vision framework on visionOS that I might be missing?
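For reference, this is the minimal isolation I've been testing with outside the sample, in case the issue is in the sample's pipeline rather than the request itself (detectBodyPose3D is just my own helper name):

import Vision

// Run the 3D body-pose request directly against a CGImage.
func detectBodyPose3D(in image: CGImage) {
    let request = VNDetectHumanBodyPose3DRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    do {
        try handler.perform([request])
        if let observation = request.results?.first {
            // Estimated body height in meters and the root (hip) joint position.
            print("Body height: \(observation.bodyHeight)")
            if let root = try? observation.recognizedPoint(.root) {
                print("Root joint position: \(root.position)")
            }
        } else {
            print("No 3D body pose observations returned")
        }
    } catch {
        print("VNDetectHumanBodyPose3DRequest failed: \(error)")
    }
}

As far as I can tell, this is the same request the sample drives internally.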
When I've made an animated USDZ, at what frame rate will the animation be rendered in QuickLook? Is it the same across all devices (iPhone, Apple Vision Pro, etc.) and viewing environments (QuickLook, inside an ARView, etc.)?
Suppose I export my file at 30 fps and the device draws at 60 fps: does the device interpolate between frames automatically, animate at a lower frame rate, or play it at twice the speed? What if it were 24 fps?
My primary concern with understanding frame rates is a bit of trouble I've had making perfectly looping animations. There always seems to be the slightest stutter between iterations.
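For context, when I load the same USDZ in RealityKit (rather than relying on QuickLook's automatic playback), I loop it explicitly like this, a sketch assuming the file's first available animation is the loop I exported:

import RealityKit

// Play the USDZ's baked animation as an endless loop.
func playLoop(on entity: Entity) {
    guard let clip = entity.availableAnimations.first else {
        print("No animations found in the USDZ")
        return
    }
    // repeat() loops the clip indefinitely; a zero transition duration avoids
    // cross-fade blending at the loop point, which can read as a stutter.
    entity.playAnimation(clip.repeat(), transitionDuration: 0, startsPaused: false)
}

Even with this I still see a slight hitch between iterations, which is what made me wonder about frame-rate conversion.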
Thanks in advance for any insights you're able to provide!
Hi there,
I'm trying to merge the mesh anchors into a single mesh, but couldn't find any resources on this. Here is the code where I make a mesh from each mesh anchor and assign it to a model component with a shader graph material.
func run(_ sceneRec: SceneReconstructionProvider) async {
    for await update in sceneRec.anchorUpdates {
        switch update.event {
        case .added, .updated:
            // Get or create entity for this anchor
            let anchorEntity = anchors[update.anchor.id] ?? {
                let entity = ModelEntity()
                root?.addChild(entity)
                anchors[update.anchor.id] = entity
                return entity
            }()

            // Remove any existing children
            for child in anchorEntity.children {
                child.removeFromParent()
            }

            // Generate the mesh and collision shape from the anchor
            guard let mesh = try? await MeshResource(from: update.anchor) else { continue }
            guard let shape = try? await ShapeResource.generateStaticMesh(from: update.anchor) else { continue }
            print("Mesh added, vertices: \(update.anchor.geometry.vertices.count), bounds: \(mesh.bounds)")

            // Get the material to use
            var material: RealityKit.Material
            if isMaterialLoaded, let loadedMaterial = self.shaderMaterial {
                material = loadedMaterial
            } else {
                // Use a temporary material until the shader loads
                var tempMaterial = UnlitMaterial()
                tempMaterial.color = .init(tint: .purple.withAlphaComponent(0.5))
                material = tempMaterial
            }

            await MainActor.run {
                anchorEntity.components.set(ModelComponent(mesh: mesh, materials: [material]))
                anchorEntity.setTransformMatrix(update.anchor.originFromAnchorTransform, relativeTo: nil)

                // Add collision component with static flag - required for spatial interactions
                anchorEntity.components.set(CollisionComponent(
                    shapes: [shape],
                    isStatic: true,
                    filter: .default
                ))

                // Make entity interactive - enables spatial taps, drags, etc.
                anchorEntity.components.set(InputTargetComponent())

                let shadowComponent = GroundingShadowComponent(
                    castsShadow: true,
                    receivesShadow: true
                )
                anchorEntity.components.set(shadowComponent)
            }

        case .removed:
            // Clean up entities for anchors that were removed
            anchors[update.anchor.id]?.removeFromParent()
            anchors.removeValue(forKey: update.anchor.id)
        }
    }
}
I then use a spatial tap gesture to set the position parameter in the shader graph material that creates a nice gradient from the tap position on the mesh to the rest of the mesh.
SpatialTapGesture()
    .targetedToAnyEntity()
    .onEnded { value in
        let tappedEntity = value.entity

        // Check if the tapped entity is a child of tracking.meshAnchors
        if isChildOfMeshAnchors(entity: tappedEntity) {
            // Get local position (in the entity's coordinate space)
            let localPosition = value.location3D
            // Convert to world position (scene coordinate space)
            let worldPosition = value.convert(localPosition, from: .local, to: .scene)
            print("Tapped mesh anchor at local position: \(localPosition)")
            print("Tapped mesh anchor at world position: \(worldPosition)")

            // Update the material parameter with the tap position
            updateMaterialTapPosition(entity: tappedEntity, position: worldPosition)
        } else {
            print("Tapped entity is not a mesh anchor")
        }
    }
My issue is that, because there are several mesh anchors, the gradient often gets cut off at the edge of the mesh generated from a single mesh anchor, as opposed to a nice continuous gradient across the entire reconstructed scene mesh. I couldn't find any documentation on how to merge meshes from mesh anchors, so any tips would be helpful. Thank you!
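One workaround I've been sketching (not an actual merge): since the tap position is already in world space, push the same parameter value into the shader graph material of every anchor entity, so all the per-anchor meshes evaluate one continuous gradient field. This is a variant of my updateMaterialTapPosition that broadcasts instead of updating only the tapped entity; the "TapPosition" parameter name and the anchors dictionary are my own:

// Broadcast the world-space tap position to every anchor's material.
@MainActor
func updateMaterialTapPosition(entity: Entity, position: SIMD3<Float>) {
    // Ignore which anchor was tapped; all anchors get the same world-space value.
    for anchorEntity in anchors.values {
        guard var model = anchorEntity.components[ModelComponent.self],
              var material = model.materials.first as? ShaderGraphMaterial else { continue }
        // "TapPosition" must match the parameter name promoted in the Shader Graph.
        try? material.setParameter(name: "TapPosition", value: .simd3Float(position))
        model.materials = [material]
        anchorEntity.components.set(model)
    }
}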
Hello,
I am developing a visionOS application and am interested in obtaining detailed data about the user's hands through ARKit, including but not limited to the transform and rotation angle. I have reviewed the Happy Beam sample, but it only appears to cover recognizing the user's specific gestures.
Could you please advise on how to obtain the Transform and rotation angle of the user’s hand?
Thank you.
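In case it helps make the question concrete, this is my current understanding of the ARKit route on visionOS: run a HandTrackingProvider in an ARKitSession and read each HandAnchor's transform (the rotation can be extracted from the 4x4 matrix as a quaternion). A sketch, assuming hand-tracking authorization has been granted:

import ARKit
import simd

// Stream hand anchors and print the wrist transform / orientation.
func trackHands() async throws {
    let session = ARKitSession()
    let handTracking = HandTrackingProvider()
    try await session.run([handTracking])

    for await update in handTracking.anchorUpdates {
        let hand = update.anchor

        // World-space transform of the hand anchor (origin at the wrist).
        let originFromHand = hand.originFromAnchorTransform
        let rotation = simd_quatf(originFromHand)   // rotation part as a quaternion

        print("\(hand.chirality) hand transform: \(originFromHand)")
        print("\(hand.chirality) hand rotation: \(rotation.angle) rad about \(rotation.axis)")

        // Individual joint transforms, relative to the hand anchor.
        if let skeleton = hand.handSkeleton {
            let indexTip = skeleton.joint(.indexFingerTip)
            print("Index tip (anchor space): \(indexTip.anchorFromJointTransform)")
        }
    }
}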
Hi there,
I'm building a workplace experience that requires using the virtual desktop. Is there a way to launch it from my code, so the user doesn't have to do it manually?
Thanks in advance!
I have a warning on my AR app Virtual Tags: since July it no longer shows its camera view, and the console reports:
"ARCL.SceneLocationView implements focusItemsInRect: - caching for linear focus movement is limited as long as this view is on screen."
I tried implementing the method, but the documentation does not explain what it should return or how to generate it. Is someone able to help me?
Thanks,
Hi folks, I’m new to Vision Pro stack, still trying to learn all the nuances. Here is a problem I can’t seem to find an answer.
I placed Entity A (a small sphere, radius 0.02) inside Entity B (a box, size 0.1). Both entities have a HoverEffectComponent, and both InputTargetComponents are set to .direct. Entity A is NOT a child of Entity B. When I direct-touch Entity B, I notice that Entity A's hover effect fires as well. This only happens if Entity A's position is inside Entity B. A gesture targeted only at Entity A doesn't work either. I double-checked Entity A's collider, which sits inside Entity B's collider; my direct touch shouldn't have triggered its hover effect. Does having one collider inside another produce unpredictable behavior? Thanks in advance 🙏🙏🙏
Context: I'm trying to create an invisible bound around Entity A, so when my hand approaches the bound to grab Entity A, a nice spotlight hover effect fires on the bound before my hand reaches Entity A.
Hello,
We've been working for months now on an App for the Vision Pro.
(it's been great btw!)
We already have an App in the App Store for iOS, and have been migrating our platform from the Microsoft Hololens 2 to the AVP:
https://apps.microsoft.com/detail/9NPPP031VHD1
We require the Main Camera access and already have gotten the Enterprise.license for development purposes.
Unfortunately, we cannot publish our Business App (which uses an Enterprise API) under the same Name/Bundle ID as our iOS App because it would conflict with our current Distribution Method.
We arrived at the conclusion that we need a new Enterprise.license under a different Bundle ID to create a new App for the Business Store.
Has anyone been in the same boat as us, and tried to publish to the Business Store while already having an App in the Public App Store under the same name?
We applied to get another license for distribution under another name (with "Pro" at the end), but it's been stuck in limbo for over a month now (probably because the new bundle ID doesn't have any track record).
Anyhow, thanks for any help, we're open to suggestions as to how to proceed!
My coworkers and I are guessing at what data defines an anchor. I tried searching but struggled to find anything helpful.
Our best guess was a combination of Triangulated Irregular Networks (TINs), GPS, magnetic compass direction, and maybe elevation sensors.
Is this documented anywhere? If not, can a definition or description be provided?
Description:
I'm developing an AR effect using SceneKit and applying a transparent material to a face mesh. However, I'm facing an issue where the front faces of the mesh overlap each other, causing incorrect rendering.
Problem:
The front faces of the mesh overlap with each other when transparency is applied.
This causes areas like the cheeks to be visible through the nose, even though they should be occluded.
Expected Behavior: The material should behave as if it were opaque to itself—that is, overlapping front faces should be occluded properly, while still allowing transparency for background elements.
Actual Behavior: The mesh renders its own front faces incorrectly, making parts of the face visible through others when they should be blocked.
What I Have Tried:
testMaterial.writesToDepthBuffer = true
testMaterial.readsFromDepthBuffer = true
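I also experimented with a manual two-pass setup along these lines (just a sketch; faceNode is my name for the node holding the face geometry, and testMaterial is the transparent material from above):

// Pass 1: a depth-only clone of the face, drawn first. It writes depth
// but no color, so it pre-fills the depth buffer with the nearest surfaces.
let depthOnlyMaterial = SCNMaterial()
depthOnlyMaterial.colorBufferWriteMask = []
depthOnlyMaterial.writesToDepthBuffer = true

let prePassNode = faceNode.clone()
prePassNode.geometry = faceNode.geometry?.copy() as? SCNGeometry
prePassNode.geometry?.materials = [depthOnlyMaterial]
prePassNode.renderingOrder = -1            // render before the visible face
faceNode.parent?.addChildNode(prePassNode)

// Pass 2: the visible transparent material, depth-tested against the pre-pass,
// so geometry seen through the surface (cheek behind the nose) is rejected.
testMaterial.readsFromDepthBuffer = true
testMaterial.writesToDepthBuffer = false
faceNode.renderingOrder = 0

This mimics what I understand a depth pre-pass / ZWrite-only pass to be, but I'm not sure it's the intended SceneKit approach, hence the questions below.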
Question:
👉 How can I prevent SceneKit's transparent material from rendering overlapping front faces?
👉 Is there a way to force SceneKit to treat its own mesh as opaque for itself while still being transparent to the background?
👉 Does SceneKit support a proper depth pre-pass or an equivalent to Unity’s ZWrite shaders to solve this issue?
Attached screenshots demonstrate the problem visually. Any help would be greatly appreciated! 🚀
We are working on a project for a hackathon for which we need access to 3D models of the Apple Watch; it would help make our project more realistic and effective. I wanted to know if we can get access to them.
In visionOS, there are existing modifiers that can completely conceal the hands. However, I am interested in learning how to achieve the effect of only one hand disappearing while the other hand remains visible.
.upperLimbVisibility(.hidden)
I’m working on an iOS app that needs to measure the area of planes or surfaces, like the length and width of objects, just like the Apple Measure app does. I’ve been exploring ARKit, but I’m curious if there are any APIs or techniques that can help automate the process of detecting and measuring planes.
Specifically, I'm looking for a way to automatically detect and measure planes (e.g., from a top-down view), for example measuring a box's width and length. I have attached a screenshot and a video of the Apple Measure app doing it.
Does Apple provide any tools or APIs for this, or are there any best practices I should know about? I’d love to hear from anyone who’s tackled something similar.
Video: https://drive.google.com/file/d/1BxM7fIbFxsCsYwY7w8ZxIeq_4WTGkkwA/view?usp=drive_link
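For reference, the closest I've gotten to "automatic" measurement is plane detection plus the anchor's fitted extent; a minimal sketch (PlaneMeasurer is my own wrapper, and as far as I can tell this gives the fitted plane size rather than the tight per-object box the Measure app shows):

import ARKit

// Detect planes and report their estimated dimensions as ARKit refines them.
final class PlaneMeasurer: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal, .vertical]
        session.delegate = self
        session.run(configuration)
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let plane as ARPlaneAnchor in anchors {
            // planeExtent is the fitted size of the detected plane, in meters (iOS 16+).
            let width = plane.planeExtent.width
            let length = plane.planeExtent.height
            print("Plane \(plane.identifier): \(width) m x \(length) m (\(plane.classification))")
        }
    }
}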
Hi,
We are trying to port our Unity app from other XR devices to Vision Pro. Thus it's way easier for us to use the Metal rendering layer, fully immersive. And to stay true to the platform, we want to keep the gaze/pinch interaction system.
But we just noticed that, unlike PolySpatial XR apps, visionOS XR with Metal does not provide gaze info unless the user is actively pinching, which rules out any attempt to give visual feedback on what they are looking at (buttons, etc.).
Is this planned on Apple's roadmap?
Thanks
In visionOS, once an immersive space is opened, the background is solid black. Is it possible to make this background transparent?
FYI, the immersive space uses Compositor Services for drawing the 3D content.
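From what I can tell (and I'd love confirmation), the black backdrop comes from running the Compositor Services layer in the full immersion style; on visionOS 2 and later a Metal immersive space can also run in the .mixed style, where pixels left fully transparent show passthrough instead of black. A sketch of the scene setup I mean, with placeholder names:

import SwiftUI
import CompositorServices

@main
struct MetalImmersiveApp: App {                     // placeholder app/scene names
    @State private var immersionStyle: ImmersionStyle = .mixed

    var body: some Scene {
        ImmersiveSpace(id: "MetalSpace") {
            CompositorLayer { layerRenderer in
                // Hypothetical render loop: clear the color texture with
                // alpha = 0 wherever passthrough should remain visible.
                runRenderLoop(layerRenderer)
            }
        }
        // Requesting .mixed keeps passthrough visible behind rendered content
        // instead of the solid black backdrop of .full.
        .immersionStyle(selection: $immersionStyle, in: .mixed, .full)
    }
}

// Placeholder for the Metal frame loop (command queue, frame timing, draws).
func runRenderLoop(_ layerRenderer: LayerRenderer) { /* elided */ }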
I have recently started testing ARKit on an iPhone 16 Pro and have noticed that the autofocus reaction on this device is much slower than on other devices. For example, if I point the camera at a close object, autofocus takes 4-5 seconds to stabilize; the focal length is adjusted very slowly. In some cases (although this is rare) autofocus seems almost stuck and requires a bit of device movement to trigger.
This is quite problematic when using some ARKit features like Image and Object detection as the detection algorithms struggle with out-of-focus images.
This problem is limited to ARKit. AutoFocus is significantly more responsive when the standard AVFoundation Camera API is used.
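For the record, the only related knob I'm aware of is ARWorldTrackingConfiguration.isAutoFocusEnabled (it defaults to true); I've been toggling it to compare behavior, e.g.:

import ARKit

// Diagnostic helper: run the same session with autofocus explicitly on or off.
func runSession(_ session: ARSession, autoFocus: Bool) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.isAutoFocusEnabled = autoFocus      // default is true
    configuration.planeDetection = [.horizontal]
    session.run(configuration, options: [.resetTracking])
}

Disabling it presumably avoids the hunting, but a fixed focus obviously won't help close-range detection either, so it isn't a real fix.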
This behavior is easy to reproduce with any of the ARKit samples like https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/tracking_and_visualizing_planes
Is anybody else experiencing this problem?
Hi team, I don't know where this error has popped up from. Any suggestions, please?
Thanks
Zippy Games
Suppose there is an immersive space, and an Entity() added to the space as a child entity of the content. This entity is responsible for playing background music: it calls prepareAudio, gets back a controller, and plays the music (see the basic code below).
While the music is playing, a .plain window and the immersive space are both presented. I believe the immersive space holds the controller, so as long as the immersive space is open, the music shouldn't stop.
However, if I close the .plain window (via the system-level close button), the music just stops, even though the immersive space is still open. If I check controller.isPlaying at that point, it is still true, but you cannot hear the music anymore.
To reproduce, open a visionOS template app project (selecting Volume and Full immersive space), replace some code in ImmersiveView.swift with the code below, and drag in any .mp3 file, updating the AudioFileResource name to match.
RealityView { content in
    // Add the initial RealityKit content
    if let immersiveContentEntity = try? await Entity(named: "Immersive", in: realityKitContentBundle) {
        content.add(immersiveContentEntity)

        // Put skybox here. See example in World project available at
        // https://developer.apple.com/

        if let audioResource = try? await AudioFileResource(named: "anyMP3file.mp3") {
            let ent = Entity()
            immersiveContentEntity.addChild(ent)
            let controller = ent.prepareAudio(audioResource)
            controller.play()
        }
    }
}
I wonder why this happens. More importantly, how should I keep the music playing when I close the .plain window?
Thanks!