In ARKit for visionOS, I can track the user's head with a HeadAnchor, but it will not give the location. However, I can get the device's transform by calling queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) on a WorldTrackingProvider.
Why the difference? - if I know the device's transform, I effectively know the head's transform.
Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I have been concentrating on developing the visionOS application. While I am currently quite familiar with RealityKit, CompositorServices has also captured my attention. I have not yet acquired knowledge of CompositorServices. Could you please clarify whether it is essential for me to learn CompositorServices? Additionally, I would appreciate it if you could provide insights into the advantages of RealityKit and CompositorServices.
I'm trying to run a PhotogrammetrySession based on photos taken in an AVCaptureSession and stored as .heic files.
When I load the files I'm always seeing the error "Sample 0 missing LiDAR point cloud!" showing up for each individual sample.
Debugging shows that sample.depthDataMap is populated, also the .heic contains depth data which can be extracted using e.g. heif-convert on my Mac.
Comparing the .heic I created to one of the ObjectCaptureSession which doesn't show the LiDAR warning, I noticed the only difference being the HEIC information here:
So my questions are:
Are these the missing information in my manual capture causing this warning?
Can I somehow add these information in an AVCaptureSession?
Do these information allow better photogrammetry results?
We're trying to switch from using main camera access on Arkit to screen-capture with passthrough however we're facing some issues and it seems a bit complicated to debug.
We have set up a broadcast Extension, set up some logs on the sample Handler but we get nothing in the console nor that the recording starts, we set up the picker as well and we can see our extension in the control center as one of the choices but clicking start, results in it stopping in less than one second after.
The only message that is rather contradictory we see in the console.app is the following
[INFO] -[RPRecordingManager getSystemBroadcastExtensionInfo:]_block_invoke:1333 Extension has passthrough license
and just right after
[INFO] -[RPRecordingManager getSystemBroadcastExtensionInfo:]_block_invoke:1336 Extension does not have passthrough license
Goal: To render in an apple vision pro app, the solid-mechanics 3D simulation results coming form an FEA code.
Starting point: I have surface vtks with deformations on each node. Each time step has a a mesh with the nodal coordinates. This is straighforward translatable to a usd MeshSequence. Unfortunately, the results cannot be simplified to a scaling o linear transformation as you would do with other game-oriented animations.
Tools: Right now, I am using Xcode and reality composer pro (RCP) to build the scenes.
Technical limitations: I am aware that RCP can do animations with BlendMesh and skeletons and that MeshSequence is not a problem.
Progress:
Coverting to the sequence of vtk meshes to a usd MeshSequence is straighforward. This animates correctly in Preview and Blender (see screenshot).
I managed to convert from MeshSequence to multiple keys and BlendMesh. This also animates correctly in Blender and preview. Unfortunately, the BlendMesh of multiple blended meshes shows a zero animation time in RCP (see screenshot below)
Also, see below usda file scheme for the animation. Of course I am not showing full vectors such as faceVertexCounts, faceVertexIndex, normals.
Question: what is the right set up to create a BlendMesh animation that RCP will correctly import and animate, form a set of Meshes or multiple key shapes?
Blender animation
Time zero RCP "animations"
#usda 1.0
(
defaultPrim = "BlendMeshRoot"
doc = "Blender v4.5.3 LTS"
endTimeCode = 48
framesPerSecond = 24
metersPerUnit = 1
startTimeCode = 0
timeCodesPerSecond = 24
upAxis = "Z"
)
def Xform "BlendMeshRoot" (
customData = {
dictionary Blender = {
bool generated = 1
}
}
)
{
def SkelRoot "Mesh"
{
custom string userProperties:blender:object_name = "Mesh"
float3 xformOp:rotateXYZ = (89.99999, -0, 0)
float3 xformOp:scale = (0.009999999, 0.01, 0.01)
double3 xformOp:translate = (0, 0, 0)
uniform token[] xformOpOrder = ["xformOp:translate", "xformOp:rotateXYZ", "xformOp:scale"]
def Mesh "Mesh" (
active = true
prepend apiSchemas = ["MaterialBindingAPI", "SkelBindingAPI"]
)
{
uniform bool doubleSided = 1
float3[] extent = [(25.091871, -34.121277, -13.298501), (299.94482, 245.10088, 202.35126)]
int[] faceVertexCounts = [3, 3, ...
int[] faceVertexIndices = [0, 10293, ...
rel material:binding = </BlendMeshRoot/_materials/MeshSequence_Default>
normal3f[] normals = [(-0.3632836, -0.9102419, -0.19870725), ....
point3f[] points = [(244.41148, 155.42062, 70.454926),.....
float3[] primvars:node_displacement = [(93.54703, 110.9341, 48.37992)....
float3[] primvars:Normals = [(-0.0050530406, -0.9910114, -0.13368203),...
int[] primvars:skel:jointIndices = [0, 0, 0, 0, 0 ...
float[] primvars:skel:jointWeights = [1, 1, 1, 1, 1...
uniform token[] skel:blendShapes = ["frame_0000", "frame_0001", "frame_0002", "frame_0003", "frame_0004", "frame_0005"]
rel skel:blendShapeTargets = [
</BlendMeshRoot/Mesh/Mesh/frame_0000>,
.......
</BlendMeshRoot/Mesh/Mesh/frame_0005>,
]
prepend rel skel:skeleton = </BlendMeshRoot/Mesh/Skel>
uniform token subdivisionScheme = "none"
custom string userProperties:blender:data_name = "Mesh"
custom float userProperties:originalTime
float userProperties:originalTime.timeSamples = {
0: 0,
}
def BlendShape "frame_0000"
{
uniform vector3f[] offsets = [(0, 0, 0), (0, 0, 0),.....
uniform int[] pointIndices = [0, 1, 2, .....
}
.....
.....
#### BlendShape frame to 0005
.....
def Skeleton "Skel" (
prepend apiSchemas = ["SkelBindingAPI"]
)
{
uniform matrix4d[] bindTransforms = [( (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1) )]
uniform token[] joints = ["joint1"]
uniform matrix4d[] restTransforms = [( (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1) )]
prepend rel skel:animationSource = </BlendMeshRoot/Mesh/Skel/Anim>
def SkelAnimation "Anim"
{
uniform token[] blendShapes = ["frame_0000", "frame_0001", "frame_0002", "frame_0003", "frame_0004", "frame_0005"]
float[] blendShapeWeights.timeSamples = {
0: [1, 0, 0, 0, 0, 0],
1: [0.9697085, 0.03029152, 0, 0, 0, 0],
2: [0.88787615, 0.11212383, 0, 0, 0, 0],
.....
46: [0, 0, 0, 0, 0.11212379, 0.8878762],
47: [0, 0, 0, 0, 0.030291557, 0.96970844],
48: [0, 0, 0, 0, 0, 1],
}
}
}
}
def Scope "_materials"
{
def Material "MeshSequence_Default"
{
token outputs:surface.connect = </BlendMeshRoot/_materials/MeshSequence_Default/Principled_BSDF.outputs:surface>
custom string userProperties:blender:data_name = "MeshSequence_Default"
def Shader "Principled_BSDF"
{
uniform token info:id = "UsdPreviewSurface"
float inputs:clearcoat = 0
float inputs:clearcoatRoughness = 0.03
color3f inputs:diffuseColor = (0.8, 0.4, 0.3)
float inputs:ior = 1.5
float inputs:metallic = 0
float inputs:opacity = 1
float inputs:roughness = 0.5
float inputs:specular = 0.2
token outputs:surface
}
}
}
def Scope "AnimationClips"
{
custom rel animations = </BlendMeshRoot/Mesh/Skel/Anim>
}
def RealityKitComponent "AnimationLibrary"
{
custom rel animations = </BlendMeshRoot/Mesh/Skel/Anim>
custom token info:id = "RealityKit.AnimationLibrary"
custom double realitykit:approximateDuration = 2
custom double[] realitykit:clipDurations = [2]
custom string[] realitykit:clipNames = ["Anim"]
custom rel realitykit:clipTargets = </BlendMeshRoot/Mesh/Skel/Anim>
custom double realitykit:frameRate = 24
custom bool realitykit:isAnimationLibrary = 1
}
}
Topic:
Spatial Computing
SubTopic:
Reality Composer Pro
Tags:
Swift Packages
Developer Tools
Reality Converter
Reality Composer
Hi, I'm playing now with hand tracking. I want to get position of hand inside a system update function. I was not sure if transform I'm getting from hand attached AnchorEntity (with trackingMode: .predicted) would give same results as handAnchors(at:) from hand tracking provider, so I started to read them both and compare. For handAnchors i tried using context.scene.timebase.sourceTimebase!.sourceClock!.time.seconds and CACurrentMediaTime() as timestamp source. They seem to use exactly same clock, so that doesn't matter, but:
for some reason update handler is always called twice with same context.deltaTime, but first time the query finds 0 entities, second time it finds them all. The query is the standard EntityQuery(where: .has(MyComponent.self)) and in update (matching: Self.query, updatingSystemWhen: .rendering). Here's part of logs:
System update called, entity count: 0, dt: 0.01000458374619484, absTime: 4654.222593541
System update called, entity count: 11, dt: 0.01000458374619484, absTime: 4654.22262525
System update called, entity count: 0, dt: 0.009999999776482582, absTime: 4654.249390875
System update called, entity count: 11, dt: 0.009999999776482582, absTime: 4654.249425
accounting for the double update calling I started to calculate time delta of absolute time between calls and they're most of the time much bigger, or much smaller than advertised by system's context.deltaTime, only sometimes they kind of match, for example:
system: (dt: 0.01000458374619484)
scene : (dt: 0.021419291667371) (absTime: 4654.222628125001)
and the very next call
system: (dt: 0.010009 166784584522)
scene : (dt: 0.0013097083328830195) (absTime: 4654.223937833334)
but sometimes
system: (dt: 0.009999999776482582)
scene : (dt: 0.009 112249999816413) (absTime: 4654.351299 166668)
Shouldn't those be more or less equal, or am I missing something?
In the end it seems that getting hand position from AnchorEntity and with handAnchors(at:) gives kind of same results, but at different time points, so I'd love to understand what's the correct way to use them and why time flows differently :).
--Edit--
P.S. Had to put spaces everywhere in logs between "9" and "1", otherwise post was blocked due to "sensitive content" :D
This is no longer highlighting my entity when looking at it:
RealityView { content
let hoverComponent = HoverEffectComponent(.spotlight(
HoverEffectComponent.SpotlightHoverEffectStyle(
color: .white, strength: 2.0
)
))
entity.components.set(hoverComponent)
The entity is in a window. The same code works in an immersive view.
Collision Component and Input type are set in RCP.
It's also stopped working on my published app (built under visionOS 2.x) using my visionOS 26 device.
If I use a 2.x simulator, it works.
Is this a bug or is there something I'm missing?
Thanks.
Hi,
I am in the process of implementing SharePlay into our app. The shared experience opens an Immersive Space and we set systemCoordinator.configuration.supportsGroupImmersiveSpace = true
Now visionOS establishes a shared coordinate space for the immersive space.
From the docs:
To achieve consistent positioning of RealityKit entities across multiple devices in an immersive space during a SharePlay session
There are cases where we want to position content in front of the user (independent of the shared session, and for each user individually). Normally to do that we use the transform retrieved via worldTrackingProvider.queryDeviceAnchor.originFromAnchorTransform
to position content in front of the user (plus some Z Offset and smooth interpolation).
This works fine in non-SharePlay instances and the device transform is where I would expect it to be but during the FaceTime call deviceAnchor.originFromAnchorTransform seems to use the shared origin of the immersive space and then I end up with a transform that might be offset.
Here is a video of the issue in action: https://streamable.com/205r2p
The blue rect is place using AnchorEntity(.head, trackingMode: .continuous). This works regardless of the call and the entity is always placed based on the head position.
The green rect is adjusted on every frame using the transform I get from worldTrackingProvider.queryDeviceAnchor. As you can see it's offset.
Is there any way I can query query this transform locally for the user during a FaceTime call?
Also I would like to know if it's possible to disable this automatic entity transform syncing behavior?
Setting entity.synchronization = nil results in the entity not showing up at all.
https://developer.apple.com/documentation/realitykit/synchronizationcomponent
Is SynchronizationComponent only relevant for the legacy MultiPeerConnectivity approach?
Thank you!
Hi I know it's possible to play equirectangular VR180 video either SBS or MV-HEVC. And for fisheye video, the only way I know is to convert it into an AIVU for playback.
Is there any way to directly play fisheye video using AVPlayer? Thanks a lot!
Hi I have a monitoring app, that will take input video from uvc and process it using Metal, and eventually get a MTLTexture.
The problem I'm facing is I have to convert MTLTexture to CGImage then call TextureResource.replace, which is super slow. Metal processing speed is same as input frame rate(50pfs), but MTLTexture -> CGImage -> TextureResource only got 7fps...
Is there any way I can make it faster?
Topic:
Spatial Computing
SubTopic:
General
Tags:
Media Player
Frameworks
Media Accessibility
Core Media
The landing page for visionOS 26 mentions
The Unified Coordinate Conversion API makes moving views and entities between scenes straightforward — even between views and ARKit accessory anchors.
This WWDC session very briefly shows a single example of using this, but with no context. For example, they discuss a way to tell the distance between a Model3D and an entity in a RealityView. But they don't provide any details for how they are referencing the entity (bolts in the slide).
The session used the BOT-anist example project that we saw in visionOS 2, but the version on in the Sample Code library has not been updated with these examples.
I was able to put together a simple example where we can get the position of a window relative to the world origin. It even updates when the user recenters.
struct Lab080: View {
@State private var posX: Float = 0
@State private var posY: Float = 0
@State private var posZ: Float = 0
var body: some View {
GeometryReader3D { geometry in
VStack {
Text("Unified Coordinate Conversion")
.font(.largeTitle)
.padding(24)
VStack {
Text("X: \(posX)")
Text("Y: \(posY)")
Text("Z: \(posZ)")
}
.font(.title)
.padding(24)
}
.onGeometryChange3D(for: Point3D.self) { proxy in try! proxy
.coordinateSpace3D()
.convert(value: Point3D.zero, to: .worldReference)
} action: { old, new in
posX = Float(new.x)
posY = Float(new.y)
posZ = Float(new.z)
}
}
}
}
This is all that I've been able to figure out so far. What other features are included in this new Unified Coordinate Conversion?
Can we use this to get the position of one window relative to another? Can we use this to get the position of a view in a window relative to an entity in a RealityView, for example in a Volume or Immersive Space? What else can Unified Coordinate Conversion do?
Are there documentation pages that I'm missing? I'm not sure what to search for. Are there any Sample projects that use these features? Any additional information would be very helpful.
Topic:
Spatial Computing
SubTopic:
General
I am using Entity of RealityKit to display virtual content, however I find that sometimes the real object in front of the virtual content can not occulude the virtual content.
For example, I place an Entity in a room, but when I walk into another room, I can still see the Entity through the wall.
I wonder how should I fix the problem. Thank you!
Hi everyone,
I’m building a visualization app for VisionPro that uses SharePlay and GroupActivities to explore datasets collaboratively.
I’ve successfully implemented the new SharedWorldAnchor feature, and everything works well with nearby, local participants.
However, I’m stuck on one point:
How can I share a world anchor with remote participants who join via FaceTime as spatial personas?
Apple’s demo app (where multiple users move a plane model around) seems to suggest that this is possible.
For context, I’m building an immersive app with Metal rendering.
Any guidance or examples would be greatly appreciated!
Thanks,
Jens
Hello,
I'm currently trying to make a collaborative app. But it just works only on Reality View, when I tried to use Compositor Layer like below, the personas disappeared.
ImmersiveSpace(id: "ImmersiveSpace-Metal") {
CompositorLayer(configuration: MetalLayerConfiguration()) { layerRenderer in
SpatialRenderer_InitAndRun(layerRenderer)
}
}
Is there any potential solution too see Personas in Metal view?
Thanks in advance!
Seeing this magical sand table, the unfolding and folding effects are similar to spreading out cards, which is very interesting. But I don't know how to achieve it. I want to see if there are any ways to achieve this effect and give some ideas. May I ask if this effect can be achieved under the existing API
I downloaded the file through Scoot, and when I remove VisionPro, the app will call the StreamDelegate method and return ". endEncountered". How can I solve this problem?
Thank you!
I've been struggling with this for far too long so I've decided to finally come here and see if anyone can point me to the documentation that I'm missing. I'm sure it's something so simple but I just can't figure it out.
I can SharePlay our test app with my brother (device to device) but when I open a volumetric window, it says "not shared" under it. I assume this will likely fix the video sharing problem we have as well. Everything else works so smooth but SharePlay has just been such a struggle for me. It's the last piece to the puzzle before we can put it on the App Store.
Hello, I am trying to develop an app that broadcasts what the user sees via Apple Vision Pro. I am a graduate student studying at the university.
And I have two problems,
If I want to use passthrough in screen capture (in VisionOS), do I have to join Apple Developer Enterprise Program to get Enterprise API?
and Can I buy Apple Developer Enterprise Program (Enterprise API) with my university account?
Have any of you been able to do this?
Thank you
So I am exporting a .usdc file from blender that already has some morph animations. The animations play well in blender but when I export I cannot seem to play them in RealityKit or RCP.
Entity.availableAnimations is an empty array.
Not of the child objects in the entity hierarchy has an animation library component with it.
Maybe I am exporting it wrong but I tried multiple combinations but doesn't seem to work.
Here are my export settings in blender
The original file I purchased is an FBX file that has the animation but when I try to directly get it in RealityConverter it doesn't seem to play animations.
Topic:
Spatial Computing
SubTopic:
General
Tags:
Reality Converter
RealityKit
Reality Composer Pro
visionOS
For the M2 Apple Vision Pro, there's "a general guideline, we recommend no more than 500 thousand triangles for an immersive scene, with 250 thousand for applications in the shared space." --https://developer.apple.com/videos/play/wwdc2024/10186/?time=147
Is there a revised recommendation for the M5 Apple Vision Pro?