Hello,
Is there a way to always have the attachments of a RealityView always face the user?
For example, in a visionOS app, in an immersive space, we have an attachment. When the user either walks around the attachment, or rotates the parent entity, we would like the attachment to automatically rotate to face the user.
How do we do this?
I anticipated this to be a trivial feature to implement, since I thought I remembered seeing this feature as a built-in/opt-in option for attachments. But, I cannot find that feature.
All and any recommendations are appreciated, thanks.
General
RSS for tagDiscuss Spatial Computing on Apple Platforms.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
I'm starting my journey in developing an immersive app for VisionOS. I've been making steady progress, but I've encountered a specific challenge that I haven't been able to resolve.
I created two ModelEntity objects — a sphere and a cube — and added a DragGesture to the cube. When I drag the cube over the sphere, the two collide correctly, and the collision is logged in the console. So far, everything works as expected.
However, when I try to anchor the cube to my hand, the collision stops working. It's as if the cube loses its ability to detect collisions once it's anchored.
Any guidance or clarification on this behavior would be greatly appreciated.
// ImmersiveView.swift
// estudos_vision
//
// Created by Lailan Rogerio Rodrigues Matos on 15/05/25.
//
import SwiftUI
import RealityKit
import RealityKitContent
struct ImmersiveView: View {
@Environment(AppModel.self) var appModel
@State private var session: SpatialTrackingSession?
@State private var box = ModelEntity()
@State private var subs: [EventSubscription] = []
@State private var ballEntity: Entity?
var body: some View {
RealityView { content in
// Load initial content from the RealityKit scene.
if let immersiveContentEntity = try? await Entity(named: "Immersive", in: realityKitContentBundle) {
content.add(immersiveContentEntity)
}
// Create and run a spatial tracking session.
let session = SpatialTrackingSession()
let configuration = SpatialTrackingSession.Configuration(tracking: [.hand])
_ = await session.run(configuration)
self.session = session
// Create a red box.
let boxMesh = MeshResource.generateBox(size: 0.2)
let material = SimpleMaterial(color: .red, isMetallic: false)
box = ModelEntity(mesh: boxMesh, materials: [material])
box.position.y += 0.15 // Position the box slightly above the origin.
// Configure the box for user interaction and physics.
box.components.set(InputTargetComponent(allowedInputTypes: .indirect)) // Make it interactive.
box.generateCollisionShapes(recursive: false) // Generate collision shapes for physics.
box.components.set(PhysicsBodyComponent( // Add physics behavior.
massProperties: .default,
material: .default,
mode: .kinematic // Use kinematic mode so it can be moved by user interaction.
))
box.components.set(GroundingShadowComponent(castsShadow: true)) // Add a shadow.
//content.add(box) //commented out to add to hand anchor
// Create a left hand anchor and add the box as a child.
let handAnchor = AnchorEntity(.hand(.left, location: .palm), trackingMode: .continuous)
handAnchor.addChild(box)
content.add(handAnchor) // Add the hand anchor to the scene.
// Create a sphere.
let ball = ModelEntity(mesh: .generateSphere(radius: 0.15))
ball.position = [0.0, 1.5, -1.0] // Initial position of the ball.
ball.generateCollisionShapes(recursive: false) // Add collision.
ball.name = "Sphere"
content.add(ball)
ballEntity = ball
// Subscribe to collision events between the box and other entities.
let event = content.subscribe(to: CollisionEvents.Began.self, on: box) { ce in
print("Collision between \(ce.entityA.name) and \(ce.entityB.name) occurred")
//ce.entityA.removeFromParent() // removes the colliding object
//ce.entityB.removeFromParent()
}
Task {
subs.append(event)
}
}
// Add a drag gesture to the box, allowing the user to move it.
.gesture(
DragGesture()
.targetedToEntity(box) // Target the drag gesture to the box.
.onChanged({ value in
// Update the position of the box based on the drag gesture.
box.position = value.convert(value.location3D, from: .local, to: box.parent!)
})
)
}
}
#Preview(immersionStyle: .full) {
ImmersiveView()
.environment(AppModel())
}
Topic:
Spatial Computing
SubTopic:
General
Hi, I'm developing a virtual camera system using ReplayKit to capture scene video by directly accessing raw video buffers. The capture mechanism works flawlessly when repeatedly starting and stopping video capture within a continuous immersive environment. However, a critical issue arises when interrupting the immersive space:
Step 1: Enter immersive environment and start and stop capture videos(Multiple times with no issues)
Step 2: Press the crown button to exit the immersive environment
Step 3: Return to the immersive space subsequently
Step 4: Attempt to start the video capture
At this point, the startCapture method throws an unexpected error, disrupting the video capture workflow.
This is the Xcode error that I see " [ERROR] -[RPScreenRecorder startCaptureWithHandler:completionHandler:]_block_invoke_2:500 failed to start due to error: Error Domain=com.apple.ReplayKit.RPRecordingErrorDomain Code=-5803 "Recording failed to start" UserInfo={NSLocalizedDescription=Recording failed to start}"
I have tried all possible ways to stopCapture including OnDisappear and other methods and nothing seems to solve this.
I am developing with Apple Vision Pro to implement object tracking functionality, but each model needs to go into Create ML for training, and the training time is very long. Are there other ways to shorten training time while obtaining reference files in the same format?
Additionally, can the delay in object tracking be further optimized? Although the refresh rate has been optimized, there is still a noticeable delay.
Hi,
We are trying to port our Unity app from other XR devices to Vision Pro. Thus it's way easier for us to use the Metal rendering layer, fully immersive. And to stay true to the platform, we want to keep the gaze/pinch interaction system.
But we just noticed that, unlike Polyspatial XR apps, VisionOS XR in Metal does not provide gaze info unless the user is actively pinching... Which forbids any attempt to give visual feedback on what they are looking at (buttons, etc).
Is this planned in Apple's roadmap ?
Thanks
I am experimenting with RealityKit to set up a portal. Everything works, but I was wondering where the scene's origin is with respect to the front of the portal window?
From experiments, the origin's X and Y appear to be at the center of the portal window, while the origin's Z appearing to be about a meter behind the portal window.
Is this (at least roughly) correct? Is it documented anywhere?
PS. I began with the standard visionOS app and edited the Reality Composer Pro file to create the scene.
Hi !
I'm new on this forum, so if I need to update this post to have more info, or anything else, please let me know.
I'm using the Apple Vision Pro to develop some app (with unity). To demonstrate what the user see on the headset, I would like to mirror the view on a device (an iPad in this case). I managed to do this without any issue.
My problem is that, in the Vision Pro, I have an interface that the user can interact with. But I would like to be able to manage myself the interface on the iPad. What I mean is that the user can (or can't, doesn't matter) see the interface in the headset, and the interface is controlled by myself on the iPad.
Is there any way to do this ? Is this a question I should ask on unity's forum ? (I don't think so, because it should be related to the mirroring function non ?)
Topic:
Spatial Computing
SubTopic:
General
Hi,
I'm developing an app for the Apple Vision Pro.
Inside the app the user should be able to load objects from the web into the scene and then be able to move them around (dragging and rotating) via gestures.
My question:
I'm working with RealityKit and use RealityView..
I have no issues loading in one object and making it interactive by adding gestures to the entire RealityView via the .gestures() function.
Also I succeeded in loading multiple objects into the scene.
My problem is that I can't figure out how to add my gestures to multiple objects independently.
I can't use Reality composer since I'm loading the objects dynamically into the scene.
Using .gestures() doesn't work for multiple objects since every gesture needs to be targeted to a specific entity but I have multiple entities.
I also tried defining a GestureComponent and adding it to every newly loaded entity but that doesn't seem to work as nothing happens,
even though my gesture is targeted to every entity having my GestureComponent.
I only found solutions that are not usable on Visionos / RealityView like installGestures.
Also I tried following this guide: https://developer.apple.com/documentation/realitykit/transforming-realitykit-entities-with-gestures
But I feel like there are things missing and contradictory, like a extension of RealityView is mentioned but not shown and the guide states that we don't need to store the translation values for every entity and therefore creates a EntityGestureState.swift file to store these values but then it's not used and a GestureStateComponent is used instead that was never mentioned and contradicts what was just said about not storing values per entity because a new instance of it is created for every entity. Nevermind, following the guide didn't lead me to a working solution.
I implemented a ShaderGraphMaterial and tried to load it from my usda scene by ShaderGraphMaterial.init(name: in: bundle). I want to dynamically set TextureResource on that material, so I wanted to expose texture as Uniform Input of a ShaderGraphMaterial. But obviously RCP's Shader Graph doesn't support Texture input as parameter as the image shows:
And from the code level, ShaderGraphMaterial also didn't expose a way to set TexturesResources neither. Its parameterNames shows an empty array if I didn't set any custom input params. The texture I get is from my backend so it really cannot be saved into a file and load it again (that would be too weird).
Is there something I am missing?
As in the title: openImmersiveSpace works as expected. The ImmersiveSpace in the ID opens normally but the function never resolves a Result. Here is a workaround I used to make user it was on the UI thread and scenePhase was active:
`
@MainActor func openSpaceWithStateCheck() async {
if scenePhase == .active {
Task {
switch await openImmersiveSpace(id: "RoomCaptureInteraction") {
case .opened:
isCapturingImagery = true
break
case .error:
print("!! An error occurred when trying to open the immersive space captureRoomImagery")
case .userCancelled:
print("!! The user declined opening immersive space captureRoomImagery")
@unknown default:
print("!! unknown default result of opening space")
break
}
}
} else {
print("Scene not active, deferring immersive space opening")
}
}
I'm on visionOS 2.4 and SDK 2.2.
I have tried uninstalling the app and rebuilding. Tried simply opening an empty ImmersiveSpace.
The consistency of the ImmersiveSpace opening at least means I can work around it. Even dismissImmersiveSpace works normally and closes the immersive space. But a workaround seems hamfisted.
I am wondering, is it possible to somehow configure a 3D object to respond to the gaze of a person, like change colors of some parts of the 3D-Model where a person is looking, i.e. where a person's gaze lands on the surface of the 3D-model ?
For example, if there is a 3D model of a Cool Dragon 🐉 in the physical space of a person, when seen through with the mixed reality view, of a VisionPro. Now, it would be really cool to change only the color, or make some specific parts of the dragon skin shimmer, but only in the areas where a person is looking. Is this possible ? Is it do-able with eye-tracking of VisionPro ?
Any advice would be appreciated. 🤝 🤓 I am new to iOS and VisionOS development.
Topic:
Spatial Computing
SubTopic:
General
I have a grpc server running inside of a task. When the user takes the headset off, the grpc server will no longer work when they put the headset back on.
I would like to have this action detected so that I can cancel the task (which will effectively close the grpc server).
I am also using a visual indicator to let the user know if the server is running, but it will not accurately reflect the state of the server when removing and putting back on the headset.
Topic:
Spatial Computing
SubTopic:
General
We would like to create an Immersive video and store the video file locally in Vision Pro for viewing.
By Immersive video, I mean the video that is played at the end of the Vision Pro experience at the Apple Store (LeBron's dunk, Curry's 3-point shot, tightrope walk, etc.). It is unclear if a way is currently provided to view Immersive video locally.
I can find some information about Spatial video on the Dev site, but I can't find any information about Immersive video. My understanding is:
Spatial video:
A video window appears in space and plays video with depth. Up to 4K side-by-side video can be converted to MV-HEVC format using Xcode and played back in the Photos app.
Immersive video:
180VR video, but I’m not sure how it was created. Similar to Spatial video, I converted a side-by-side 180VR video to MV-HEVC format using Xcode, but it could not be played back in the Photos app as expected.
Vision Pro's Photos app features an Immersive button during video playback, but this appears to be for zooming in on Spatial video to the full field of view, which seems different from Immersive video.
The demo video provided by Apple is streamed from Apple TV, and there are no local files available.
We are currently considering creating an app that displays different videos to each eye, but we prefer not to go this route due to licensing and distribution issues.
Topic:
Spatial Computing
SubTopic:
General
Seeing this magical sand table, the unfolding and folding effects are similar to spreading out cards, which is very interesting. But I don't know how to achieve it. I want to see if there are any ways to achieve this effect and give some ideas. May I ask if this effect can be achieved under the existing API
Topic:
Spatial Computing
SubTopic:
General
Hi I am trying to implement something simple as people can share their Spatial Photos with others (just like this post). I encountered the same issue with him, but his answer doesn't help me out here.
Briefly speaking, I am using CGImgaeSoruce to extract paired leftImage and rightImage from one fetched spatial photo
let photos = PHAsset.fetchAssets(with: .image, options: nil)
// enumerating photos ....
if asset.mediaSubtypes.contains(PHAssetMediaSubtype.spatialMedia) {
spatialAsset = asset
}
// other code show below
I can fetch left and right images from native Spatial Photo (taken by Apple Vision Pro or iPhone 15+), but it didn't work on generated spatial photo (2D -> 3D feat in Photos).
// imageCount is 1 when it comes to generated spatial photo
let imageCount = CGImageSourceGetCount(source)
I searched over the net and someone says the generated version is having a depth image instead of left/right pair. But still I cannot extract any depth image from imageSource.
The full code below, the imagePair extraction will stop at "no groups found":
func extractPairedImage(phAsset: PHAsset, completion: @escaping (StereoImagePair?) -> Void) {
let options = PHImageRequestOptions()
options.isNetworkAccessAllowed = true
options.deliveryMode = .highQualityFormat
options.resizeMode = .none
options.version = .original
return PHImageManager.default().requestImageDataAndOrientation(for: phAsset, options: options) {
imageData, _, _, _ in
guard let imageData,
let imageSource = CGImageSourceCreateWithData(imageData as CFData, nil)
else {
completion(nil)
return
}
let stereoImagePair = stereoImagePair(from: imageSource)
completion(stereoImagePair)
}
}
}
func stereoImagePair(from source: CGImageSource) -> StereoImagePair? {
guard let properties = CGImageSourceCopyProperties(source, nil) as? [CFString: Any] else {
return nil
}
let imageCount = CGImageSourceGetCount(source)
print(String(format: "%d images found", imageCount))
guard let groups = properties[kCGImagePropertyGroups] as? [[CFString: Any]] else {
/// function returns here
print("no groups found")
return nil
}
guard
let stereoGroup = groups.first(where: {
let groupType = $0[kCGImagePropertyGroupType] as! CFString
return groupType == kCGImagePropertyGroupTypeStereoPair
})
else {
return nil
}
guard let leftIndex = stereoGroup[kCGImagePropertyGroupImageIndexLeft] as? Int,
let rightIndex = stereoGroup[kCGImagePropertyGroupImageIndexRight] as? Int,
let leftImage = CGImageSourceCreateImageAtIndex(source, leftIndex, nil),
let rightImage = CGImageSourceCreateImageAtIndex(source, rightIndex, nil),
let leftProperties = CGImageSourceCopyPropertiesAtIndex(source, leftIndex, nil),
let rightProperties = CGImageSourceCopyPropertiesAtIndex(source, rightIndex, nil)
else {
return nil
}
return (leftImage, rightImage, self.identifier)
}
Any suggestion? Thanks
visionOS 2.4
Hi,
I created an app using iOS Object Capture API which works only on Lidar enabled phones. It's a limitation of the Api provided by apple itself.
I Submitted an app for Review , but It is getting rejected (Twice) saying it doesnt work on non pro models. Even though I explained that capturing Needs Lidar and supported only in PRO models, It still gets rejected after testing in Non Pro models. is there a way out?
Hello everyone,
I've been trying for a few weeks now to convert a sequential series of meshes into a stop-motion animation in USDZ format.
In Unreal Engine, I’ve already figured out how to transform the sequential series of individual meshes into a smooth animation using the node system and arrays.
Unfortunately, the node system cannot be exported as a usdz animation logic in either Unreal or Blender.
Because of this, I have tried several other methods to incorporate the animation logic. Here’s what I’ve tried so far:
I attempted to create the animation in Blender with Render-/Viewports and mapping it to keyframes. However, in my experience, Viewports are not supported in the conversion.
I tried aligning the vertices of individual objects and merging the frames using the Shrinkwrap modifier in Blender, then setting up a morph animation with keyframes. However, because the individual meshes are too different, this results in artifacts, and manually editing each mesh is too difficult for me to handle.
I placed all individual meshes at the same position and animated them sequentially by scaling them from 0 to 100 in keyframes (Frame 1 is visible for 10 frames, then scales down at frame 11, while Frame 2 becomes visible at frame 11, and so on). I also adjusted the keyframes so that the scaling happens in a "constant" manner rather than the default Bezier or linear interpolation. I then converted this animation to .abc, and the result initially looked good. However, some information is lost when converting it with OpenUSD. The animation does not maintain its intended jump-like behavior in USDZ format, and instead, the scaling of individual files is visible in the animation.
I tried using a Blender add-on (StepMotion), which allows the animation to be exported as .abc, but it can only be read in Blender or Unreal. Even in the preview, the animation is not displayed correctly, so converting the animation logic does not work either.
Unfortunately, I have no alternative way to create the animation, as the individual frames have been provided to me as meshes. So far, I haven’t found a way to implement this successfully.
I would be very grateful for any tips or ideas, as I am running out of options on how to make this work.
Thanks in advance!
Topic:
Spatial Computing
SubTopic:
General
Tags:
Core Animation
Reality Converter
Visual Design
USDZ
Is there a suitable
UTType type to satisfy the need to pick up only SpatialVideo in UIDocumentPickerViewController?
I already know that PHPickerFilter in PHPickerViewController can do this, but not in UIDocumentPickerViewController.
Our app needs to adapt both of these ways to pick spatial videos
So is there anything that I can try in UIDocumentPickerViewController to fulfill such picker functionality?
Topic:
Spatial Computing
SubTopic:
General
Tags:
Files and Storage
Photos and Imaging
PhotoKit
visionOS
Is it possible to create a button in my app that will turn on the spatial personas for the user? Currently the only way I know of turning on spatial personas is by selecting the cube icon in the FaceTime window which is quite clunky for people unfamiliar with the Vision Pro's UI. Any help would be appreciated.