How to move a camera in immersive space and render its output on a 2D window using RealityKit

I'm trying to develop an immersive visionOS app in which you can move an Entity that has a PerspectiveCamera as its child through immersive space, and render the camera's view on a 2D window.

According to this thread, this seems achievable using RealityRenderer. But when I added the scene entity loaded from realityKitContentBundle to realityRenderer.entities, I had to clone all of the scene's entities; otherwise, all entities in the immersive space disappeared.

@Observable
@MainActor
final class OffscreenRenderModel {
    private let renderer: RealityRenderer
    private let colorTexture: MTLTexture

    init(scene: Entity) throws {
        renderer = try RealityRenderer()
        
        // If the scene's entities aren't cloned here, all entities in the immersive space will disappear
        renderer.entities.append(scene.clone(recursive: true))
        
        let camera = PerspectiveCamera()
        renderer.activeCamera = camera
        renderer.entities.append(camera)
        ...
    }
}
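For context, here is a hedged sketch of the elided render step. The RealityRenderer.CameraOutput usage follows the pattern from the forum thread linked later in this post; the texture size, pixel format, and frame timing are assumptions to adjust for your window.

```swift
import Metal
import RealityKit

// Sketch (assumptions: 1280x720 texture, 60 fps delta time).
func makeColorTexture(device: MTLDevice) -> MTLTexture? {
    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .bgra8Unorm_srgb,
        width: 1280,
        height: 720,
        mipmapped: false)
    // The renderer draws into the texture; a shader (or SwiftUI) reads it back.
    descriptor.usage = [.renderTarget, .shaderRead]
    return device.makeTexture(descriptor: descriptor)
}

// Renders one frame from the renderer's activeCamera into colorTexture.
func renderFrame(renderer: RealityRenderer, colorTexture: MTLTexture) throws {
    let output = try RealityRenderer.CameraOutput(
        .singleProjection(colorTexture: colorTexture))
    try renderer.updateAndRender(deltaTime: 1.0 / 60.0,
                                 cameraOutput: output,
                                 onComplete: { _ in
        // The frame is now in colorTexture; signal the 2D window to redraw here.
    })
}
```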

Is this the expected behavior? Or is there another way to do this (move a camera in immersive space and render its output on a 2D window)?

Here is my sample code: https://github.com/TAATHub/RealityKitPerspectiveCamera

Answered by Vision Pro Engineer in 824733022

Hi @TAAT

I really like your use case! Super creative.

The behavior you're observing is expected. An entity can only have a single parent; any time you add an entity to another entity, it is removed from its previous parent. In this case, when you add scene to renderer, it removes scene from the immersive space's content. Here's a focused code snippet to demonstrate this: tapping the sphere adds it to the RealityRenderer, and a task adds it back to the RealityView's content a few seconds later.

import SwiftUI
import RealityKit

struct ImmersiveView: View {
    @State var sphere = ModelEntity()
    var body: some View {
        RealityView { content in
            sphere.components.set(
                ModelComponent(mesh: .generateSphere(radius: 0.2), materials: [SimpleMaterial(color: .red, isMetallic: false)])
            )
            
            sphere.generateCollisionShapes(recursive: false)
            sphere.components.set(InputTargetComponent())
            sphere.position = [0, 1.4, -1]
            content.add(sphere)
        }
        .gesture(TapGesture().targetedToAnyEntity().onEnded { _ in
            // Keep a reference to the reality view's content.
            let oldParent = sphere.parent

            guard let renderer = try? RealityRenderer() else { return }
            
            // Appending sphere to renderer.entities removes it from
            // the reality view's content.
            renderer.entities.append(sphere)

            Task { // Wait a few seconds then add the sphere back to the reality view's content.
                try? await Task.sleep(nanoseconds: 3_000_000_000)
                
                oldParent?.addChild(sphere)
            }
        })
    }
}

Cloning the entity is a reasonable solution. Alternatively you can load 2 copies of your RealityKitContent; one for the immersive view and another for the reality renderer.
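That alternative might look like the fragment below, intended for a RealityView's async make closure (the asset name "Scene" is an assumption standing in for whatever the question's bundle actually contains):

```swift
// Load two independent copies of the same RealityKitContent scene:
// one for the immersive space, one for the offscreen renderer.
let immersiveScene = try await Entity(named: "Scene", in: realityKitContentBundle)
let offscreenScene = try await Entity(named: "Scene", in: realityKitContentBundle)

content.add(immersiveScene)              // visible in the immersive space
renderer.entities.append(offscreenScene) // rendered to the offscreen texture
```

Because the two trees are independent, neither add/append steals the other's entities from its parent.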

My understanding is that camera entities like PerspectiveCamera are not relevant for visionOS development. They are used on iOS and other platforms.

When I tried to do this last winter, it seemed the only answer was to move the world around the user instead of moving the user around the world.

Here is an excerpt from the code I came up with. It allows a user to tap on an entity and move to a new position, sort of like the "waypoint" teleportation that was common in VR games circa 2016-2017. This could be improved in many ways; for example, by using SpatialEventGesture or SpatialTapGesture to get a more precise location.

struct Lab5017: View {
    @State var selected: String = "Tap Something"
    @State var sceneContent: Entity?
    @State var sceneContentPosition: SIMD3<Float> = [0,0,0]

    var tap: some Gesture {
        SpatialTapGesture()
            .targetedToAnyEntity()
            .onEnded { value in
                selected = value.entity.name

                // Calculate the vector from the origin to the tapped position
                let vectorToTap = value.entity.position

                // Normalize the vector to get a direction from the origin to the tapped position
                let direction = normalize(vectorToTap)

                // Calculate the distance (or magnitude) between the origin and the tapped position
                let distance = length(vectorToTap)

                // Calculate the new position by inverting the direction multiplied by the distance
                let newPosition = -direction * distance

                // Update sceneContentPosition's X and Z components, leaving Y as it is
                sceneContentPosition.x = newPosition.x
                sceneContentPosition.z = newPosition.z

            }
    }

    var body: some View {
        RealityView { content, attachments in

            if let model = try? await Entity(named: "5017Move", in: realityKitContentBundle) {
                content.add(model)

                // Get the scene content and stash it in state
                if let floorParent = model.findEntity(named: "SceneContent") {
                    sceneContent = floorParent
                    sceneContentPosition = floorParent.position
                }

            }
            //Position the attachment somewhere we can see it
            if let attachmentEntity = attachments.entity(for: "SelectedLabel") {
                attachmentEntity.position = [0.8, 1.5, -2]
                attachmentEntity.scale = [5,5,5]
                content.add(attachmentEntity)
            }

        } update: { content, attachments in

            // Update the position of scene content anytime we get a new position
            sceneContent?.position = sceneContentPosition


        } attachments: {
            Attachment(id: "SelectedLabel") {
                Text(selected)
                    .font(.largeTitle)
                    .padding(18)
                    .background(.black)
                    .cornerRadius(12)
            }
        }
        .gesture(tap) // The floor child entities can receive input, so this gesture will fire when we tap them
    }
}
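The "more precise location" improvement mentioned above could be sketched like this, using SpatialTapGesture's location3D instead of the tapped entity's origin (the sign-flip mirrors the waypoint logic in the snippet; converting via EntityTargetValue.convert is the documented way to get scene-space coordinates):

```swift
// Sketch: teleport toward the exact tap point rather than the entity's origin.
var preciseTap: some Gesture {
    SpatialTapGesture()
        .targetedToAnyEntity()
        .onEnded { value in
            // Convert the tap location from the gesture's local space
            // into RealityKit scene space.
            let worldPosition = value.convert(value.location3D,
                                              from: .local,
                                              to: .scene)
            // Move the world opposite the tap, keeping Y unchanged.
            sceneContentPosition.x = -worldPosition.x
            sceneContentPosition.z = -worldPosition.z
        }
}
```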

Hello @radicalappdev, thanks for your reply.

it seemed that the only answer was to move the world around the user, instead of moving the user around the world.

I understand that this is one way to move through space on visionOS, but what I'm trying to do is not move the main camera. I want to render a texture on a 2D window from a camera that is moving or fixed in space, while the main camera stays in its position.

e.g. fly a drone in space and render its camera view (see the video in my sample code).

In the following thread, the answer says that RealityRenderer can render its scene to MTLTexture, and finally can be displayed on 2D window.

https://developer.apple.com/forums/thread/762238?answerId=801164022#801164022

So I think this approach can help me achieve my goal, and I was able to do it to some extent, except that I need to clone entities for RealityRenderer.


Thanks for your reply and code snippet!

An entity can only have a single parent. Anytime you add an entity to another entity, it is removed from its parent.

Oh, I see. That's why the entity disappeared from the immersive space when I added it to the RealityRenderer. (I also found this doc, which indicates the same.)

Cloning the entity is a reasonable solution

Let me ask one more question. Although cloning the entity achieves my use case, it means I have to manage and update the state of two sets of entities. Are there plans to make this use case easier, for example by letting RealityRenderer simply refer to the given entity?
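In the meantime, one hedged way to manage the two sets is to mirror transforms from the live scene into the clone each frame (for example, from a RealityView update closure). This sketch assumes entity names are unique within each level of the hierarchy so findEntity(named:) can pair them up:

```swift
import RealityKit

// Recursively copy transforms from the live entity tree (shown in the
// immersive space) onto the cloned tree (owned by the RealityRenderer).
func syncClonedScene(from source: Entity, to clone: Entity) {
    for child in source.children {
        // Pair entities by name; assumes names are unique per level.
        if let match = clone.findEntity(named: child.name) {
            match.transform = child.transform
            syncClonedScene(from: child, to: match)
        }
    }
}
```

Component state (materials, animations, etc.) would still diverge; this only keeps poses in sync.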
