How to get the floor plane with Spatial Tracking Session and Anchor Entity

In the WWDC session titled "Deep dive into volumes and immersive spaces", the presenters discussed adding a Spatial Tracking Session and an Anchor Entity to detect the floor, but then glossed over some important details. They added a spatial tap gesture to let the user place content relative to the floor anchor, but they left out a lot of information.

.gesture(
    SpatialTapGesture(
        coordinateSpace: .immersiveSpace
    )
    .targetedToAnyEntity()
    .onEnded { value in
        handleTapOnFloor(value: value)
    }
)

My understanding is that an entity has to have input and collision components for gestures like this to work. How can we add a collision to an AnchorEntity when we don't know its size or shape?
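
For an ordinary entity whose size I know, the pattern is simple enough. Here's a minimal sketch (the 1 m box is just a placeholder size, not anything coming from the tracking session):

let box = ModelEntity(mesh: .generateBox(size: 1.0))
// Both components are needed for the entity to receive gestures.
box.components.set(InputTargetComponent())
box.components.set(CollisionComponent(shapes: [ShapeResource.generateBox(size: [1, 1, 1])]))

But with the floor anchor I have no dimensions to give the shape resource.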

I've been trying for days to understand what is happening here and I just don't get it. It is even more frustrating that the example project that Apple released does not contain any of these features.

I would like to be able to:

  • Detect the floor plane
  • Get the position/transform of the floor plane
  • Add a collider to the floor plane
  • Enable collisions and physics on the floor plane
  • Enable gestures on the floor plane

It seems to me that the Anchor Entity is placed at an entirely arbitrary position. It has absolutely no relationship to the rectangle with the floor label that I can see in the Xcode visualization. It is just a point, not a plane or rect that I can use.

I've tried manually calculating the collision shape after the anchor is detected, but nothing that I have tried works. I can't tap on the floor with gestures. I can't drop entities onto the floor. I can't seem to do ANYTHING at all with this floor anchor other than place an entity at a totally arbitrary location somewhere on the floor.

Is there any way at all, with Spatial Tracking Session and Anchor Entity, to get the actual plane that was detected?

struct FloorExample: View {

    @State var trackingSession: SpatialTrackingSession = SpatialTrackingSession()
    @State var subject: Entity?
    @State var floor: AnchorEntity?

    var body: some View {
        RealityView { content, attachments in

            let session = SpatialTrackingSession()
            let configuration = SpatialTrackingSession.Configuration(tracking: [.plane])
            _ = await session.run(configuration)
            self.trackingSession = session

            let floorAnchor = AnchorEntity(.plane(.horizontal, classification: .floor, minimumBounds: SIMD2(x: 0.1, y: 0.1)))
            floorAnchor.anchoring.physicsSimulation = .none
            floorAnchor.name = "FloorAnchorEntity"
            floorAnchor.components.set(InputTargetComponent())
            floorAnchor.components.set(CollisionComponent(shapes: .init()))
            content.add(floorAnchor)
            self.floor = floorAnchor

            // This is just here to let me see where visionOS decided to "place" the floor anchor.
            let floorPlaced = ModelEntity(
                mesh: .generateSphere(radius: 0.1),
                materials: [SimpleMaterial(color: .black, isMetallic: false)])
            floorAnchor.addChild(floorPlaced)

            if let scene = try? await Entity(named: "AnchorLabsFloor", in: realityKitContentBundle) {
                content.add(scene)

                if let subject = scene.findEntity(named: "StepSphereRed") {
                    self.subject = subject
                }

                // I can see when the anchor is added
                _ = content.subscribe(to: SceneEvents.AnchoredStateChanged.self)  { event in
                    event.anchor.generateCollisionShapes(recursive: true) //  this doesn't seem to work
                    print("**anchor changed** \(event)")
                    print("**anchor** \(event.anchor)")
                }

                // place the reset button near the user
                if let panel = attachments.entity(for: "Panel") {
                    panel.position = [0, 1, -0.5]
                    content.add(panel)
                }
            }

        } update: { content, attachments in

        } attachments: {
            Attachment(id: "Panel", {
                Button(action: {
                    print("**button pressed**")
                    if let subject = self.subject {
                        subject.position = [-0.5, 1.5, -1.5]
                        // Remove the physics body and assign a new one - hack to remove momentum
                        if let physics = subject.components[PhysicsBodyComponent.self] {
                            subject.components.remove(PhysicsBodyComponent.self)
                            subject.components.set(physics)
                        }
                    }
                }, label: {
                    Text("Reset Sphere")
                })
            })
        }
    }
}

Hi @radicalappdev,

A couple of notes on your code.

  • it's usually better to put the code that configures the SpatialTrackingSession into a class instead of an @State property on your view
  • anchors are not a great choice to attach collision shapes to
  • this CollisionComponent(shapes: .init()) is creating a collision component with an empty array of shapes

The third point is the main reason you don't see events on your floor.

If you change that line to something like:

floorAnchor.components.set(CollisionComponent(shapes: [ShapeResource.generateBox(width: 30, height: 0.01, depth: 30)]))

You should get better results.
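
For context, here's roughly how the anchor setup could look with a real shape, plus a static physics body so other entities can come to rest on the floor (the 30 m extent is just a placeholder, and the PhysicsBodyComponent is my assumption about what you'll want for dropping entities):

let floorAnchor = AnchorEntity(.plane(.horizontal, classification: .floor, minimumBounds: SIMD2(x: 0.1, y: 0.1)))
floorAnchor.components.set(InputTargetComponent())

// A large, thin box standing in for the floor surface.
let floorShape = ShapeResource.generateBox(width: 30, height: 0.01, depth: 30)
floorAnchor.components.set(CollisionComponent(shapes: [floorShape]))

// A static physics body so dynamic entities can collide with and rest on the floor.
// (Mass is ignored for static bodies.)
floorAnchor.components.set(PhysicsBodyComponent(shapes: [floorShape], mass: 1, mode: .static))

content.add(floorAnchor)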

Please let me know how it goes.

I reworked it to move the session code out of the view and into the AppModel (from the template), used the box for the shape resource, and got taps via a tap gesture like this:

        .gesture(TapGesture()
            .targetedToAnyEntity()
            .onEnded({ _ in
                print("tapped")
            }))
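
The session setup now lives in something roughly like this (sketching it from memory; AppModel is the class from the visionOS app template, and startPlaneTracking is just the name I gave the method):

import RealityKit
import Observation

@MainActor
@Observable
class AppModel {
    let trackingSession = SpatialTrackingSession()

    func startPlaneTracking() async {
        let configuration = SpatialTrackingSession.Configuration(tracking: [.plane])
        // run(_:) returns any tracking capabilities that were not authorized; ignored here.
        _ = await trackingSession.run(configuration)
    }
}

I call startPlaneTracking() from the RealityView make closure before adding the floor anchor.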

@Vision Pro Engineer Hi, thanks for the response.

I have a few questions and responses.

it's usually better to put the code that configures the SpatialTrackingSession into a class instead of an @State property on your view

Sure, I would do that in most apps. This is just an example where I was trying to keep everything in one file. Do you have any details on why it is better to place the SpatialTrackingSession in an observable class instead of state on a view? Several of the WWDC sessions and examples store the session in the view, and I was using them as a starting point.

SpatialTapGesture: Just to clarify, that was just an example from "Deep dive into volumes and immersive spaces" (WWDC 2024). They showed using the tap gesture on an anchor but didn't show how the collision was created. I wasn't using a gesture in my scene at all; I was just using it as an example of something that obviously needed a collision shape, but the session obscured the details.

this CollisionComponent(shapes: .init()) is creating a collision component with an empty array

Yes, I was trying to create an empty collision component, then populate it later with a call to generateCollisionShapes:

event.anchor.generateCollisionShapes(recursive: true) 

If I understand correctly, this doesn't work because the AnchorEntity is a point on a plane, not the plane itself. Is that correct?

Your hard-coded ShapeResource is interesting, but it doesn't help me create a collision shape that matches the physical floor in my office. It results in an arbitrary shape, positioned by a system I can't predict, creating a floor that may or may not cover the floor in the real room.

Is it possible to use an AnchorEntity (with SpatialTrackingSession) to get the plane/bounds/rect of the floor that visionOS detected? So far my guess is no. It seems like AnchorEntity is actually an arbitrary point/transform on that detected plane.

A real object surface has shape, size, and 6DoF (position and rotation). Once we developers can get that information, we can do a lot in AR/VR/MR/XR, spatial computing, robotics, digital twins, reverse engineering, CAD/CAM, etc.

What visionOS overlooks is that the position of a real object should be defined inside the object. Instead, visionOS places object (mesh, plane, etc.) anchors somewhere relative to the app origin.

The common algorithms for determining the 6DoF of an object plane are the Hough transform (1962) and RANSAC (Random Sample Consensus, 1981). They use the plane equation "ax + by + cz + d = 0 with ||(a, b, c)|| = 1", where (a, b, c) is the normal of the plane and d is the distance of the plane from the origin. One of their serious weaknesses is that the parameter "d" by itself says practically nothing about where the object plane actually sits. An object plane should instead be defined by its normal and an interior point on the plane; the best interior point is the centroid of the measurement points. The size (i.e., the boundary) of the object plane can be determined from the measurement points. The Hough transform and RANSAC algorithms cannot deliver this.
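
To make the point-normal description concrete, here is a small sketch (the points and the normal are made up for illustration; in practice the normal would come from a least-squares or PCA fit):

import simd

// Measurement points (made up for illustration).
let points: [SIMD3<Float>] = [
    [0.0, 0.0, 0.0], [1.0, 0.0, 0.1], [0.0, 1.0, -0.1], [1.0, 1.0, 0.0]
]

// Interior point: the centroid of the measurement points.
let centroid = points.reduce(SIMD3<Float>.zero, +) / Float(points.count)

// Assume a unit normal estimated elsewhere (e.g. by a plane fit).
let normal = simd_normalize(SIMD3<Float>(0, 0, 1))

// The general-form offset follows from the interior point: d = -(n · p0).
let d = -simd_dot(normal, centroid)

// Signed distance from any point q to the plane: n · q + d.
func signedDistance(_ q: SIMD3<Float>) -> Float {
    simd_dot(normal, q) + d
}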

There must be a new solution algorithm. January 14, 2025.

Processing 3D point clouds is not a simple task; it is complicated and not yet fully understood.

Recognizing and accurately measuring shape, size, and 6DoF in real time is the "Holy Grail" of 3D data processing.
