Can I get a point on a texture in RealityKit on visionOS?

Hi,

I am currently considering porting my AR game from SceneKit to RealityKit so it appears in 3D on visionOS, but one crucial question that determines whether it can even be ported is: I need a tap on a 3D model that is then translated into a tap on the texture of that model.

On visionOS this would be the person's gaze, so the question is whether I can get the point on the texture the user is looking at (optimally at all times, so I can have a hover effect, but if not, at least when they tap their fingers together).

If that is not possible, is it possible to touch a RealityKit object and get the location of that touch on the texture?

All the best Christoph

For privacy reasons there is no way to read the location of a user's gaze, but I still think you can accomplish your goals.

You can get the coordinates of a tap gesture (gaze and pinch) using SpatialTapGesture and apply a hover effect (fired on gaze on visionOS) to an Entity using HoverEffectComponent.

Here's a snippet that draws a sphere on a plane when you tap the plane and changes the plane's color (from gray) to blue when you hover on (gaze at) the plane.

RealityView { content in
    let plane = ModelEntity(mesh: .generatePlane(width: 0.5, height: 0.5),
                            materials: [UnlitMaterial(color: .gray)])

    // Build the collision shapes
    // Note: for more complex meshes you'll need something like:
    // https://developer.apple.com/documentation/realitykit/shaperesource/generateconvex(from:)-53jm9
    plane.generateCollisionShapes(recursive: true)
    guard let shapes = plane.collision?.shapes else { return }
    
    // Add these 2 components to the plane so it can receive taps
    plane.components.set(InputTargetComponent())
    plane.components.set(CollisionComponent(shapes: shapes))
    
    // Add a hover effect to the plane
    plane.components.set(HoverEffectComponent(
        .highlight(HoverEffectComponent.HighlightHoverEffectStyle(color: .blue, strength: 2.0))
    ))
       
    plane.position = [0, 1, -1]
    content.add(plane)
}
.gesture(SpatialTapGesture().targetedToAnyEntity()
    .onEnded { value in
        // Get the entity that was tapped
        let entity = value.entity
        
        // Get the tap position
        let position = value.convert(value.location3D, from: .local, to: .scene)
        
        // Convert the position to the entity's coordinate space
        let positionRelativeToEntity = entity.convert(position: position, from: nil)
        
        // Create a sphere
        let sphere = ModelEntity(mesh: .generateSphere(radius: 0.01), materials: [UnlitMaterial(color: .red)])
        
        // Position the sphere at the tap location
        sphere.position = positionRelativeToEntity
        
        // Add the sphere as a child of the tapped entity
        entity.addChild(sphere)
    })

This doesn't get you the point on the texture, but it may be enough to meet your needs. If it's not, please provide more details about your end goal.

How precise does the interaction need to be? Have you considered creating separate meshes for each part of the model you want to colorize? For example, imagine a model of a person wearing a shirt and pants. When the user pinches while gazing at the shirt, you can apply a new material to the mesh associated with the shirt, changing the color of the shirt only (not the pants). This works because a model can have multiple meshes, and each mesh can have its own InputTargetComponent and CollisionComponent, so you can associate a gesture with the mesh it originated from, and materials are applied at the mesh level.

Consider the following snippet, which simulates your use case. The bottle has two meshes, one for the bottle and another for the bottle top and label. Pinch while gazing at a portion of the bottle to change its color.

RealityView { content in
    // The snippet below uses the PillBottle model from Reality Composer Pro's content library. Add the pill bottle to your project to follow along.
    if let entity = try? await Entity(named: "PillBottle", in: realityKitContentBundle) {
        
        // not related, but position the bottle so it's in view
        entity.position = [0.0, 1.5, -0.5]
        
        // Add an InputTargetComponent and a CollisionComponent to the individual meshes that compose the model
        if let wrapperAndTop = entity.findEntity(named: "pill_bottle_base_realistic_lod0"),
           let mesh = wrapperAndTop.components[ModelComponent.self]?.mesh,
           let shape = try? await ShapeResource.generateStaticMesh(from: mesh) {
           
            wrapperAndTop.components.set(InputTargetComponent())
            wrapperAndTop.components.set(CollisionComponent(shapes: [shape]))
        }
        
        if let bottle = entity.findEntity(named: "pill_bottle_base_realistic_bottle_lod0"),
           let mesh = bottle.components[ModelComponent.self]?.mesh,
           let shape = try? await ShapeResource.generateStaticMesh(from: mesh) {
           
            bottle.components.set(InputTargetComponent())
            bottle.components.set(CollisionComponent(shapes: [shape]))
        }
        
        content.add(entity)
    }
}
.gesture(SpatialTapGesture().targetedToAnyEntity()
    .onEnded { value in
        // Update the material associated with the mesh
        let entity = value.entity
        let material = UnlitMaterial(color: .red)

        entity.components[ModelComponent.self]?.materials = [material]
    })
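
Note that this replaces the mesh's entire materials array. If a part uses several materials and you want to swap just one, you can mutate a copy of the ModelComponent instead (a sketch; the index here is illustrative):

// Replace only the material at a given index, keeping the others
if var model = entity.components[ModelComponent.self] {
    model.materials[0] = UnlitMaterial(color: .red)
    entity.components.set(model)
}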

Clearly this won't work if you want to apply color at a more granular level, though, and I realize I still haven't answered the specific question (gesture location on mesh). I'll keep digging to see what I find.

Thanks for your patience. Here's the answer to your specific question.

You can get the texture coordinates using CollisionComponent and a raycast. The CollisionCastHit result has a TriangleHit structure with primitive indices and barycentric weights (named uv in the documentation) that can be used to compute the texture coordinates at the intersection point. Here's a snippet that demonstrates this. Run the code with a multi-colored texture: the app renders a large sphere, and tapping the sphere creates a small sphere whose color matches the texture's color at the tap location.
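
If you don't have a suitable image on hand, here's a minimal sketch of a helper that generates a color-grid texture in code. The function name and the 2x2 layout are my own, and it assumes TextureResource.generate(from:options:) is available for your deployment target:

import CoreGraphics
import RealityKit
import UIKit

// Hypothetical helper: draws a 2x2 color grid into a CGImage and wraps it
// in a TextureResource, so the sample can run without a bundled asset
func makeColorGridTexture() throws -> TextureResource {
    let size = 256
    let half = size / 2
    let colors: [UIColor] = [.red, .green, .blue, .yellow]
    
    let context = CGContext(data: nil,
                            width: size, height: size,
                            bitsPerComponent: 8, bytesPerRow: 0,
                            space: CGColorSpaceCreateDeviceRGB(),
                            bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)!
    
    // Fill each quadrant with a different color
    for (i, color) in colors.enumerated() {
        context.setFillColor(color.cgColor)
        context.fill(CGRect(x: (i % 2) * half, y: (i / 2) * half,
                            width: half, height: half))
    }
    
    return try TextureResource.generate(from: context.makeImage()!,
                                        options: .init(semantic: .color))
}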

The RealityView

struct ReadTextureColorView: View {
    @State var texture: TextureResource?
    @State var textureCoordinatesReader: TextureCoordinatesReader?
    private let arkitSession = ARKitSession()
    private let worldTrackingProvider = WorldTrackingProvider()
    
    var body: some View {
        
        RealityView { content in
            guard let texture = try? await TextureResource(named: "colorGridTexture") else { return }
            self.texture = texture
            
            let mesh = MeshResource.generateSphere(radius: 0.2)
            
            if let shape = try? await ShapeResource.generateStaticMesh(from: mesh) {
                self.textureCoordinatesReader = TextureCoordinatesReader(meshResource: mesh)
                
                var material = SimpleMaterial()
                material.color = .init(texture: .init(texture))
                
                let entity = ModelEntity(mesh: mesh, materials: [material])
                
                entity.position = [0, 1.5, -1]
                entity.components.set(InputTargetComponent())
                entity.components.set(CollisionComponent(shapes: [shape]))
                
                content.add(entity)
            }
        }
        .gesture(SpatialTapGesture().targetedToAnyEntity()
            .onEnded { value in
                // Raycast from the camera to the gesture location
                if let textureCoordinatesReader,
                   let texture,
                   let deviceAnchor = worldTrackingProvider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) {
                    
                    let cameraPosition = simd_make_float3(deviceAnchor.originFromAnchorTransform.columns.3)
                    let entity = value.entity
                    let position = value.convert(value.location3D, from: .local, to: .scene)
                    
                    if let results = entity.scene?.raycast(origin: cameraPosition,
                                                           direction: normalize(position - cameraPosition),
                                                           length: 100,
                                                           query: .nearest,
                                                           mask: .all,
                                                           relativeTo: nil),
                       results.isEmpty == false,
                       let triangleHit = results[0].triangleHit {
                        // Get the texture coordinates using the faceIndex and UV coordinates of the result's triangle hit
                        let coordinates = textureCoordinatesReader.coordinates(for: triangleHit.faceIndex, at: triangleHit.uv)
                        
                        // Read the color of the texture using the coordinates
                        if let color = try? texture.readColor(at: coordinates) {
                            let sphere = ModelEntity(mesh: .generateSphere(radius: 0.01), materials: [UnlitMaterial(color: UIColor(color))])
                            sphere.position = results[0].position
                            entity.parent?.addChild(sphere)
                        }
                    }
                }
                
            })
        .task {
            do {
                try await arkitSession.run([worldTrackingProvider])
            } catch {
                // handle me
                print("Error: \(error)")
            }
        }
    }
}
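
One caveat: on visionOS, WorldTrackingProvider only delivers device anchors while an immersive space is open, so queryDeviceAnchor returns nil in a regular window. A minimal scaffold to host the view (the app and space names here are placeholders) could look like this:

import SwiftUI

@main
struct TextureTapApp: App {
    var body: some Scene {
        // World tracking (and therefore the raycast origin above) requires
        // an immersive space rather than a regular window
        ImmersiveSpace(id: "ImmersiveSpace") {
            ReadTextureColorView()
        }
    }
}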

Texture coordinates reader

struct TextureCoordinatesReader {
    private let uvPositions: [SIMD2<Float>]
    // UInt32 indices so meshes with more than 65,535 vertices don't overflow
    private let faceIndices: [UInt32]
    
    init(meshResource: MeshResource) {
        var uvPositions: [SIMD2<Float>] = []
        var faceIndices: [UInt32] = []
        var indexOffset: Int = 0
        
        for instance in meshResource.contents.instances {
            guard let model = meshResource.contents.models[instance.model] else {
                fatalError("Failed to get mesh model when generating static mesh data")
            }
            
            for part in model.parts {
                // Texture coordinates live in UV space, so unlike vertex
                // positions they are not transformed by the instance transform
                part.textureCoordinates?.forEach { coord in
                    uvPositions.append(SIMD2<Float>(coord.x, coord.y))
                }
                
                guard let triangleIndices = part.triangleIndices else {
                    fatalError("Failed to get triangle indices off of mesh model part when generating static mesh data")
                }
                
                // Offset each part's indices into the combined UV array
                faceIndices.append(contentsOf: triangleIndices.elements.map { index in
                    UInt32(Int(index) + indexOffset)
                })
                indexOffset += part.positions.count
            }
        }
        
        self.uvPositions = uvPositions
        self.faceIndices = faceIndices
    }
    
    func coordinates(for faceIndex: Int, at uv: SIMD2<Float>) -> SIMD2<Float> {
        
        // Look up the texture coordinates of the face's three vertices
        let vertexTexCoords: [SIMD2<Float>] = [
            Int(faceIndices[faceIndex * 3 + 0]),
            Int(faceIndices[faceIndex * 3 + 1]),
            Int(faceIndices[faceIndex * 3 + 2])
        ].map { i in
            uvPositions[i]
        }
        
        let u = uv.x
        let v = uv.y
        let w = 1 - u - v
        
        let textureCoordinates = w * vertexTexCoords[0] + u * vertexTexCoords[1] + v * vertexTexCoords[2]
        
        return textureCoordinates
    }
}
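
As a quick sanity check of the barycentric math (the values here are made up for illustration): for a triangle whose vertex UVs are (0,0), (1,0), and (0,1), weights u = 0.25 and v = 0.25 give w = 0.5, so the interpolated coordinate is 0.5*(0,0) + 0.25*(1,0) + 0.25*(0,1) = (0.25, 0.25). In code:

let reader = TextureCoordinatesReader(meshResource: .generateSphere(radius: 0.2))

// With u = v = 0.25 (so w = 0.5), the result lies halfway between the face's
// first vertex and the midpoint of the opposite edge in UV space
let coords = reader.coordinates(for: 0, at: SIMD2<Float>(0.25, 0.25))
print("texture coordinates:", coords)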

Extension to read the texture color given a texture coordinate

extension TextureResource {
    func readColor(at coordinates: SIMD2<Float>) throws -> Color? {
        // Note: this assumes the texture's contents can be copied into an
        // rgba8Unorm destination; adjust the pixel format for other sources
        guard let mtlTexture = MTLCreateSystemDefaultDevice()?.makeTexture(descriptor: {
            let descriptor = MTLTextureDescriptor()
            descriptor.width = width
            descriptor.height = height
            descriptor.pixelFormat = .rgba8Unorm
            descriptor.usage = [.shaderRead, .shaderWrite]
            return descriptor
        }()) else { return nil }
        try copy(to: mtlTexture)
        
        // Clamp to the last texel so a coordinate of exactly 1.0 stays in
        // bounds, and flip V because UV space has its origin at the bottom
        // left while Metal's pixel origin is the top left
        let x = min(max(Int(coordinates.x * Float(width)), 0), width - 1)
        let y = min(max(Int((1 - coordinates.y) * Float(height)), 0), height - 1)
        
        var bytes: [UInt8] = .init(repeating: 0, count: 4)
        bytes.withUnsafeMutableBytes { ptr in
            mtlTexture.getBytes(
                ptr.baseAddress!,
                bytesPerRow: 4, // one rgba8 texel per row for this 1x1 region
                from: .init(origin: MTLOrigin(x: x, y: y, z: 0),
                            size: MTLSize(width: 1, height: 1, depth: 1)),
                mipmapLevel: 0
            )
        }
        
        let r = Double(bytes[0]) / 255.0
        let g = Double(bytes[1]) / 255.0
        let b = Double(bytes[2]) / 255.0
        let a = Double(bytes[3]) / 255.0
        
        return Color(red: r, green: g, blue: b, opacity: a)
    }
}