I am working on an app where we place large entities quite far away from the user. When we try to recognise a tap gesture on them, the gesture isn't picked up for part of the model.
It seems that the larger a model is and the further away it is placed, the more offset the collision shape becomes: it responds to taps in a region that shrinks towards the bottom right. The size of the collision shape itself appears correct when viewed with the collision shape debug visualisation. I've been able to replicate this behaviour in the simulator and on a physical device.
It's hard to explain in words; there's a video in the README for the repo.
I've been able to replicate the issue in a simple sample app. I'm not sure whether I'm using the API incorrectly or whether tap gestures are expected to be a bit off when entities are placed a large distance from the user. Appreciate any help, thanks.
struct ImmersiveView: View {
@State private var tapCount = 0
var body: some View {
RealityView { content in
let sphere = ModelEntity(mesh: .generateSphere(radius: 50), materials: [UnlitMaterial(color: .red)])
sphere.setPosition([500, 0, 0], relativeTo: nil)
CollisionComponent(shapes: [.generateBox(width: 250, height: 250, depth: 250)]),
.onEnded { value in
tapCount += 1
I would think it is common practice, when adding a new entity to a RealityView scene, for it to appear in front of the user, who then places it in the scene. Imagine a puzzle piece appearing in front of you that you drag to your puzzle board.
If you move around your puzzle board, you'd expect the new piece to appear in front of you wherever you are.
That seems applicable to a lot of applications.
I can add a new entity using the head anchor, but as we all know, that transform is the identity, so reparenting the entity to something else (e.g. the puzzle board) won't work.
I've been trying to use world positioning and querying the pose, which helps, but I'm stumped as to how to get the new entity to appear in front of me no matter which way I turn.
Looking for suggestions and guidance on this.
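For what it's worth, the placement maths itself can be sketched independently of how the pose is obtained. This is a minimal sketch, assuming a world-space head position and forward direction are already available from whatever pose query is in use; `positionInFront`, `normalized`, and the `keepLevel` option are hypothetical names for illustration, not RealityKit API.

```swift
// Sketch only: pure vector maths using the standard library's SIMD types.
// `headPosition` and `headForward` are assumed to be in world space.

/// Returns the unit-length version of `v`, or nil for a zero-length vector.
func normalized(_ v: SIMD3<Float>) -> SIMD3<Float>? {
    let length = (v * v).sum().squareRoot()
    return length > 0 ? v / length : nil
}

/// Returns a point `distance` metres along the viewing direction.
/// With `keepLevel`, the direction is flattened onto the horizontal plane
/// so the point stays at head height even when looking up or down.
func positionInFront(headPosition: SIMD3<Float>,
                     headForward: SIMD3<Float>,
                     distance: Float,
                     keepLevel: Bool = false) -> SIMD3<Float>? {
    var forward = headForward
    if keepLevel { forward.y = 0 }
    guard let direction = normalized(forward) else { return nil }
    return headPosition + direction * distance
}
```

The result is a world-space position, so it would presumably be applied with something like `setPosition(_:relativeTo:)` passing `nil`, after which the entity can be reparented while preserving its world transform.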
What is the current recommendation for creating high-quality 3D content?
The context is a hobbyist, specialised CAD app for macOS (with an iPadOS companion) that is mostly 2D but also offers a 3D visualization option (currently OpenGL).
Somewhere down the line there might be an AR view but at the moment - certainly for macOS - it's purely generated 3D visualization, all rendered content.
So, starting a rewrite of the 3D visualization in 2024 and targeting macOS Sequoia/iPadOS 18, is RealityKit the suggested way forward?
import SwiftUI
import RealityKit
import ARKit
import AVFoundation
struct ContentView : View {
var body: some View {
struct ARViewContainer: UIViewRepresentable {
func makeUIView(context: Context) -> ARView {
let arView = ARView(frame: .zero)
arView.session.delegate = context.coordinator
let worldConfig = ARWorldTrackingConfiguration()
worldConfig.planeDetection = .horizontal
// worldConfig.providesAudioData = true // Uncommenting this line triggers the ARSession error shown below
addTestEntity(arView: arView)
return arView
func updateUIView(_ uiView: ARView, context: Context) {}
func makeCoordinator() -> Coordinator {
class Coordinator: NSObject, ARSessionDelegate, ARSessionObserver {
func session(_ session: ARSession, didOutputAudioSampleBuffer audioSampleBuffer: CMSampleBuffer) {
func addTestEntity(arView: ARView) {
let mesh = MeshResource.generatePlane(width: 0.5, depth: 0.35)
guard let url = Bundle.main.url(forResource: "videoplayback", withExtension: "mp4") else { return }
let player = AVPlayer(url: url)
let videoMaterial = VideoMaterial(avPlayer: player)
let model = ModelEntity(mesh: mesh, materials: [videoMaterial])
model.transform.translation.y = 0.05
let anchor = AnchorEntity(.plane(.horizontal, classification: .any, minimumBounds: SIMD2<Float>(0.2, 0.2)))
ARSession <0x125d88040>: did fail with error: Error Domain=com.apple.arkit.error Code=102 "Required sensor failed." UserInfo={NSLocalizedFailureReason=A sensor failed to deliver the required input., NSUnderlyingError=0x302922dc0 {Error Domain=AVFoundationErrorDomain Code=-11819 "Cannot Complete Action" UserInfo={NSLocalizedDescription=Cannot Complete Action, NSLocalizedRecoverySuggestion=Try again later.}}, NSLocalizedRecoverySuggestion=Make sure that the application has the required privacy settings., NSLocalizedDescription=Required sensor failed.}
iOS 17.5.1
Xcode 15.4
I'm trying to attach one entity to another via the new PhysicsFixedJoint. I have a usdz that contains a skeletal pose, which exposes the joints as pins as desired. However, when I access a pin, it is returned as a GeometricPin instead of the EntityGeometricPin you would expect. I can't use the returned GeometricPin to create the joint.
Am I missing something? Shouldn't accessing the entity's pins object return an EntityGeometricPin instead of a GeometricPin?
Here is the code sample:
var body: some View {
RealityView { content in
if let scene = try? await Entity(named: "Scene", in: untitledBundle) {
let attack = try! Entity.load(named: "Attack01_SingleSword")
let anchor = scene.findEntity(named: "Root")
let sword = try! Entity.load(named: "OHS08_Sword")
if let swordEntity = findModelComponentEntity(entity: sword) {
let swordPin = swordEntity.pins.set(
named: "test", position: SIMD3<Float>.zero
if let attackEntity = findModelComponentEntity(entity: attack) {
let attackPin = attackEntity.pins["root/pelvis/spine_01/spine_02/spine_03/clavicle_r/upperarm_r/lowerarm_r/hand_r/weapon_r"]! // This returns a GeometricPin instead of the EntityGeometricPin that the "pins" object contains
let joint = PhysicsFixedJoint(
pin0: swordPin,
pin1: attackPin // This is a compile error since it is not an EntityGeometricPin type
try! joint.addToSimulation()
The WWDC24 video "Build a spatial drawing app with RealityKit" https://developer.apple.com/wwdc24/10104 at 12:04 includes a slide showing a Reality Composer Pro shader graph that features wonderful inline documentation comment boxes:
Are shader graph inline comments a new feature that Reality Composer Pro supports? This would be extraordinarily useful, as complex shader graphs can be challenging to decipher.
If so, how are inline shader graph comments created in Reality Composer Pro?
I'm choosing a framework for developing a game that doesn't involve augmented reality (AR) and I'm unsure whether to use SceneKit or RealityKit. I would like to hear from Apple engineers on this matter. Which of these frameworks is better suited for creating non-AR games?
Additionally, I'd like to know if it's possible to disable AR in RealityKit using the updated RealityView? Thanks in advance for your insights and recommendations!
I am trying to establish a workflow for making scenes with Reality Composer Pro. I am grey-boxing a scene using primitives at the moment.
I have set up a cube with a texture material and a simple animation to spin.
I am confused as to what I should be loading. I have created what I think is a scene asset in the package for the Reality Composer Project.
Here is a code snippet:
struct ContentView: View {
    var body: some View {
        RealityView { content in
            do {
                let scene = try await ModelEntity(named: "HOF")
            } catch {
                print("Error loading scene: \(error.localizedDescription)")
            }
        }
    }
}
Here is the project layout in Reality Composer Pro:
I'd like to create meshes in RealityKit (AR mode on iPad) in screen space, i.e. for UI.
I noticed a lot of useful new functionality in RealityKit for the next OS versions, including the OrthographicCameraComponent.
I think this would help, but I need AR world tracking as well as a regular perspective camera to work with the 3D elements.
Firstly, can I have a camera attached selectively to a few entities, just for those entities? This could be the orthographic camera.
Secondly, can I make it so those entities are always rendered in front, in screen space? (They'd need to follow the camera.)
If I can't have multiple cameras, what can be done in that case?
Is it actually better to use a completely different view / API for layering on top of RealityKit? I would much rather keep everything in RealityKit for simplicity, however.
I'm trying to render a large number of entities. It looks like each ModelEntity causes a draw call, even if you share the ModelComponent so that every entity shares the mesh and materials.
I tried to use the MeshInstanceCollection inside MeshResource to generate a large number of objects in the scene. The code works and draws many objects, but the draw count is still one call per instance. This seems strange: I would expect only one draw call for the single entity, since I have specified instancing in the resource.
Has anybody else successfully used instancing in RealityKit to draw a large number of entities (maybe around 10,000), or drawn this many items at 60fps any other way?
Here is some sample code that draws 100 cubes using instancing but still causes 100 draw calls.
func instanceTest(scene: RealityKit.Scene) {
var resource = MeshResource.generateBox(size: 0.2)
var contents = MeshResource.Contents()
contents.models = resource.contents.models
var arr: [MeshResource.Instance] = []
var matrix = matrix_identity_float4x4
matrix[3, 0] = 0.5
for i in 0..<100 {
let inst = MeshResource.Instance(id: "\(i)", model: "MeshModel", at: matrix)
contents.instances = MeshInstanceCollection(arr)
let updatedResource = try? MeshResource.generate(from: contents)
let unlitMaterial = UnlitMaterial(color: .red)
let modelEntity = ModelEntity(
mesh: updatedResource!,
materials: [unlitMaterial]
let anchor = AnchorEntity()
I am currently considering porting my AR game from SceneKit to RealityKit so it appears in 3D on visionOS, but one crucial question that determines whether it can even be ported is:
I need a tap on a 3D model to then be translated into a tap on the texture of that model.
On visionOS this would be the gaze of the person, so the question is whether I can get the point on a texture that the user is looking at (ideally continuously, so I can have a hover effect, but if not, at least when they tap their fingers together).
If that is not possible, is it possible to touch a RealityKit object and get the location of that touch on the texture?
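To make the requirement concrete, the geometric step (hit point on a triangle to texture coordinate) can be sketched with barycentric interpolation. This is only a minimal sketch: it assumes the hit triangle's vertex positions and per-vertex UVs can be obtained somehow, which is exactly the open question here, and `TexturedTriangle`, `dot3`, and `uv(at:in:)` are hypothetical names, not RealityKit API.

```swift
// Sketch only: interpolates texture coordinates at a point inside a triangle.
// How the hit triangle and its per-vertex UVs are obtained is left open.

struct TexturedTriangle {
    var a: SIMD3<Float>, b: SIMD3<Float>, c: SIMD3<Float>        // vertex positions
    var uvA: SIMD2<Float>, uvB: SIMD2<Float>, uvC: SIMD2<Float>  // matching UVs
}

func dot3(_ x: SIMD3<Float>, _ y: SIMD3<Float>) -> Float { (x * y).sum() }

/// Returns the UV at `point`, assumed to lie in the triangle's plane,
/// or nil if the triangle is degenerate (zero area).
func uv(at point: SIMD3<Float>, in t: TexturedTriangle) -> SIMD2<Float>? {
    let v0 = t.b - t.a, v1 = t.c - t.a, v2 = point - t.a
    let d00 = dot3(v0, v0), d01 = dot3(v0, v1), d11 = dot3(v1, v1)
    let d20 = dot3(v2, v0), d21 = dot3(v2, v1)
    let denom = d00 * d11 - d01 * d01
    guard abs(denom) > .ulpOfOne else { return nil }
    let v = (d11 * d20 - d01 * d21) / denom   // barycentric weight of vertex b
    let w = (d00 * d21 - d01 * d20) / denom   // barycentric weight of vertex c
    let u = 1 - v - w                         // barycentric weight of vertex a
    return t.uvA * u + t.uvB * v + t.uvC * w
}
```

Multiplying the resulting UV by the texture's pixel size would give the tapped pixel, subject to the texture's V-axis orientation.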
If you create a custom shader, you get access to a collection of uniform values. One is the uniforms::time() parameter, which is defined as "the number of seconds that have elapsed since RealityKit began rendering the current scene" in this doc: https://developer.apple.com/metal/Metal-RealityKit-APIs.pdf
Is there some way to get this value from Swift code? I want to animate a value in my shader based on the time, so I need the starting time value in order to interpolate the animation offset from that point. If I create a System, its update() function receives a SceneUpdateContext instance, which has a deltaTime property but no elapsedTime property that I would assume maps to the shader time() value.
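The obvious fallback I can think of is accumulating deltaTime myself, roughly as in the minimal sketch below. `ElapsedTimeTracker` is a hypothetical helper, and whether its running total would actually line up with the shader's time() value is an assumption I have not been able to confirm.

```swift
import Foundation

// Sketch only: accumulates per-frame delta times into a running total.
// It is an assumption that this total tracks the shader's uniforms::time().
struct ElapsedTimeTracker {
    private(set) var elapsed: TimeInterval = 0

    /// Call once per frame with that frame's delta time in seconds,
    /// e.g. the deltaTime from the context passed to a System's update().
    mutating func advance(by deltaTime: TimeInterval) {
        elapsed += deltaTime
    }
}
```

If the two clocks turn out not to match, the tracked value could instead be passed into the shader as a custom parameter so that only one clock is involved.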
I'm playing around with RealityKit to see if I can re-use the content for both iOS/macOS and visionOS with the new betas.
For the 2D devices, I'm looking at a more traditional, non-AR setup. I've fallen over at the first hurdle: dragging an object around on a plane, just as a test of how things all work.
I was trying to unproject from a plane to the view/window co-ordinates and move the box around based on the result.
Weirdly, the code below works if I angle the plane, but not if the plane (as I understand it) is 'flat' on the ground.
Am I doing this the wrong way?
It behaves in a similar fashion with both the default PerspectiveCameraComponent and OrthographicCameraComponent.
import SwiftUI
import RealityKit
struct ContentView: View {
var body: some View {
RealityView { content in
let cubemesh = MeshResource.generateBox(size: 0.2, cornerRadius: 0.05)
let cubeModel = ModelEntity(mesh: cubemesh)
cubeModel.generateCollisionShapes(recursive: false)
let cameraEntity = Entity()
let cameraPosition: SIMD3<Float> = [10, 10, 5]
let target: SIMD3<Float> = .zero
cameraEntity.look(at: target, from: cameraPosition, relativeTo: nil)
.gesture(DragGesture(coordinateSpace: .global)
.onChanged() { value in
let planeTransform = Transform(scale: SIMD3<Float>(1, 1, 1),
rotation: simd_quatf(angle: 0, axis: SIMD3<Float>(0, 1, 0)),
translation: SIMD3<Float>(0, 0, -1))
#if !os(visionOS)
if let placementPosition = value.unproject(value.location, from: .global, to: .scene, ontoPlane: (planeTransform.matrix))
print("projected value:", placementPosition)
value.entity.position.x = placementPosition.x
value.entity.position.y = placementPosition.y
value.entity.position.z = 0
#if os(visionOS)
#Preview("3D Device", windowStyle: .volumetric) {
if #available(visionOS 2.0, *) {
.frame(depth: 1300)
.frame(width: 1280)
.frame(height: 1280)
} else {
#Preview("2D Device") {
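For reference, the geometry I understand to be behind projecting a screen point onto a plane is a ray-plane intersection, sketched below. This is a minimal sketch of the general technique, not RealityKit's implementation of unproject; `intersect` and its parameters are hypothetical names, and whether the degenerate case shown here is related to the behaviour I'm seeing is only a guess.

```swift
// Sketch only: intersects a ray with an infinite plane given by a point
// on the plane and the plane's normal. All inputs share one coordinate space.
func intersect(rayOrigin: SIMD3<Float>, rayDirection: SIMD3<Float>,
               planePoint: SIMD3<Float>, planeNormal: SIMD3<Float>) -> SIMD3<Float>? {
    let denominator = (rayDirection * planeNormal).sum()
    // A ray (nearly) parallel to the plane never reaches it.
    guard abs(denominator) > 1e-6 else { return nil }
    let t = ((planePoint - rayOrigin) * planeNormal).sum() / denominator
    // A negative t means the plane lies behind the ray origin.
    guard t >= 0 else { return nil }
    return rayOrigin + rayDirection * t
}
```

The intersection fails when the view ray runs parallel to the plane or the plane sits behind the camera, so which axis the plane transform treats as its normal matters a great deal.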
The title says it all. I'm not seeing a way to utilize debugOptions when using the SwiftUI RealityView on iOS in the Xcode 16 Beta. Am I missing something? Seems to only be available when using ARView.
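I added a scroll view on top of a window and added a RealityView to it to implement a portal effect. I found that the PortalComponent effect does not work on some devices, and the rendering order is scrambled.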
I have code such as the following. The performance on the Vision Pro seems to get quite bad once I hit a few thousand of these models. It feels like I should be able to optimise this somehow, perhaps using instancing. Is that possible with RealityKit in visionOS 2?
let material = UnlitMaterial(color: .white)
let sphereModel = ModelEntity(
    mesh: .generateSphere(radius: 0.001),
    materials: [material])
for index in 0..<5000 {
    let point = generatedPoints[index]
    let model = sphereModel.clone(recursive: false)
    model.position = [point.x, point.y, point.z]
}
I have some entities which use attachments to show a label next to them. I would like to change this to only show the label when the entity is being looked at / hovered over. I have the new HoverEffect component on my entity that works nicely, but I can't see how I toggle the visibility of the labels.
Currently, it is not possible to achieve an occlusion effect for a model through depth reading and writing and rendering order. Even in the visionOS 2.0 beta, this cannot be achieved.
I have a visionOS app that uses DrawableQueue and CADisplayLink to update an Entity, a TextureResource tied to the drawable, and a Material that uses that TextureResource. The TextureResource gets updated when a video frame is ready. Material properties can get updated from the video or from other sources.
Current process: when each video frame is ready, we get the next drawable, render to it, present it, and make an Entity update (e.g. transform). However, I’m experiencing jitter in the rendered content where it seems that the updates to the entity and the drawable being presented are milliseconds off from each other.
Should I be using Drawable.presentOnSceneUpdate() to ensure all updates happen in the same update cycle? And if so, do you have any additional details on how to correctly use this function (the docs are unclear)?
I am using RealityKit along with ARKit and SwiftUI to develop an app in which I am augmenting a usdz model with complex geometry, such as a car.
I have some other usdz files with a simple plane geometry and the material properties embedded within them, which I am also loading as model entities.
I want to traverse my car usdz file so that I can pick the material from a simple usdz file and apply it to the car as car paint. To do this, I know the name of the mesh holding the car paint as well as the name of the material applied.
I have tried to traverse the usdz files using both RealityKit and SceneKit, but I have not managed to reach the lowest-level mesh and copy the material properties to it.
With RealityKit, I have tried to get the instance data from the model entity as follows: "sourceModel?.model?.mesh.contents.instances". But this returns only the instance ID, model name, and transform.
Any help will be highly appreciated.
