Hi,
I'm currently working on an ARKit project where I need to implement object occlusion on devices that do not have a LiDAR sensor (e.g., iPhone XR, iPhone 11).
I used CoreML models like DepthAnythingV2 to create depth maps and DETRResnet50SemanticSegmentationF16P8 to to perform real-time segmentation. But these models are too heavy for devices.
Much appreciated on any advice or pointers to resources.
Discuss spatial computing on Apple platforms and how to design and build an entirely new universe of apps and games for Apple Vision Pro.
Post
Replies
Boosts
Views
Activity
I am working on adding synchronized physical properties to EntityEquipment in TableTopKit, allowing seamless coordination during GroupActivities sessions between players.
Current Approach and Limitations
I have tried setting EntityEquipment's state to DieState and treating it as a TossableRepresentation object. This approach achieves basic physical properties synchronized across players. However, it has several limitations:
No Collision Detection Between Dice: Multiple dice do not collide with each other.
Shape Limitations: Custom shapes, like parallelepipeds, cannot be configured.
Below is my existing code for Base Entity Equipment without physical properties:
struct CubeWithPhysics: EntityEquipment {
let id: ID
let entity: Entity
var initialState: BaseEquipmentState
init(id: ID, entity: Entity) {
self.id = id
self.entity = entity
initialState = .init(parentID: .tableID, pose: .init(position: .zero, rotation: .zero), entity: self.entity)
}
}
I’d appreciate any guidance on the recommended approach to adding synchronized physical properties to EntityEquipment.
I'm writing code using ObjectAnchor for Vision OS. If an object is tracked, and then becomes not visible (either because the user looked in a different direction, or because the tracked object was occluded by another object), it is still tracked and you get anchor updates (e.g., object permanence).
For my application, it would be very helpful if I could determine if the object is currently being observed, or it is not currently observed and just assumed to be in the same location as seen previously.
ObjectAnchor.isTracked just seems to indicate whether it is getting anchor updates. I don't see anything in the ObjectAnchor or AnchorUpdate that would allow me to determine if the object is currently observed. Does anyone know of a way to do this, or would this be a feature request?
Regarding the Apple Vision Pro, is there a possibility to get a kind of rental device or leasing device from Apple?
Background:
I'm a fresh VisionOS dev.
I have a low budget.
I'm living in Japan. That means Vision Pro is roughly 150% the USD price.
Regards
I created an Object & Hand Tracking app based on the sample code released here by Apple.
https://developer.apple.com/documentation/visionos/exploring_object_tracking_with_arkit
The app worked great and everything was fine, but I realized I was coding on Xcode 16 beta 3, so I installed the latest Xcode 16 from the App Store and tested by app there, and it completely crashed. No idea why. Here is the console
dyld[1457]: Symbol not found: _$ss13withTaskGroup2of9returning9isolation4bodyq_xm_q_mScA_pSgYiq_ScGyxGzYaXEtYas8SendableRzr0_lF
Referenced from: <3AF14FE4-0A5F-381C-9FC5-E2520728FC65> /private/var/containers/Bundle/Application/F74E88F2-874F-4AF4-9D9A-0EFB51C9B1BD/Hand Tracking.app/Hand Tracking.debug.dylib
Expected in: <2F158065-9DC8-33D2-A4BF-CF0C8A32131B> /usr/lib/swift/libswift_Concurrency.dylib
It was working perfectly fine on Xcode 16 beta 3, which makes me think it's an Xcode 16 issue, but no idea how to fix this. I also installed Xcode 16.2 beta (the newest beta) but same error.
Please help if anyone knows what is wrong!
As mentioned in https://forums.developer.apple.com/forums/thread/756736?answerId=810096022#810096022
Is there any update about the full support to WebXR AR Module, which should enable immersive-ar mode?
Are the features such as DOM overlays and WebGPU bindings on the roadmap?
Is it possible to capture stereoscopic video either internally or externally or via airplay for debugging purposes?
Thanks
I am trying to create reflections and lighting in a fully virtual scene, and the way the documentation is written, the VirtualEnvironmentProbeComponent would be key. I cannot seem to get it to work, however. Has anyone got this figured out?
I'm seeking detailed information about the rotation matrix of the iPhone's front-facing (selfie) camera when using ARKit.
Specifically, I need to understand:
The exact rotation matrix applied to the front-facing camera's output in ARKit.
Whether this matrix is consistent across all iPhone models or if there are variations.
If there are any transformations applied to align the camera's coordinate system with the device's orientation, particularly in portrait mode.
How this rotation matrix relates to the transform property of `ARFrame.camera
Hi, I’m working on a portfolio project for Vision Pro these days.
I have two projects and each of projects are made with Unity and made with Xcode(using ARKit and RealityKit tracking feature). Is it able to combine these two projects in an app?
For example, using the buttons made with SwiftUI in a Reality Composer Pro, jumping to a scene in Unity, and back from a scene in Unity to a scene in Reality Composer Pro in an app.
Hi, currently tinkering with a little shareplay app for the Vision Pro that allows people to facetime and shareplay to play with random 3d models (as well as move them around, which should sync the model positions for everyone in relative space).
When the users start their facetime call, then open the immersive space to see the 3d models, the models load in properly in context of the group immersive space's coordinate system, and moving the models reflects the new positions real-time for each participant.
The main issue comes if/when users use the digital crown to re-center their view. It appears to re-center the model and view, which is expected. However, it also seems to re-position the model/root entity to match the user's origin. Not sure if this is intentional or not, but this essentially makes it so that it "de-syncs" the model (so me moving the model next to someone does not reflect it 1:1 - it still moves properly, but the new "initial" position after re-centering makes it offset).
Is there a potential solution or work-around for this such that re-centering the view doesn't de-sync the model/entity's position?
Rough code for my RealityView component is below:
RealityView { content, attachments in
content.add(appModel.originEntity)
appModel.originEntity.addChild(appModel.modelContainerEntity)
appModel.setInitialModelPosition()
configureGestures(forModel: appModel.modelContainerEntity)
configureToolbarAttachment(content: content, attachments: attachments)
} update: { content, _ in
// I have modified the Apple provided gesture components to
// send the app model the new positions/rotations
// as well as broadcast the position/rotation to shareplay participants
// When user re-centers view, it seems to also re-position the model
// so that its origin is at the local user's origin, rather than
// the original origin
// Can we receive a notification that user has re-centered view?
// Or some other work-around?
appModel.modelContainerEntity.setPosition(appModel.modelState.position, relativeTo: nil)
appModel.modelContainerEntity.setOrientation(.init(appModel.modelState.rotation3d), relativeTo: nil)
} attachments: {
Attachment(id: "customViewAttachment") {
CustomView()
}
}
.installGestures()
Please let me know if anything wasn't clear or if more information is needed. Thanks!
if you check the code here,
https://developer.apple.com/documentation/compositorservices/interacting-with-virtual-content-blended-with-passthrough
var body: some Scene {
ImmersiveSpace(id: Self.id) {
CompositorLayer(configuration: ContentStageConfiguration()) { layerRenderer in
let pathCollection: PathCollection
do {
pathCollection = try PathCollection(layerRenderer: layerRenderer)
} catch {
fatalError("Failed to create path collection \(error)")
}
let tintRenderer: TintRenderer
do {
tintRenderer = try TintRenderer(layerRenderer: layerRenderer)
} catch {
fatalError("Failed to create tint renderer \(error)")
}
Task(priority: .high) { @RendererActor in
Task { @MainActor in
appModel.pathCollection = pathCollection
appModel.tintRenderer = tintRenderer
}
let renderer = try await Renderer(layerRenderer,
appModel,
pathCollection,
tintRenderer)
try await renderer.renderLoop()
Task { @MainActor in
appModel.pathCollection = nil
appModel.tintRenderer = nil
}
}
layerRenderer.onSpatialEvent = {
pathCollection.addEvents(eventCollection: $0)
}
}
}
.immersionStyle(selection: .constant(appModel.immersionStyle), in: .mixed, .full)
.upperLimbVisibility(appModel.upperLimbVisibility)
the only way it's dealing with the error is fatalError.
And don't think I can throw anything or return anything else?
Is there a way I can gracefully handle this and show a message box in UI?
I was hoping I could somehow trigger a failure and have https://developer.apple.com/documentation/swiftui/openimmersivespaceaction return fail.
but couldn't find a nice way to do so.
Let me know if you have ideas.
As the title states, this severely limits the flexibility of multi-window applications in creating a good user experience.
Even effects like the ones shown below cannot be achieved.
Hi everyone,
I'm looking for some guidance on how to create an .arobject file. Is there a way to generate it from a 3D model (like a .objcap or .usdz or .fbx/.obj file)? Or can it only be generated by scanning a real-world object using the ARKitScanner project? Any advice or resources on this would be greatly appreciated!
I've only found this project for scanning real-world objects.
Thanks in advance!
Hello Apple Team,
Is it possible to change the zoom factor, exposure, white balance and other settings, of an iOS ARKit session?
I know how to do it using an AVCaptureSession,
however, I can't figure out how to access the AVCaptureDeviceInput of the current AR session.
Thanks
PS: I'm using ARkit and RealityKit on iOS 17
I am new here, my name is Axel - Hi! 👋 I am both a visionOS beginner and an expert in things 3D, now figuring my way into Reality Composer Pro. Most things are intuitive and easy to understand, yet there is one thing I cannot seem to figure out, and I feel really stupid because it must be there, and that's Keyframing: I would simply make a new Timeline and animate published parameters in a Shader Graph over time.
I know how to do this via Xcode and Custom Components, but that can't be it because that will break even on simple to medium animation in terms of previewing and fine-tuning, complete overkill for a simple value animation. Since key framing is the most basic functionality of 3D apps next to move, I am sure it is in RCP somewhere.
Anyone got a pointer for me?
I found that there is such a click-to-expand horizontally and smoothly effect in the system application called "message", which is good. I wonder if I can add a similar effect to my own app. If possible, are there any implementation ideas or examples that I can refer to? Thanks!
Hi everyone,
I need to synchronize the playback of RealityKit Timelines via SharePlay.
To do this I am trying to get the references of the timelines using "AnimationPlaybackController" and "AnimationResource". In my realitykit scene I have configured both an animation (with blender), and a timeline, the animation starts correctly when the realitykit scene starts, the timeline not.
Below the code:
struct ContentView: View {
@State private var subscriptions = [EventSubscription]()
@Environment(AppModel.self)
private var appModel
let rootEntity = Entity()
@State var testEntity: Entity?
@State var testAnimation: AnimationResource?
@State var testController: AnimationPlaybackController?
init() {
CubeComponent.registerComponent()
}
var body: some View {
RealityView { content in
content.add(rootEntity)
if let scene = try? await Entity(named: "Room", in: realityKitContentBundle) {
rootEntity.addChild(scene)
playAnimations(from: content)
}
}
.gesture(SpatialTapGesture().targetedToAnyEntity()
.onEnded({ value in
_ = value.entity.applyTapForBehaviors()
if let testEntity, let testAnimation {
testController = testEntity.playAnimation(testAnimation.repeat())
}
})
)
}
func playAnimations(from content: RealityViewContent) {
subscriptions.append(content.subscribe(to: ComponentEvents.DidAdd.self, componentType: AnimationLibraryComponent.self, { event in
let entity = event.entity
entity.components[AnimationLibraryComponent.self]?.animations.forEach({ (key, value) in
if value.definition is AnimationGroup {
if key == "/Room/TestTimeline" {
let controller = entity.playAnimation(value.repeat())
testEntity = entity
testAnimation = value
appModel.syncronizedAnimations[key] = .init(name: key, animationController: controller, entityName: entity.name)
}
} else {
if entity.name == "SphereInteractable" {
let controller = entity.playAnimation(value.repeat())
appModel.syncronizedAnimations[key] = .init(name: key, animationController: controller, entityName: entity.name)
}
}
})
}))
}
}
the variables testEntity, testAnimation and testController are for testing purposes only. If I try to start the animations in the playAnimations function, only the animation created via blender starts (the one related to the object "SphereInteractable"), the Timeline starts only if I save a reference and I play it with a tap gesture or with a delay of ! seconds with DispatchQueue.asyncAfter called in the onAppear.
is there a better way to handle this? The goal is to have a reference of the AnimationPlaybackController of the timeline, in order to sync the animation via shareplay.
Thanks
I've tried importing a 3D KTX file into Reality Composer Pro ShaderGraph, but got nothing output. Realitykit doesn't support Image3D yet?
First, I use pyktx to create a 3D KTX texture file and it seems correct in Finder preview.
import numpy as np
from pyktx import KtxTexture2, KtxTextureCreateInfo, KtxTextureCreateStorage, VkFormat
size = 32
image = np.zeros((size, size, size, 3), dtype=np.uint8)
for i in range(size):
image[i, :, :, 0] = np.interp(i, [0, size], [50, 255])
image[:, i, :, 1] = np.interp(i, [0, size], [100, 255])
image[:, :, i, 2] = np.interp(i, [0, size], [150, 255])
info = KtxTextureCreateInfo(
gl_internal_format=None,
vk_format=VkFormat.VK_FORMAT_R8G8B8_SRGB,
base_width=image.shape[2],
base_height=image.shape[1],
base_depth=image.shape[0],
num_dimensions=3,
num_levels=1,
num_layers=1,
num_faces=1,
generate_mipmaps=False,
is_array=False,
)
texture = KtxTexture2.create(info, KtxTextureCreateStorage.ALLOC)
for _ in range(1):
texture.set_image_from_memory(0, 0, _, image[_].tobytes())
texture.write_to_named_file(f'{size}.ktx')
Then I import it into ShaderGraph like this but seems nothing could be read.
In addition, I tested 2D KTX file and it worked.
I also tested different vk_format or KTX1/KTX2 but did not work.
I also tested Image3DPixel/Image2DArray/Image3DRead and none of them work as long as it's 3D.
I am able to create a custom node graph by selecting nodes and then choosing the "Compose Node Graph" option in the context menu. After that, when I select my custom node graph, I see in the top-right panel that it is possible to define inputs and outputs. However, I was not able to figure out how to link those to the inputs and outputs in the underlying nodes.
Hi, I am trying to stream spatial video in realtime from my iPhone 16.
I am able to record spatial video as a file output using:
let videoDeviceOutput = AVCaptureMovieFileOutput()
However, when I try to grab the raw sample buffer, it doesn't include any spatial information:
let captureOutput = AVCaptureVideoDataOutput()
//when init camera
session.addOutput(captureOutput)
captureOutput.setSampleBufferDelegate(self, queue: sessionQueue)
//finally
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
//use sample buffer (but no spatial data available here)
}
Is this how it's supposed to work or maybe I am missing something?
this video: https://developer.apple.com/videos/play/wwdc2023/10071 gives us a clue towards setting up spatial streaming and I've got the backend all ready for 3D HLS streaming. Now I am only stuck at how to send the video stream to my server.