Hi,
since iOS 15 I've repeatedly noticed the console warning »ARSessionDelegate is retaining X ARFrames. This can lead to future camera frames being dropped« even for rather simple projects using RealityKit and ARKit. Could someone from the ARKit team please elaborate what causes this warning and what can be done to avoid it?
If I remember correctly I didn't even assign an ARSessionDelegate.
Thank you!
ARKit
RSS for tagIntegrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Created
Hi everyone! I am working on AR app and wanted to implement object occlusion because it removes drift pretty much from the object. This working great with RealityKit sample But I am unable to replicate such behaviour it with scenekit. Because scenekit does not offer object occlusion. Can we say scenekit is getting depricated, and we should re-write app in RealityKit (which is obviously a big task)?
Error:
RoomCaptureSession.CaptureError.exceedSceneSizeLimit
Apple Documentation Explanation:
An error that indicates when the scene size grows past the framework’s limitations.
Issue:
This error is popping up in my iPhone 14 Pro (128 GB) after a few roomplan scans are done. This error shows up even if the room size is small. It occurs immediately after I start the RoomCaptureSession after the relocalisation of previous AR session (in world tracking configuration). I am having trouble understanding exactly why this error shows and how to debug/solve it.
Does anyone have any idea on how to approach to this issue?
We are working on a world scale AR app that leverages the device location and heading to place objects in the streets, so that they are correctly and stably anchored to certain locations.
Since the geo-tracking imagery is only available in certain cities and areas, we are trying to figure out how to fallback when geo-tracking is not available as the device move away, to still retain good AR camera accuracy. We might need to come up with some algorithm using the device GPS, to line up the ARCamera with our objects.
Question: Does geo-tracking always provide greater than or equal to the accuracy of world tracking, for a GPS outdoor AR experience?
If so, we can simply use the ARGeoTrackingConfiguration for the entire time, and rely on the ARView keeping itself aligned. Otherwise, we need to switch between it and ARWorldTrackingConfiguration when geo-tracking is not available and/or its accuracy is low, then roll our own algorithm to keep the camera aligned.
Thanks.
I am working with MeshAnchors, and I am having troubles getting to the classification of the triangles/faces.
This post references the MeshAnchor.Geometry, and that struct does have a property named "classifications", but it is of type GeometrySource. I cannot find any classification information in GeometrySource. Am I missing something there?
I think I am looking for something of type MeshAnchor.MeshClassification, but I cannot find any structs with this as a property.
Hello Community,
I'm encountering an issue with the latest iOS 17 update, specifically related to RoomPlan version-2. In iOS 16, when using RoomPlan version-1, we were able to display stairs in our app. However, after upgrading to iOS 17 and implementing RoomPlan version-2, the stairs are no longer visible.
Despite thorough investigation, I couldn't find any option within the code to show or hide stairs, or any other objects for that matter. It seems like a specific issue with the update rather than a coding error on our part.
Has anyone else encountered a similar problem? If so, I would greatly appreciate any insights or solutions you might have. It's crucial for our app functionality to have stairs displayed accurately, and we're currently at a loss on how to address this issue.
Thank you in advance for any assistance you can provide.
Best regards
What is the reason the hand-tracking joints have these axes? I'm trying to create a virtual hands model and that's a mess.
Hello All,
I'm desperate to found a solution and I need your help please.
I've create a simple cube in Vision OS. I can get it by hand (close my hand on it) and move it pretty where I want. But, I would like to throw it (exemple like a basket ball). Not push it, I want to have it in hand and throw it away of me with a velocity and direction = my hand move (and finger opened to release it).
Please put me on the wait to do that.
Cheers and thanks
Mathis
Topic:
Spatial Computing
SubTopic:
ARKit
After upgrading to Xcode 16 my app, which utilizes imported project files from my iPad's Reality Composer app, now has two issues that I have found so far. I am using an ARView as a UIViewRepresentable with SwiftUI. (Prior to upgading to Xcode 16 everything worked well.)
First, there are now several duplicate rcp_export.usdz resources in the "Copy Bundle Resources" build phase section. Even though each file is in a separate folder with a unique UUID, it was causing a compile error saying there are duplicate files. I was able to open the RC project folder and delete the older rcp_project versions which now allows the app to compile. I mention it as it may or may not be related to the second issue.
Second, Xcode isn't generating the project code for rcproject, so when I call the RCProject.loadSceneAsync function I am getting an error that says "Cannot find 'RCProject' in scope"
I am working on a project that requires access to the main camera on the Vision Pro. My main account holder applied for the necessary enterprise entitlement and we were approved and received the Enterprise.license file by email. I have added the Enterprise.license file to my project, and manually added the com.apple.developer.arkit.main-camera-access.allow entitlement to the entitlement file and set it to true since it was not available in the list when I tried to use the + Capability button in the Signing & Capabilites tab.
I am getting an error: Provisioning profile "iOS Team Provisioning Profile: " doesn't include the com.apple.developer.arkit.main-camera-access.allow entitlement. I have checked the provisioning profile settings online, and there is no manual option for adding the main camera access entitlement, and it does not seem to be getting the approval from the license.
We are using the ARKit image tracking feature on visionOS 2.0 with three pre-registered images. The image tracking works, but only one image is actively tracked at a time. When more than one target image is visible to the camera, it has difficulty detecting and tracking the other images.
Is this the expected behavior in visionOS, or is there something we need to do to resolve this issue?
We're developing a VisionOS application, where we would like to do product recognition (like food items).
We have enterprise entitlements and therefore also main camera access for VisionOS. We send this live camera frames to a trained CoreML model where we will receive 2D coordinates from the model detection prediction.
Now, we would like to create a 3D anchor on the detected items so it can be visible for user. The 3D anchor is going to be the class name of the detected item.
How do we transform this 2D coordinate from the model prediction to a 3D anchor?
I was watching the Developer videos, and there was mention that RealityView handles persistent world data differently and also automatically for us.
I am having an issue finding the material I need to get up to speed on that.
In ARKit, I was able to place a model with the world data and recall that .map data. It even stored a reference image for the scene to help match the world data.
I'm looking for the information on how to implement and work with those same features with RealityView, as it seems to be better/automatically integrated?
I need help being pointed in the right direction. Sample code would be amazing.
Topic:
Spatial Computing
SubTopic:
ARKit
I'm trying to implement a prototype to render virtual objects in a mixed immersive space on the camer frames captured by CameraFrameProvider.
Here are what I have done:
Get camera's instrinsics from frame.primarySample.parameters.intrinsics
Get camera's extrinsics from frame.primarySample.parameters.extrinsics
Get the device anchor by worldTrackingProvider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime())
Setup a RealityKit.RealityRenderer to render virtual objects on the captured camera frames
let realityRenderer = try RealityKit.RealityRenderer()
realityRenderer.cameraSettings.colorBackground = .outputTexture()
let cameraEntity = PerspectiveCamera()
// see https://developer.apple.com/forums/thread/770235
let cameraTransform = deviceAnchor.originFromAnchorTransform * extrinsics.inverse
cameraEntity.setTransformMatrix(cameraTransform, relativeTo: nil)
cameraEntity.camera.near = 0.01
cameraEntity.camera.far = 100
cameraEntity.camera.fieldOfViewOrientation = .horizontal
// manually calculated based on camera intrinsics
cameraEntity.camera.fieldOfViewInDegrees = 105
realityRenderer.entities.append(cameraEntity)
realityRenderer.activeCamera = cameraEntity
Virtual objects, which should be seen in the camera frames, are clipped out by the camera transform.
If I use deviceAnchor.originFromAnchorTransform as the camera transform, virtual objects can be rendered on camera frames at wrong positions (I think it is because the camera extrinsics isn't used to adjust the camera to the correct position).
My question is how to use the camera extrinsic matrix for this purpose?
Does the camera extrinsics point to a similar orientation of the device anchor with some minor rotation and postion change? Here is an extrinsics from a camera frame. It seems that the direction of Y-axis and Z-axis are flipped by the extrinsics. So the camera is point to a wrong direction.
simd_float4x4([[0.9914258, 0.012555369, -0.13006608, 0.0], // X-axis
[-0.0009778949, -0.9946325, -0.10346654, 0.0], // Y-axis
[-0.13066702, 0.10270659, -0.98609203, 0.0], // Z-axis
[0.024519, -0.019568002, -0.058280986, 1.0]]) // translation
Subject: Combining ARKit Face Tracking with High-Resolution AVCapture and Perspective Rendering on Front Camera
Message:
Hello Apple Developer Community,
We’re developing an application using the front camera that requires both real-time ARKit face tracking/guidance and the capture of high-resolution still images via AVCaptureSession. Our goal is to leverage ARKit’s depth and face data to render a captured image from another perspective post-capture, maintaining high image quality.
Our Approach:
Real-Time ARKit Guidance:
Utilize ARKit (e.g., ARFaceTrackingConfiguration) for continuous face tracking, depth, and scene understanding to guide the user in real time.
High-Resolution Capture Transition:
At the moment of capture, we plan to pause the ARKit session and switch to an AVCaptureSession to take a high-resolution image.
We assume that for a front-facing image, the subject’s face is directly front-on, and the relative pose between the face and camera remains the same during the transition. The only variation we expect is a change in distance.
Our intention is to minimize the delay between the last ARKit frame and the high-res capture to maintain temporal consistency, assuming that aside from distance, the face-camera relative pose remains unchanged.
Post-Processing Perspective Rendering:
Using the last ARKit face data (depth, pose, and landmarks) along with the high-resolution 2D image, we aim to render the scene from another perspective.
We want to correct the perspective of the 2D image using SceneKit or RealityKit, leveraging the collected ARKit scene information to achieve a natural, high-quality rendering from a different viewpoint.
The rendering should match the quality of a normally captured high-resolution image, adjusting for the difference in distance while using the stored ARKit data to correct perspective.
Our Questions:
Session Transition Best Practices:
What are the recommended best practices to seamlessly pause ARKit and switch to a high-resolution AVCapture session on the front camera
How can we minimize user movement or other issues during this brief transition, given our assumption that the face-camera pose remains largely consistent except for distance changes?
Data Integration for Perspective Rendering:
How can we effectively integrate stored ARKit face, depth, and pose data with the high-res image to perform accurate perspective correction or rendering from another viewpoint?
Given that we assume the relative pose is constant except for distance, are there strategies or APIs to leverage this assumption for simplifying the perspective transformation?
Perspective Correction with SceneKit/RealityKit:
What techniques or workflows using SceneKit or RealityKit are recommended for correcting the perspective of a captured 2D image based on ARKit scene data?
How can we use these frameworks to render the high-resolution image from an alternative perspective, while maintaining image quality and fidelity?
4. Pitfalls and Guidelines:
What common pitfalls should we be aware of when combining ARKit tracking data with high-res capture and post-processing for perspective rendering?
Are there performance considerations, recommended thresholds for acceptable temporal consistency, or validation techniques to ensure the ARKit data remains applicable at the moment of high-res capture?
We appreciate any advice, sample code references, or documentation pointers that could assist us in implementing this workflow effectively.
Thank you!
I have recently started testing ARKit on an iPhone 16 Pro and I have noticed that the AutoFocus reaction on this device is much slower than other devices. For example, if I point the camera to a close object AutoFocus takes 4-5 seconds to stabilize, the focal length is adjusted very very slowly. In some cases (although this is rare) AutoFocus seems almost stuck and requires a bit of device movement to trigger.
This is quite problematic when using some ARKit features like Image and Object detection as the detection algorithms struggle with out-of-focus images.
This problem is limited to ARKit. AutoFocus is significantly more responsive when the standard AVFoundation Camera API is used.
This behavior is easy to reproduce with any of the ARKit samples like https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/tracking_and_visualizing_planes
Is anybody else experiencing this problem?
PLATFORM AND VERSION
Vision OS
Development environment: Xcode 16.2, macOS 15.2
Run-time configuration: visionOS 2.3 (On Real Device, Not simulator)
Please someone confirm I'm not crazy and this issue is actually out of my control.
Spent hours trying to fix my app and running profiles because thought it was an issue related to my apps performance. Finally considered chance it was issue with API itself and made sample app to isolate problem, and it still existed in it. The issue is when a model entity moves around in a full space that was launched when the system environment immersion was turned up before opening it, the entities looks very choppy as they move around. If you take off the headset while still in the space, and put it back on, this fixes it and then they move smoothly as they should. In addition, you can also leave the space, and then turn the system environment immersion all the way down before launching the full space again, this will also make the entity moves smoothly as it should. If you launch a mixed immersion style instead of a full immersion style, this issue never arrises. The issue only arrises if you launch the space with either a full style, or progressive style, while the system immersion level is turned on.
STEPS TO REPRODUCE
https://github.com/nathan-707/ChoppyEntitySample
Open my test project, its a small, modified vision os project template that shows it clearly.
otherwise:
create immersive space with either full or progressive immersion style.
setup a entity in kinematic mode, apply a velocity to it to make it pass over your head when the space appears.
if you opened the space while the Apple Vision Pros system environment was turned up, the entity will look choppy.
if you take the headset off while in the same space, and put it back on, it will fix the issue and it will look smooth.
alternatively if you open the space with the system immersion environment all the way down, you will also not run into the issue. Again, issue also does not happen if space launched is in mixed style.
We’re using the enterprise API for spatial barcode/QR code scanning in the Vision Pro app, but we often get invalid values for the barcode anchor from the API, leading to jittery barcode positions in the UI. The code we’re using is attached below.
import SwiftUI
import RealityKit
import ARKit
import Combine
struct ImmersiveView: View {
@State private var arkitSession = ARKitSession()
@State private var root = Entity()
@State private var fadeCompleteSubscriptions: Set = []
var body: some View {
RealityView { content in
content.add(root)
}
.task {
// Check if barcode detection is supported; otherwise handle this case.
guard BarcodeDetectionProvider.isSupported else { return }
// Specify the symbologies you want to detect.
let barcodeDetection = BarcodeDetectionProvider(symbologies: [.code128, .qr, .upce, .ean13, .ean8])
do {
try await arkitSession.requestAuthorization(for: [.worldSensing])
try await arkitSession.run([barcodeDetection])
print("Barcode scanning started")
for await update in barcodeDetection.anchorUpdates where update.event == .added {
let anchor = update.anchor
// Play an animation to indicate the system detected a barcode.
playAnimation(for: anchor)
// Use the anchor's decoded contents and symbology to take action.
print(
"""
Payload: \(anchor.payloadString ?? "")
Symbology: \(anchor.symbology)
""")
}
} catch {
// Handle the error.
print(error)
}
}
}
// Define this function in ImmersiveView.
func playAnimation(for anchor: BarcodeAnchor) {
guard let scene = root.scene else { return }
// Create a plane sized to match the barcode.
let extent = anchor.extent
let entity = ModelEntity(mesh: .generatePlane(width: extent.x, depth: extent.z), materials: [UnlitMaterial(color: .green)])
entity.components.set(OpacityComponent(opacity: 0))
// Position the plane over the barcode.
entity.transform = Transform(matrix: anchor.originFromAnchorTransform)
root.addChild(entity)
// Fade the plane in and out.
do {
let duration = 0.5
let fadeIn = try AnimationResource.generate(with: FromToByAnimation<Float>(
from: 0,
to: 1.0,
duration: duration,
isAdditive: true,
bindTarget: .opacity)
)
let fadeOut = try AnimationResource.generate(with: FromToByAnimation<Float>(
from: 1.0,
to: 0,
duration: duration,
isAdditive: true,
bindTarget: .opacity))
let fadeAnimation = try AnimationResource.sequence(with: [fadeIn, fadeOut])
_ = scene.subscribe(to: AnimationEvents.PlaybackCompleted.self, on: entity, { _ in
// Remove the plane after the animation completes.
entity.removeFromParent()
}).store(in: &fadeCompleteSubscriptions)
entity.playAnimation(fadeAnimation)
} catch {
print("Error")
}
}
}
I can add a WorldAnchor to a WorldTrackingProvider. The next time I start my app, the WorldAnchor is added back, and then is immediately removed:
dataProviderStateChanged(dataProviders: [WorldTrackingProvider(0x0000000300bc8370, state: running)], newState: running, error: nil)
AnchorUpdate(event: added, timestamp: 43025.248134708, anchor: WorldAnchor(id: C0A1AE95-F156-45F5-9030-895CAABF16E9, isTracked: true, originFromAnchorTransform: <translation=(0.048458 0.000108 -0.317565) rotation=(0.00° 15.44° -0.00°)>))
AnchorUpdate(event: removed, timestamp: 43025.348131208, anchor: WorldAnchor(id: C0A1AE95-F156-45F5-9030-895CAABF16E9, isTracked: false, originFromAnchorTransform: <translation=(0.000000 0.000000 0.000000) rotation=(-0.00° 0.00° 0.00°)>))
It always leaves me with zero anchors in .allAnchors...the ARKitSession is still active at this point
Topic:
Spatial Computing
SubTopic:
ARKit
I use ARKit to build an app, scan rooms to collect the spatial data of objects and re-construct the 3D scene.
the problem is I found the depth map values captured in ARFrame significantly deviate from the real distances, even nonlinearly, for the distances below 1.5m, values are basically correct, but beyond 1.5m, they are smaller than real values. for example read 1.9m from the generated depthmap.tiff, but real distance is 3 meters.
below is my code of generating tiff file to record depth map data:
Generated TIFF file (captured from ARKit):
as shown above, the maximum distance is around 1.9m, but real distance to that wall is more than 3 meters, and also you can see, the depth map picture captured in ARKit is quite blurry, particularly at far distance (> 2.0m), almost smeared out.
Generated TIFF file (captured from AVFoundation):
In comparison, the depth map captured from traditional AVFoundation and with the same hardware device is much clear, the values seem not in meter unit though.