Accessing SLAM internal data

Hi all,


Are there any plans to give developers access to the SLAM algorithm's internal world-tracking data?


I know that from an ARFrame object I have access to an ARPointCloud, but this is very basic information. I would like deeper access to the other things the world-tracking algorithm works with: for example, persistent IDs for the points in the ARPointCloud, and ideally the attached textures or feature information stored with each point from different perspectives (this depends, of course, on how SLAM is actually implemented).
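For context, here is roughly how far the current API goes. This is a minimal sketch (assuming a standard ARSessionDelegate setup) that reads the per-frame point cloud from rawFeaturePoints; each frame delivers a fresh set of positions, with nothing that ties a point to "the same" point in the next frame.

```swift
import ARKit

// Minimal sketch: read the per-frame feature points that ARFrame already exposes.
// Each frame delivers a fresh ARPointCloud; nothing here relates a point in one
// frame to the corresponding point in the next frame.
class PointCloudReader: NSObject, ARSessionDelegate {

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard let cloud = frame.rawFeaturePoints else { return }

        // World-space positions of the feature points detected in this frame.
        for point in cloud.points {
            // e.g. accumulate into your own structure, draw debug geometry, etc.
            print(point)
        }
    }
}
```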


The most obvious use case I can think of is further 3D object reconstruction to improve the AR experience: for example, building a mesh from the point cloud to represent the world in 3D, even if just for masking/occlusion. I'm sure there are plenty of other use cases as well.
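To be concrete about the occlusion part: once you have such a mesh, SceneKit can already treat it as a pure depth mask. A rough sketch, assuming the reconstructed mesh arrives from elsewhere as an SCNGeometry:

```swift
import SceneKit

// Sketch: use an arbitrary reconstructed mesh as an occluder.
// The material writes depth but no colour, so virtual objects behind the mesh
// are hidden while the camera image still shows through.
func makeOccluderNode(from geometry: SCNGeometry) -> SCNNode {
    let material = SCNMaterial()
    material.colorBufferWriteMask = []   // draw no colour at all
    material.writesToDepthBuffer = true  // but still occlude by depth
    geometry.materials = [material]

    let node = SCNNode(geometry: geometry)
    node.renderingOrder = -1             // render before the visible content
    return node
}
```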


At the very least, it would be great to have persistent IDs available along with the point cloud, so that I could build up my own copy of the world feature points in 3D.


Thanks

Damian

> "I would like" - "would be great to have" - "so that I could"


Interesting feedback, thanks.


Sounds like good points to make via a feature request using the bug reporter.

Given that ARKit is a new API, I'd expect more advanced features to follow in later releases. However, I highly doubt we'll ever get access to the raw keypoint descriptors, as this would (1) expose details about the underlying SLAM implementation and (2) be useful only to a handful of researchers and/or SDK developers. Instead, what I expect we'll see in "ARKit 2" is some form of high-level API for keypoint persistence, along the lines of Project Tango's "Area Learning" feature.

Like I wrote in my post, if we had some sort of persistent ID for each point we could build up a mesh roughly representing the surface of the real world, to use in a game environment. The advantages of being able to do this should be obvious, but to spell it out: it's one thing to project a game board onto a flat surface in the real world; it's quite another to use the entire room as a game field. (Imagine snipers hiding behind pillows, or a racer where the cars drive around obstacles.) Yes, we could wait for Apple to give this to us. Or they could just provide a little more detail, and we could build it, or a rough approximation of it, ourselves. Really, it shouldn't be much work to assign my own unique IDs to points so they stay consistent between frames (including tracking dropped or merged points), but Apple already has this internally, and exposing it wouldn't reveal anything about the tracking algorithm. (Besides which, SLAM algorithms are all well-known by now.)
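To make that concrete, here is a deliberately naive sketch of the bookkeeping I mean: match each frame's points to the previous frame's by nearest neighbour within a small radius and hand out my own IDs. The radius and the matching strategy are arbitrary assumptions on my part; real SLAM matches on descriptors, and position-only matching will mislabel points, but it shows how little API surface the feature would need.

```swift
import simd

// Naive sketch: assign our own persistent IDs to feature points by matching
// each frame's points against the previous frame's by distance.
// Position-only matching will mislabel points; this only illustrates the idea.
final class PointTracker {
    private var tracked: [UInt64: SIMD3<Float>] = [:]  // id -> last known position
    private var nextID: UInt64 = 0
    private let matchRadius: Float = 0.02               // 2 cm, arbitrary threshold

    /// Returns the ID assigned to each point in `points`, in order.
    func update(with points: [SIMD3<Float>]) -> [UInt64] {
        var updated: [UInt64: SIMD3<Float>] = [:]
        var ids: [UInt64] = []

        for p in points {
            // Closest previously tracked point, if any.
            let nearest = tracked.min { simd_distance($0.value, p) < simd_distance($1.value, p) }
            if let nearest = nearest,
               simd_distance(nearest.value, p) < matchRadius,
               updated[nearest.key] == nil {
                updated[nearest.key] = p                 // same point: keep its ID
                ids.append(nearest.key)
            } else {
                updated[nextID] = p                      // new point: new ID
                ids.append(nextID)
                nextID += 1
            }
        }

        tracked = updated                                // points unseen this frame are dropped
        return ids
    }
}
```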

I'm not representing Apple, so all of this is speculation anyway. In my opinion, the most realistic scenario is for the ARKit team to provide a high-level API further down the road that outputs a triangulated mesh (similar to other SDKs, such as the Structure Sensor SDK from Occipital). I also disagree that "SLAM algorithms are all well-known by now". While that may be true for the general principles behind SLAM, visual odometry with multi-sensor fusion (as it's used in ARKit) is an active area of research, with various subtle and not-so-subtle algorithmic differences between implementations.
