Object recognition and tracking on visionOS

Hello!

I would like to develop a visionOS application that tracks a single object in the user's environment. Skimming through the documentation, I found that this feature is currently unsupported in ARKit on visionOS (only image recognition is available). However, it seems it should be doable by combining the Core ML and Vision frameworks. So I have a few questions:

  1. Is this the best approach, or is there a simpler solution? (A sketch of the pipeline I have in mind follows below.)
  2. What is the best way to train a Core ML model without access to the device? Would videos recorded with an iPhone 15 be enough?
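
For reference, this is roughly the Vision + Core ML pipeline I have in mind. It's only a minimal sketch: `ObjectDetector` is a placeholder for whatever class Xcode generates from a trained detection model, and the frame source is left open, since I'm not sure what visionOS exposes.

```swift
import CoreML
import CoreVideo
import Vision

// Build a Vision request around a Core ML detection model.
// "ObjectDetector" is hypothetical; any model whose output Vision maps to
// VNRecognizedObjectObservation (e.g. a YOLO-style detector) should work.
func makeDetectionRequest() throws -> VNCoreMLRequest {
    let coreMLModel = try ObjectDetector(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)
    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        guard let observations = request.results as? [VNRecognizedObjectObservation] else { return }
        for observation in observations {
            // boundingBox is normalized to the image, origin at the bottom-left.
            print(observation.labels.first?.identifier ?? "unknown", observation.boundingBox)
        }
    }
    request.imageCropAndScaleOption = .scaleFill
    return request
}

// Run the request on a single camera frame.
func detect(in pixelBuffer: CVPixelBuffer, using request: VNCoreMLRequest) throws {
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
    try handler.perform([request])
}
```

For training, I was considering the Create ML framework on macOS, using still frames extracted from iPhone footage as the training images. Again just a sketch, assuming the standard Create ML object-detection layout (a folder of images plus an annotations.json with bounding boxes; the paths are placeholders):

```swift
import CreateML
import Foundation

// Train an object detector from an annotated image folder.
let trainingDir = URL(fileURLWithPath: "/path/to/training-data")
let dataSource = MLObjectDetector.DataSource.directoryWithImagesAndJsonAnnotation(at: trainingDir)
let detector = try MLObjectDetector(trainingData: dataSource)

// Save the trained model for use with Vision / Core ML.
try detector.write(to: URL(fileURLWithPath: "/path/to/ObjectDetector.mlmodel"))
```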

Thank you in advance for all the answers.

Replies

There is no developer access to the camera stream on visionOS, so you wouldn’t have any frames to provide to your Core ML model. Please file an enhancement request for the functionality you’d like to see using Feedback Assistant.
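
For completeness, the image recognition mentioned in the question is what visionOS does expose, through ARKit's ImageTrackingProvider. A minimal sketch, assuming a reference-image group named "AR Resources" in the app's asset catalog:

```swift
import ARKit

// Track known 2D images in the user's surroundings (supported on visionOS).
func trackReferenceImages() async throws {
    let session = ARKitSession()
    let provider = ImageTrackingProvider(
        referenceImages: ReferenceImage.loadReferenceImages(inGroupNamed: "AR Resources")
    )
    try await session.run([provider])

    // Receive anchor updates as tracked images appear, move, or disappear.
    for await update in provider.anchorUpdates {
        let anchor = update.anchor
        print("Image:", anchor.referenceImage.name ?? "unnamed",
              "tracked:", anchor.isTracked,
              "transform:", anchor.originFromAnchorTransform)
    }
}
```

This doesn't solve arbitrary object tracking, but it covers the case where the target can carry a known marker image.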