Vision API

Am I correct to understand that visionOS 27 brings the Vision API to the headset? What are the privacy restrictions on its use?

Hi @aharriscrowne

The Vision framework has been available on visionOS since visionOS 1, and visionOS 27 does not change this or remove the need for camera entitlements to run Vision on the headset's cameras. A few things to consider about where the images you pass to Vision come from:

  • To get frames from the main camera, you need an enterprise license and entitlement. See Accessing the main camera.
  • To get frames from a USB-C camera connected to Vision Pro, no entitlement or license is required. See Displaying video from connected devices.
  • To run Vision on images a person explicitly hands your app, no special entitlement is required.
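
As a rough illustration of the last bullet, here is a minimal sketch of running Vision on a still image the person has handed to your app (for example, through a photo picker). The names `StillImageScan` and `scanStillImage` are hypothetical, and the sketch assumes the text recognition and barcode detection requests are available in your visionOS deployment target. It does not touch the headset's cameras.

```swift
import CoreGraphics
import Vision

// Hypothetical container for the results of one scan.
struct StillImageScan {
    var textLines: [String]
    var barcodePayloads: [String]
}

// Runs text recognition and barcode detection on a single still image.
// The image is assumed to come from the person (picker, drag and drop, etc.).
func scanStillImage(_ image: CGImage) throws -> StillImageScan {
    let textRequest = VNRecognizeTextRequest()
    textRequest.recognitionLevel = .accurate

    let barcodeRequest = VNDetectBarcodesRequest()

    // One handler can perform several requests on the same image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([textRequest, barcodeRequest])

    // Keep the best candidate string for each recognized text region.
    let textLines = (textRequest.results ?? []).compactMap {
        $0.topCandidates(1).first?.string
    }

    // Not every barcode carries a string payload, so drop the nil ones.
    let barcodePayloads = (barcodeRequest.results ?? []).compactMap {
        $0.payloadStringValue
    }

    return StillImageScan(textLines: textLines, barcodePayloads: barcodePayloads)
}
```

`perform(_:)` is synchronous, so you would probably want to call this off the main thread for large images.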

Finally, ARKit in visionOS addresses many tasks people commonly reach for Vision to do, without needing camera access. Examples include hand tracking, plane detection, scene reconstruction / mesh, image tracking, object tracking, and world / room tracking. Depending on your use case, one of these may let you skip the camera entirely.
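
To make the ARKit route concrete, here is a minimal sketch of plane detection, one of the tasks listed above. It assumes your app has an open immersive space and that the person has granted world-sensing permission; `watchPlanes` is a hypothetical name, and the `print` calls stand in for whatever your app does with the anchors.

```swift
import ARKit

// Streams plane anchors from ARKit without any camera frame access.
func watchPlanes() async {
    // Plane detection is not supported everywhere (for example, the simulator).
    guard PlaneDetectionProvider.isSupported else {
        print("Plane detection is not supported on this device.")
        return
    }

    let session = ARKitSession()
    let planes = PlaneDetectionProvider(alignments: [.horizontal, .vertical])

    do {
        try await session.run([planes])
    } catch {
        print("ARKitSession failed to start: \(error)")
        return
    }

    // The session stays alive for as long as this loop is running.
    for await update in planes.anchorUpdates {
        switch update.event {
        case .added, .updated:
            print("Plane \(update.anchor.id): \(update.anchor.classification)")
        case .removed:
            print("Plane \(update.anchor.id) removed")
        }
    }
}
```

The other providers follow the same shape: create the provider, pass it to `ARKitSession.run(_:)`, then iterate its `anchorUpdates`.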

Thanks for the feedback; I think I misunderstood and also misspoke. Yes, Vision, as you say, has been available to run on static images, but not on objects in the room; i.e., I can’t scan a barcode or text in the space around me, or see a person’s 3D joints superimposed on them. I think when I saw the announcement about Visual Intelligence, some described it as bringing Vision to the AVP. Does Visual Intelligence on the AVP extend my capabilities as a developer?

Accepted Answer (answered by Vision Pro Engineer in 891567022)

Hey @aharriscrowne,

Visual Intelligence does extend the capabilities of your app. To integrate your app with visual intelligence and include your app’s content in search results, use the Visual Intelligence framework and App Intents.

On visionOS, Siri also has visual intelligence about the world around the user: Siri can see what the user sees. Check out Adopting App Intents to support system experiences and the AppIntents framework to make your app's actions available to the system.
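
For readers new to App Intents, here is a minimal sketch of exposing one app action to the system. The intent name, title, and behavior are hypothetical placeholders, and this shows only the general App Intents shape, not the Visual Intelligence integration itself.

```swift
import AppIntents

// Hypothetical intent that exposes a single app action to the system.
struct OpenScannerIntent: AppIntent {
    // Shown wherever the system lists this action.
    static let title: LocalizedStringResource = "Open Scanner"

    // Bring the app to the foreground when the intent runs.
    static let openAppWhenRun: Bool = true

    func perform() async throws -> some IntentResult {
        // A real app would navigate to its scanner UI here.
        return .result()
    }
}
```

Once an intent like this is part of your app target, the system can discover it at build time; which system experiences actually surface it on visionOS is something to confirm against the documentation mentioned above.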

If there are additional features of Visual Intelligence or specific app intents that you would like to see on visionOS, please file an enhancement request. Once you file the request, please post the FB number here.

If you're not familiar with how to file enhancement requests, take a look at Bug Reporting: How and Why?

Thanks,
Michael
