About the DockKit API in iOS 18

Question

Created Aug ’24

Replies 7

Boosts 0

Participants 2

We are experimenting with the DockKit API in iOS 18. However, we are unable to retrieve the speakingConfidence, lookingAtCameraConfidence, and saliencyRank for the person being tracked. We are able to get the rect and identifier. Has anyone been able to retrieve speakingConfidence, lookingAtCameraConfidence, and saliencyRank?


Answer 1

DTS Engineer OP

Apple

Aug ’24

Can you share code showing how you've configured things and how you're retrieving TrackedPerson?

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

0

Answer 2

Yuta43 OP

Aug ’24

Thank you for your reply! Here is my code snippet.

https://gist.github.com/NAOYA-MAEDA-DEV/f521f37b2e2208cdfc23da96800e93f9

0

Answer 3

DTS Engineer OP

Apple

Aug ’24

Huh. I don't see any obvious issue with that code. Does it otherwise "work"? That is, are you getting a steady series of trackingStates -> trackedPersons updates? And is the rect updating properly?

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

0

Answer 4

Yuta43 OP

Aug ’24

Upon running the operational tests, I am able to retrieve identifier and rect, but saliencyRank, lookingAtCameraConfidence, and speakingConfidence consistently remain Optional(0.0). I am testing this with only myself as the subject.

Test Result

identifier: 3472BEBF-9BF3-4057-9734-37CBCFBF18F6

rect: (0.390560653609437, 0.2636592531080226, 0.17053699440298703, 0.3014791317780524)

saliencyRank: Optional(0.0)

lookingAtCameraConfidence: Optional(0.0)

speakingConfidence: Optional(0.0)
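
One detail in this output: `Optional(0.0)` means each property holds a value of zero rather than `nil`, so something is being reported, just never a non-zero score. A minimal sketch of logging that makes this distinction explicit is below. `PersonSnapshot`, `describe`, and `logLine` are hypothetical names; the struct only mirrors the fields printed above and is not a DockKit type, and typing all three values as `Double?` is an assumption based on the printed output.

```swift
import Foundation

// Hypothetical stand-in mirroring the fields printed in the test result.
// Not a DockKit type; the Double? typing is assumed from the "Optional(0.0)" output.
struct PersonSnapshot {
    let identifier: UUID
    let saliencyRank: Double?
    let lookingAtCameraConfidence: Double?
    let speakingConfidence: Double?
}

// Renders an optional so that "missing" and "present but zero" look different in logs.
func describe(_ value: Double?) -> String {
    guard let value else { return "nil (not reported)" }
    return value == 0 ? "0.0 (reported as zero)" : String(value)
}

// Builds one log line per snapshot, unwrapping each optional explicitly.
func logLine(for person: PersonSnapshot) -> String {
    [
        "identifier: \(person.identifier.uuidString)",
        "saliencyRank: \(describe(person.saliencyRank))",
        "lookingAtCameraConfidence: \(describe(person.lookingAtCameraConfidence))",
        "speakingConfidence: \(describe(person.speakingConfidence))",
    ].joined(separator: ", ")
}
```

Logging this way shows at a glance whether a field was never populated or was populated with zero, which is useful detail to include in a bug report.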

0

Answer 5

DTS Engineer OP

Apple

Aug ’24

> Upon running the operational tests, I am able to retrieve identifier and rect, but saliencyRank, lookingAtCameraConfidence, and speakingConfidence consistently remain Optional(0.0). I am testing this with only myself as the subject.

To clarify: are you getting a steady stream of TrackedPerson structs whose rect updates match your head motion (I also assume you're actually talking)? My concern is that speakingConfidence and lookingAtCameraConfidence are "lagging" indicators*, so if you're only processing the first update, that would explain why you're getting 0.

*Meaning, the system has to determine the rect location before it determines secondary properties like speakingConfidence/lookingAtCameraConfidence, not that they're necessarily "slow". This isn't guaranteed/reliable, but I suspect you'll often/always get "rect only" as the first update simply because the system is delivering "what it can" as quickly as possible.

If you're getting that steady update stream, then please file a bug that includes the test project you're using, then post the bug number here once it's filed.
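
One way to rule out the "first update only" case is to record every update and check the later ones. The following is a minimal sketch under stated assumptions: `ConfidenceSample`, `StreamSummary`, and `summarize` are hypothetical names, and the sample struct is a stand-in for values copied out of each tracking update, not a DockKit type.

```swift
// Hypothetical stand-in for the two confidence values read from one tracking update.
struct ConfidenceSample {
    let speakingConfidence: Double?
    let lookingAtCameraConfidence: Double?
}

// Result of scanning a recorded run of updates.
struct StreamSummary: Equatable {
    let updateCount: Int
    let sawNonZeroAfterFirst: Bool
}

// Counts the updates and checks whether either confidence ever rose above zero
// once the first update (which may carry only the rect) is skipped.
func summarize(_ samples: [ConfidenceSample]) -> StreamSummary {
    let later = samples.dropFirst()
    let sawNonZero = later.contains { sample in
        (sample.speakingConfidence ?? 0) > 0
            || (sample.lookingAtCameraConfidence ?? 0) > 0
    }
    return StreamSummary(updateCount: samples.count, sawNonZeroAfterFirst: sawNonZero)
}
```

If a run of many updates still yields `sawNonZeroAfterFirst == false` while the subject is talking and facing the camera, that points away from the lagging-indicator explanation and is worth noting in the bug report.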

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

0

Answer 6

Yuta43 OP

Aug ’24

Thank you for your response. Do the values of 'speakingConfidence' and 'lookingAtCameraConfidence' change even when there is only one subject in the video frame? Or is it necessary to have two or more people conversing? I am conducting the tests with only myself in the video frame.

0

Answer 7

DTS Engineer OP

Apple

Aug ’24

> Thank you for your response. Do the values of 'speakingConfidence' and 'lookingAtCameraConfidence' change even when there is only one subject in the video frame? Or is it necessary to have two or more people conversing? I am conducting the tests with only myself in the video frame.

It's certainly worth testing* just in case that's a factor here, but yes, I'd expect them to work even when there is only one subject.

*I'd test it myself but, ironically, I happen to be waiting for my new test dock to arrive.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

0