We are experimenting with the DockKit API in iOS 18. However, we are unable to retrieve the speakingConfidence, lookingAtCameraConfidence, and saliencyRank for the person being tracked. We are able to get the rect and identifier. Has anyone been able to retrieve speakingConfidence, lookingAtCameraConfidence, and saliencyRank?
About the DockKit API in iOS 18
Can you share code showing how you've configured things and how you're retrieving TrackedPerson?
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
Thank you for your reply! Here is my code snippet.
https://gist.github.com/NAOYA-MAEDA-DEV/f521f37b2e2208cdfc23da96800e93f9
Huh. I don't see any obvious issue with that code. Does it otherwise "work"? That is, are you getting a steady series of trackingStates -> trackedPersons updates? And is the rect updating properly?
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
Upon running the operational tests, I am able to retrieve identifier and rect, but saliencyRank, lookingAtCameraConfidence, and speakingConfidence consistently remain Optional(0.0). I am testing this with only myself as the subject.
Test Result
identifier: 3472BEBF-9BF3-4057-9734-37CBCFBF18F6
rect: (0.390560653609437, 0.2636592531080226, 0.17053699440298703, 0.3014791317780524)
saliencyRank: Optional(0.0)
lookingAtCameraConfidence: Optional(0.0)
speakingConfidence: Optional(0.0)
To clarify: are you getting a steady stream of TrackedPerson structs whose rect updates match your head motion? (I also assume you're actually talking.) My concern is that speakingConfidence and lookingAtCameraConfidence are "lagging" indicators*, so if you're only processing the first update, that would explain why you're getting 0.
*Meaning the system has to determine the rect location before it can determine secondary properties like speakingConfidence and lookingAtCameraConfidence, not that they're necessarily "slow". This isn't guaranteed or reliable, but I suspect you'll often (or always) get a rect-only first update simply because the system delivers what it can as quickly as possible.
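One way to tell "lagging" apart from "never populated" is to record the confidence values from every update and ignore the first one or two. The following is only a rough sketch of that check and does not call DockKit at all: PersonSample, SecondaryPropertyStatus, secondaryPropertyStatus(for:warmUpCount:), and the warm-up count of 1 are all hypothetical names and values chosen for illustration. You would fill PersonSample from whatever your own tracking loop receives.

```swift
import Foundation

// Hypothetical stand-in for the two confidence values copied out of each
// tracked-person update. This is NOT a DockKit type.
struct PersonSample {
    let speakingConfidence: Double?
    let lookingAtCameraConfidence: Double?
}

// Hypothetical result type for the check below.
enum SecondaryPropertyStatus: Equatable {
    case notEnoughUpdates   // only warm-up updates were seen, so no conclusion yet
    case populated          // at least one later update carried a non-zero confidence
    case stuckAtZero        // every later update was nil or 0
}

// Skips the first `warmUpCount` samples (assumed to be possibly rect-only)
// and reports whether any later sample has a confidence greater than zero.
func secondaryPropertyStatus(for samples: [PersonSample],
                             warmUpCount: Int = 1) -> SecondaryPropertyStatus {
    let settled = samples.dropFirst(max(warmUpCount, 0))
    guard !settled.isEmpty else { return .notEnoughUpdates }

    let anyPopulated = settled.contains { sample in
        (sample.speakingConfidence ?? 0) > 0
            || (sample.lookingAtCameraConfidence ?? 0) > 0
    }
    return anyPopulated ? .populated : .stuckAtZero
}
```

If this reports stuckAtZero after a long run of updates while you are talking and facing the camera, that points to something other than first-update lag.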
If you're getting that steady update stream, then please file a bug that includes the test project you're using, then post the bug number here once it's filed.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
Thank you for your response. Do the values of 'speakingConfidence' and 'lookingAtCameraConfidence' change even when there is only one subject in the video frame? Or is it necessary to have two or more people conversing? I am conducting the tests with only myself in the video frame.
It's certainly worth testing* just in case that's a factor here, but yes, I'd expect them to work even when there is only one subject.
*I'd test it myself but, ironically, I happen to be waiting for my new test dock to arrive.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware