I'm trying to track identified objects in real time with bounding rects, with no 3D integration, but I still get poor update performance. I'm trying to understand how to optimize frame updates. (I'm a new programmer.) Using: Foundation, AVFoundation, ARKit, CoreVideo, Vision, CoreImage, with YOLOE-11s object detection currently throttled to 2 fps. (Target: iOS, testing on an iPhone 16 Pro.)
- YOLOE-11S CoreML model detects objects with class labels + bounding boxes
- Labels are matched against ObjectCatalog.json for relevance
- Matched objects are promoted from blue (detected) to green (identified)
Log warnings:
ARSession <0x110afdb80>: The delegate of ARSession is retaining 13 ARFrames. The camera will stop delivering camera images if the delegate keeps holding on to too many ARFrames. This could be a threading or memory management issue in the delegate and should be fixed.
Skipping integration due to poor slam at time: 619447.208339 vio_initialized(1) map_size(0) tracking_state_is_nominal(0) is_3dof(0) reinitialize_attempts(6) slam_mode(RegularSLAM)
I'm trying to track identified objects in real time with bounding rects, with no 3D integration, but I still get poor update performance.
If the objects to be tracked are known beforehand, you can use ML-based object tracking in iOS 27 to track them: https://developer.apple.com/documentation/visionos/implementing-object-tracking-in-your-app
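Separately, the "retaining 13 ARFrames" warning in the question usually means frames are being queued or captured for inference faster than they are released. Here is a minimal sketch of one common way around that, assuming detection runs from `session(_:didUpdate:)`: let only one frame through at a time, keep just the `CVPixelBuffer` rather than the `ARFrame`, and drop everything else. `FrameGate`, `DetectionDelegate`, `onResults`, the 10 fps cap, and the `.right` orientation are all illustrative assumptions, not part of ARKit or of the original project.

```swift
import Foundation

/// Hypothetical helper (not an Apple API): admits at most one frame at a time
/// and enforces a minimum interval between accepted frames.
final class FrameGate {
    private let lock = NSLock()
    private var busy = false
    private var lastAccepted: TimeInterval = -Double.infinity
    private let minInterval: TimeInterval

    init(maxFPS: Double) {
        minInterval = 1.0 / maxFPS
    }

    /// Returns true if the caller may process the frame with this timestamp.
    func tryBegin(at timestamp: TimeInterval) -> Bool {
        lock.lock(); defer { lock.unlock() }
        guard !busy, timestamp - lastAccepted >= minInterval else { return false }
        busy = true
        lastAccepted = timestamp
        return true
    }

    /// Call when processing of the accepted frame has finished.
    func finish() {
        lock.lock(); defer { lock.unlock() }
        busy = false
    }
}

#if canImport(ARKit)
import ARKit
import Vision

final class DetectionDelegate: NSObject, ARSessionDelegate {
    private let gate = FrameGate(maxFPS: 10)  // assumed cap; tune for your model
    private let queue = DispatchQueue(label: "detection", qos: .userInitiated)
    private let request: VNCoreMLRequest

    /// Hypothetical callback, invoked on the main queue with raw observations.
    var onResults: (([VNObservation]) -> Void)?

    init(model: VNCoreMLModel) {
        let coreMLRequest = VNCoreMLRequest(model: model)
        coreMLRequest.imageCropAndScaleOption = .scaleFill
        request = coreMLRequest
        super.init()
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // Drop the frame immediately if inference is still running or it is too soon.
        guard gate.tryBegin(at: frame.timestamp) else { return }

        // Keep only the pixel buffer; the ARFrame itself is not captured by the closure.
        let pixelBuffer = frame.capturedImage

        queue.async { [weak self, gate, request] in
            defer { gate.finish() }
            // .right assumes a portrait UI with the back camera.
            let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer,
                                                orientation: .right)
            try? handler.perform([request])
            // The observation type depends on how the model was exported
            // (for example VNRecognizedObjectObservation when NMS is built in).
            let observations = request.results ?? []
            DispatchQueue.main.async { self?.onResults?(observations) }
        }
    }
}
#endif
```

With a gate like this, at most one camera buffer is held at any moment, so the detection rate is bounded by how long inference takes rather than by a fixed timer. Anything else that stores an `ARFrame` (an array, a property, a closure that captures `frame`) would still need to be removed for the warning to go away.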