Hi Apple Developer Community,
I'm developing an eye-tracking application in Xcode that uses ARKit's ARFaceTrackingConfiguration and ARFaceAnchor.blendShapes for gaze detection. I'm experiencing several calibration and accuracy issues and would appreciate insights from the community.
Current Implementation
- Using ARFaceAnchor.blendShapes (.eyeLookUpLeft, .eyeLookDownLeft, .eyeLookInLeft, .eyeLookOutLeft, etc.)
- Implementing custom sensitivity curves and smoothing algorithms
- Applying baseline correction and coordinate mapping
- Using quadratic regression for calibration point mapping (a simplified sketch of the overall mapping follows below)
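To make this concrete, here is a stripped-down sketch of the mapping. The real code layers smoothing, baseline correction, and sensitivity curves on top, and the sign convention below is an assumption I'm still second-guessing because the front camera feed is mirrored:

```swift
// Simplified sketch of my blend-shape -> gaze-offset mapping. The real code adds
// smoothing, baseline correction, and sensitivity curves; the sign convention
// below is an assumption and may need flipping (the front camera is mirrored).
import ARKit
import UIKit

func gazeOffset(from anchor: ARFaceAnchor) -> CGPoint {
    func value(_ key: ARFaceAnchor.BlendShapeLocation) -> CGFloat {
        CGFloat(anchor.blendShapes[key]?.doubleValue ?? 0)
    }

    // Horizontal: positive = toward the user's right.
    let x = (value(.eyeLookInLeft)  + value(.eyeLookOutRight)
           - value(.eyeLookOutLeft) - value(.eyeLookInRight)) / 2

    // Vertical: positive = up.
    let y = (value(.eyeLookUpLeft)   + value(.eyeLookUpRight)
           - value(.eyeLookDownLeft) - value(.eyeLookDownRight)) / 2

    return CGPoint(x: x, y: y)   // roughly within [-1, 1] per axis
}

// Per-axis quadratic calibration: screen = a*v^2 + b*v + c, with a/b/c
// fitted by least squares from the calibration-point samples.
struct QuadraticCalibration {
    var a: CGFloat, b: CGFloat, c: CGFloat
    func map(_ v: CGFloat) -> CGFloat { a * v * v + b * v + c }
}
```

The calibration screen collects samples while the user looks at known target points and fits a, b, c per axis from those samples.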
Issues I'm Facing
1. Calibration Mismatch
- Red dot position doesn't align with where I'm actually looking
- Significant offset between intended gaze point and actual cursor position
- Calibration seems to drift or become inaccurate over time
2. Extreme Eye Movement Requirements
- Need to make exaggerated eye movements to reach screen edges/corners
- Natural eye movements don't translate to proportional cursor movement
- Difficulty reaching certain screen regions even with calibration
3. Sensitivity and Stability Issues
- Cursor jitters or jumps around when looking at center
- Too much sensitivity to micro-movements, even with the smoothing sketched after this list
- Inconsistent behavior between calibration and normal operation
4. Head Movement Dependence
- Tracking on both the calibration screen and the reading screen works noticeably better when there is head movement
- I want to minimize head movement: the goal is tracking from natural eye movement alone while reading an ebook
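Regarding issue 3, this is roughly the smoothing I'm applying today: a basic exponential low-pass plus a dead zone. The alpha and dead-zone values are just numbers I'm experimenting with, not recommendations:

```swift
// Roughly my current smoothing: an exponential low-pass plus a dead zone on
// the mapped screen point. alpha and deadZone are values I'm still tuning.
import UIKit

final class GazeSmoother {
    private var last: CGPoint?
    var alpha: CGFloat = 0.15   // lower = smoother but laggier
    var deadZone: CGFloat = 8   // ignore movements smaller than this (points)

    func smooth(_ point: CGPoint) -> CGPoint {
        guard let previous = last else {
            last = point
            return point
        }
        let dx = point.x - previous.x
        let dy = point.y - previous.y
        // Swallow micro-movements entirely; this kills some jitter but also
        // makes small, legitimate saccades feel sticky.
        if (dx * dx + dy * dy).squareRoot() < deadZone {
            return previous
        }
        let next = CGPoint(x: previous.x + alpha * dx,
                           y: previous.y + alpha * dy)
        last = next
        return next
    }
}
```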
Primary Question: Word-Level Eye Tracking Feasibility
Is word-level eye tracking (tracking gaze as users read through individual words in an ebook) technically feasible with current iPhone/iPad hardware?
I understand that Apple's built-in eye tracking is primarily an accessibility feature for UI navigation. However, I'm wondering if the TrueDepth camera and ARKit's eye tracking capabilities are sufficient for:
- Tracking natural reading patterns (left-to-right, line-by-line progression)
- Detecting which specific words a user is looking at (mapping a screen point to a word is sketched after this list; gaze precision is the open question)
- Maintaining accuracy for sustained reading sessions (15-30 minutes)
- Working reliably across different users and lighting conditions
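For context, the word lookup itself isn't the hard part. Assuming the ebook page is rendered in a UITextView (a hypothetical setup; a real reader might use a different text stack), something like the following resolves the word under a gaze point. The uncertainty is entirely in whether the gaze point can be accurate to within one word:

```swift
// Hypothetical helper, assuming the ebook page is a UITextView: map a gaze
// point (in the text view's coordinate space) to the word under it. A real
// reader would cache word ranges instead of enumerating on every lookup.
import UIKit

func word(at point: CGPoint, in textView: UITextView) -> String? {
    // Convert into text-container coordinates.
    var location = point
    location.x -= textView.textContainerInset.left
    location.y -= textView.textContainerInset.top

    let index = textView.layoutManager.characterIndex(
        for: location,
        in: textView.textContainer,
        fractionOfDistanceBetweenInsertionPoints: nil)

    let text = textView.text as NSString
    guard index < text.length else { return nil }

    // Find the word whose range contains the hit character index.
    var result: String?
    text.enumerateSubstrings(in: NSRange(location: 0, length: text.length),
                             options: .byWords) { substring, range, _, stop in
        if NSLocationInRange(index, range) {
            result = substring
            stop.pointee = true
        }
    }
    return result
}
```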
Questions for the Community
- Hardware Limitations: Are iPhone/iPad TrueDepth cameras capable of the precision needed for word-level tracking, or is this beyond current hardware capabilities?
- Calibration Best Practices: What calibration strategies have worked best for accurate gaze mapping? How many calibration points are typically needed?
- Reading-Specific Challenges: Are there particular challenges when tracking reading behavior vs. general gaze tracking?
- Alternative Approaches: Are there better approaches than ARKit blend shapes for this use case?
Current Setup
- Device: iPhone 14 Pro
- iOS Version: iOS 18.3
- ARKit Version: Latest available
Any insights, experiences, or technical guidance would be greatly appreciated. I'm particularly interested in hearing from developers who have worked on similar eye tracking applications or have experience with the limitations and capabilities of ARKit's eye tracking features.
Thank you for your time and expertise!