ARKit ARCamera.intrinsics changes between frames: what can affect fx/fy/cx/cy?

Hello, I am doing a personal research project around ARKit camera frames and camera calibration data. For each accepted ARFrame, I export:

ARFrame.timestamp ARFrame.camera.imageResolution ARFrame.camera.intrinsics ARFrame.capturedImage

The target video mode is 4K / 30 FPS.

In my test captures, I noticed two things: ARFrame.camera.intrinsics is not constant across the recording.

Some neighboring frames from ARFrame.capturedImage have noticeably different sharpness: one frame can look sharp, while the next frame can look blurred. Regarding intrinsics, I observe small per-frame changes in fx/fy/cx/cy.

For example, in one session:

fx/fy changed by approximately 36 px

cx changed by approximately 1.5 px

cy changed by approximately 2.3 px

I understand the standard meaning of the intrinsic matrix:

fx / fy = focal length in pixels

cx / cy = principal point in pixels relative to the image/reference frame

My main question is not about the definition of these values, but about how to correctly interpret their changes over time. In particular, I am interested in cx/cy, because the principal point is important for measurements. I would like to understand whether a change in cx/cy corresponds to a meaningful update of the camera model for the current frame, and what physical or camera-pipeline process may cause this change.

Questions about ARFrame.camera.intrinsics:

Are ARFrame.camera.intrinsics values expected to change between frames during a single ARWorldTrackingConfiguration session?

If fx/fy changes over time, can this reflect autofocus / focus breathing, internal camera calibration updates, digital crop/scaling, stabilization-related mapping, or other camera pipeline changes?

If cx/cy changes over time, should I interpret this as an updated principal point for the current ARFrame image/reference frame?

Does ARKit update cx/cy to account for any internal crop, scaling, stabilization, virtual camera behavior, lens movement, or calibration changes?

Is there a known physical interpretation for small cx/cy changes reported by ARFrame.camera.intrinsics, or should these values simply be treated as ARKit’s best current camera model for that frame?

In an ARKit-first pipeline where focus/exposure/stabilization are managed by ARKit/camera pipeline, is the recommended approach to store ARFrame.camera.intrinsics per frame together with ARFrame.timestamp and ARFrame.camera.imageResolution?

Is there any public API that exposes the reason for intrinsics changes, such as focus position, stabilization transform, crop transform, active camera constituent, lens movement, or per-frame calibration update reason?

I also have a related question about ARFrame.capturedImage frame quality.

In some recordings, neighboring frames can have noticeably different sharpness. For example, one frame can look sharp, while the next frame can be visibly blurred, even though the scene and camera movement are continuous.

Questions about frame sharpness / camera pipeline behavior: Is frame-to-frame sharpness variation expected when using ARFrame.capturedImage as the video source?

Can ARKit change focus, exposure duration, ISO, white balance, crop/scaling, or other camera pipeline parameters between frames during an ARWorldTrackingConfiguration session?

Is there any public API to inspect per-frame exposure duration, ISO, focus position, lens position, stabilization state, crop/scaling transform, or other capture parameters for ARFrame.capturedImage?

If some frames are sharp and neighboring frames are blurred, should this usually be interpreted as motion blur / exposure behavior / autofocus behavior, or can ARKit internal processing also affect this?

Is there any recommended way in an ARKit-first pipeline to reduce frame-to-frame sharpness inconsistency, or is the only reliable approach to use an AVFoundation-first capture pipeline with locked focus/exposure/ISO/white balance?

The main goal of my research is to understand how to correctly interpret ARKit-provided intrinsics over time and how much of the camera pipeline behavior is observable through public ARKit APIs.

Since fx/fy/cx/cy are important for measurements, I want to know whether treating ARFrame.camera.intrinsics as the Apple-delivered per-frame camera model is the safest approach.

Thank you.

ARKit ARCamera.intrinsics changes between frames: what can affect fx/fy/cx/cy?
 
 
Q