Image understanding with the on-device model

I can’t seem to find a way to include an image when prompting the new on-device model in Xcode, even though Apple explicitly states that the model was trained and tested with image data (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates).

Has anyone managed to get this working, or are VLM-style capabilities simply not exposed yet?
