Image understanding with the on-device model

I can’t seem to find a way to include an image when prompting the new on-device model in Xcode, even though Apple explicitly states that the model was trained and tested with image data (https://machinelearning.apple.com/research/apple-foundation-models-2025-updates).

Has anyone managed to get this working, or are VLM-style capabilities simply not exposed yet?

Hi @1729k, Foundation Models does not currently support images as input. Depending on your app's needs, though, you could pair it with Vision, a computer vision framework that also runs on-device.

Best,

-J
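
For anyone landing here later, the pairing suggested above could look roughly like the sketch below: Vision extracts text from the image, and that text is handed to the on-device model as an ordinary string prompt. This is only a sketch under assumptions. The helper names `recognizeText(in:)` and `summarizeText(in:)` are hypothetical, the prompt wording is made up, and OCR is just one of several Vision requests you might choose. It does not make the model "see" the image; the model only receives whatever Vision was able to extract.

```swift
import Foundation
import Vision
import FoundationModels

/// Hypothetical helper: runs Vision's on-device OCR and returns the recognized lines.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Keep only the top candidate string for each detected text region.
    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}

/// Hypothetical helper: passes the extracted text to the on-device model as plain text.
/// Assumes an OS version and device where Foundation Models is available.
@available(iOS 26.0, macOS 26.0, *)
func summarizeText(in image: CGImage) async throws -> String {
    let lines = try recognizeText(in: image)
    guard !lines.isEmpty else { return "No text found in the image." }

    let extracted = lines.joined(separator: "\n")
    let session = LanguageModelSession()

    // The model never receives the image itself, only Vision's text output.
    let response = try await session.respond(
        to: "Summarize the following text extracted from a photo:\n\(extracted)"
    )
    return response.content
}
```

Calling `try await summarizeText(in: someCGImage)` from an async context returns the model's summary as a `String`. A real app would likely also check that the model is available on the device before creating a session, and would run the Vision request off the main thread.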
