I'm testing the Foundation Models framework with a health-focused recipe generation app. The on-device approach is appealing, but performance is rough: it takes 20+ seconds just to get a recipe name and description. The same content from the Claude API takes 4 seconds.
I know it's beta and on-device has different tradeoffs, but this is approaching unusable territory for a real-time user experience. Streaming helps psychologically but doesn't mask the underlying latency. The privacy and cost benefits are compelling, but not if users abandon the feature before it completes.
Anyone else seeing similar performance? Is this expected for beta, or are there optimization techniques I'm missing?