MPS SDPA Attention Kernel Regression on A14-class (M1) in macOS 26.3.1 — Works on A15+ (M2+)

Question

Created Mar ’26

Summary

Since updating to macOS 26, our Core ML / MPS inference pipeline produces incorrect results on Mac mini M1 (Macmini9,1, A14-class SoC). The same model and code run correctly on M2 and newer (A15-class and up). The regression appears to be in the Scaled Dot-Product Attention (SDPA) kernel path of the MPS backend.

Environment

Affected: Mac mini M1, Macmini9,1 (A14-class)
Not affected: M2 and newer (A15-class and up)
Last known good: macOS Sequoia
First broken: macOS 26 (Tahoe)?
Confirmed broken on: macOS 26.3.1
Framework: Core ML + MPS backend
Language: C++ (via CoreML C++ API)

Description

We ship an audio processing application (VoiceAssist by NoiseWorks) that runs a deep learning model (based on the Demucs architecture) via Core ML with the MPS compute unit. On macOS Sequoia this works correctly on all Apple Silicon Macs, including M1.

After updating to macOS 26 (Tahoe), inference on M1 Macs fails, either producing garbage output or crashing. The same binary, the same .mlpackage, and the same inputs work correctly on M2+.

Our Apple contact has suggested the root cause is a regression in the A14-specific MPS SDPA attention kernel, which may have broken when the Metal/MPS stack was updated in macOS 26.

The model makes heavy use of attention layers, and the failure correlates precisely with the SDPA path being exercised on A14 hardware.
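For readers unfamiliar with the operation, SDPA computes softmax(Q·Kᵀ / sqrt(d))·V. The following is a minimal single-head CPU sketch in plain C++ that could serve as a ground truth when checking a suspect GPU result. It is not the MPS kernel and not part of our pipeline; the function name `sdpa_reference` and the row-major buffer layout are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Reference scaled dot-product attention for a single head (hypothetical helper).
// All buffers are row-major: q is [n_q x d], k is [n_k x d], v is [n_k x d_v].
// Returns a row-major [n_q x d_v] buffer.
std::vector<float> sdpa_reference(const std::vector<float>& q,
                                  const std::vector<float>& k,
                                  const std::vector<float>& v,
                                  std::size_t n_q, std::size_t n_k,
                                  std::size_t d, std::size_t d_v) {
    assert(n_k > 0 && d > 0);
    assert(q.size() == n_q * d);
    assert(k.size() == n_k * d);
    assert(v.size() == n_k * d_v);

    const double scale = 1.0 / std::sqrt(static_cast<double>(d));
    std::vector<float> out(n_q * d_v, 0.0f);
    std::vector<double> weights(n_k);
    std::vector<double> acc(d_v);

    for (std::size_t i = 0; i < n_q; ++i) {
        // Scaled scores: (q_i . k_j) / sqrt(d)
        double max_score = -std::numeric_limits<double>::infinity();
        for (std::size_t j = 0; j < n_k; ++j) {
            double dot = 0.0;
            for (std::size_t c = 0; c < d; ++c) {
                dot += static_cast<double>(q[i * d + c]) * k[j * d + c];
            }
            weights[j] = dot * scale;
            max_score = std::max(max_score, weights[j]);
        }
        // Numerically stable softmax: subtract the row maximum before exp.
        double denom = 0.0;
        for (std::size_t j = 0; j < n_k; ++j) {
            weights[j] = std::exp(weights[j] - max_score);
            denom += weights[j];
        }
        // Weighted sum of the value rows.
        std::fill(acc.begin(), acc.end(), 0.0);
        for (std::size_t j = 0; j < n_k; ++j) {
            const double w = weights[j] / denom;
            for (std::size_t c = 0; c < d_v; ++c) {
                acc[c] += w * v[j * d_v + c];
            }
        }
        for (std::size_t c = 0; c < d_v; ++c) {
            out[i * d_v + c] = static_cast<float>(acc[c]);
        }
    }
    return out;
}
```

Accumulating in double keeps the reference itself from contributing noticeable rounding error, so any large deviation in a GPU result points at the GPU path.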

Steps to Reproduce

1. Load a Core ML model that uses Scaled Dot-Product Attention (e.g. a transformer or attention-based audio model).
2. Run inference with MLComputeUnits::cpuAndGPU (MPS active).
3. Run on Mac mini M1 (Macmini9,1) with macOS 26.3.1.
4. Compare output to the same model running on M2 / macOS Sequoia.

Expected: Correct inference output, consistent with M2+ and macOS Sequoia behavior

Actual: Incorrect / corrupted output (or crash), only on A14-class hardware running macOS 26+
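The comparison in the last step can be made concrete with a small helper. This is a sketch under assumptions: both outputs have already been copied into flat float buffers of equal length, the reference buffer contains only finite values, and the names `DiffStats`, `compare_outputs`, and `outputs_match` are hypothetical. The tolerance is something each project has to pick for its own model.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary of how far a candidate output deviates from a reference output.
struct DiffStats {
    std::size_t non_finite = 0;   // NaN or Inf values found in the candidate
    double max_abs_error = 0.0;   // largest |candidate - reference| over finite values
    double rms_error = 0.0;       // root-mean-square error over finite values
};

// Compare a candidate buffer (e.g. M1 output) against a reference buffer
// (e.g. M2 or CPU-only output). The reference is assumed to be finite.
DiffStats compare_outputs(const std::vector<float>& candidate,
                          const std::vector<float>& reference) {
    assert(candidate.size() == reference.size());
    DiffStats stats;
    double sum_sq = 0.0;
    std::size_t finite_count = 0;
    for (std::size_t i = 0; i < candidate.size(); ++i) {
        if (!std::isfinite(candidate[i])) {
            ++stats.non_finite;   // counted separately, excluded from the error terms
            continue;
        }
        const double err = std::fabs(static_cast<double>(candidate[i]) -
                                     static_cast<double>(reference[i]));
        stats.max_abs_error = std::max(stats.max_abs_error, err);
        sum_sq += err * err;
        ++finite_count;
    }
    if (finite_count > 0) {
        stats.rms_error = std::sqrt(sum_sq / static_cast<double>(finite_count));
    }
    return stats;
}

// A run passes only if it has no NaN/Inf and stays within the tolerance.
bool outputs_match(const DiffStats& stats, double tolerance) {
    return stats.non_finite == 0 && stats.max_abs_error <= tolerance;
}
```

Reporting non-finite values separately is deliberate: "garbage output" often shows up as NaN or Inf, which a plain maximum-error check can silently skip.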

Workaround

Forcing MLComputeUnits::cpuOnly bypasses MPS entirely and produces correct output on M1, confirming the issue is in the MPS compute path. This is not acceptable as a shipping workaround due to performance impact.
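If the CPU-only path has to ship as a stopgap, it can at least be limited to the affected combination so other machines keep the GPU path. The sketch below shows only the decision logic and rests on several assumptions: the CPU brand string and OS version are obtained elsewhere (for example from the `machdep.cpu.brand_string` sysctl and the system version), every chip whose brand string is "Apple M1" or starts with "Apple M1 " is treated as affected, the broken range is taken to be macOS 26.0 up to but not including 26.4 (the release reported as fixed in the reply further down this thread), and the `ComputeUnits` enum is a hypothetical stand-in for the real Core ML setting.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Core ML compute-unit setting used by the app.
enum class ComputeUnits { CpuOnly, CpuAndGpu };

struct OsVersion {
    int major_version;
    int minor_version;
};

// Assumption: any "Apple M1" family brand string counts as A14-class.
bool is_m1_family(const std::string& cpu_brand) {
    const std::string prefix = "Apple M1";
    if (cpu_brand.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    // Accept "Apple M1" and "Apple M1 <suffix>", reject e.g. "Apple M10".
    return cpu_brand.size() == prefix.size() || cpu_brand[prefix.size()] == ' ';
}

// Assumption: macOS 26.0 through 26.3.x is affected, 26.4 and later is fixed.
bool is_affected_os(OsVersion os) {
    return os.major_version == 26 && os.minor_version < 4;
}

// Fall back to the CPU only on the affected hardware and OS combination.
ComputeUnits choose_compute_units(const std::string& cpu_brand, OsVersion os) {
    if (is_m1_family(cpu_brand) && is_affected_os(os)) {
        return ComputeUnits::CpuOnly;
    }
    return ComputeUnits::CpuAndGpu;
}
```

For example, `choose_compute_units("Apple M1", {26, 3})` selects the CPU-only path, while the same chip on macOS 26.4 or an M2 on macOS 26.3 keeps the GPU path.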

Additional Notes

The failure is hardware-specific (A14 only) and OS-specific (macOS 26+), pointing to a kernel-level regression rather than a model or app bug.
We first became aware of this through a customer report.
Happy to provide a symbolicated crash log if helpful.
This text was summarized by AI and verified by a human.

Answer 1

nwaudio OP

Mar ’26

I am not able to submit this using Feedback Assistant:

"There was an error trying to submit your feedback. Please try again later."

Answer 2

nwaudio OP

Apr ’26

Fixed in macOS 26.4.