CarPlay: iPhone media does not reliably resume after short SFSpeechRecognizer capture with AVAudioSession record/measurement

Post Title: CarPlay: iPhone-origin media does not reliably resume after short SFSpeechRecognizer capture

Post Body: We are testing short, user-initiated speech recognition in a CarPlay driving-task app.

The capture is started by an explicit button tap, lasts up to about 4 seconds, and uses SFSpeechRecognizer, SFSpeechAudioBufferRecognitionRequest, and AVAudioEngine. There is no wake word, no continuous recording, and no background listening.

Test setup:

  • iPhone: iPhone 17
  • iOS: 26.5.2
  • CarPlay: wireless
  • Vehicle / head unit: Volkswagen Discover Media
  • Also tested in a second vehicle with wireless CarPlay
  • Media tested: Apple Music and online radio through CarPlay
  • Also tested: vehicle DAB+ / normal car radio

Observed behavior:

  • Native speech recognition succeeds.
  • Speech is recognized correctly.
  • There is no loud playback through the vehicle speakers.
  • During the short capture, the audio route changes to CarPlay / CarAudio input.
  • After capture, vehicle DAB+ / normal car radio resumes correctly.
  • iPhone-origin media, including Apple Music and online radio apps playing through CarPlay, does not reliably resume.
  • In Apple Music, playback may appear to advance or change tracks, but no audio is actually played until the user intervenes.
  • Online radio through CarPlay remains stopped after capture.
  • This behavior was reproduced in two vehicles with wireless CarPlay.

Current AVAudioSession configuration during capture:

category: AVAudioSession.Category.record
mode: AVAudioSession.Mode.measurement
options: [.mixWithOthers]

During capture, diagnostics show approximately:

category: AVAudioSessionCategoryRecord
mode: AVAudioSessionModeMeasurement
inputNumberOfChannels: 1
outputNumberOfChannels: 0
currentRoute.inputs: CarPlay / CarAudio
currentRoute.outputs: none
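For anyone who wants to compare values, the fields above can be read from the shared session roughly as follows. This is a minimal sketch rather than the exact logging code from the app; `audioSessionSnapshot` is a hypothetical helper name, and the output format is only illustrative.

```swift
import AVFoundation

/// Hypothetical helper: builds a one-line snapshot of the shared audio
/// session so it can be logged before, during, and after capture.
func audioSessionSnapshot() -> String {
    let session = AVAudioSession.sharedInstance()

    // Describe each port as "name [portType]".
    let inputs = session.currentRoute.inputs
        .map { "\($0.portName) [\($0.portType.rawValue)]" }
        .joined(separator: ", ")
    let outputs = session.currentRoute.outputs
        .map { "\($0.portName) [\($0.portType.rawValue)]" }
        .joined(separator: ", ")

    return [
        "category: \(session.category.rawValue)",
        "mode: \(session.mode.rawValue)",
        "inputNumberOfChannels: \(session.inputNumberOfChannels)",
        "outputNumberOfChannels: \(session.outputNumberOfChannels)",
        "currentRoute.inputs: \(inputs.isEmpty ? "none" : inputs)",
        "currentRoute.outputs: \(outputs.isEmpty ? "none" : outputs)"
    ].joined(separator: " | ")
}
```

Calling `print(audioSessionSnapshot())` right after `setActive(true)` and again after teardown makes it easy to see whether the route and channel counts return to their pre-capture state.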

After capture, the app stops and releases audio resources:

  • Stops AVAudioEngine
  • Removes the input tap
  • Ends the recognition request
  • Cancels the recognition task
  • Calls setActive(false, options: [.notifyOthersOnDeactivation])
  • Restores a passive configuration (category .ambient, mode .default)

Relevant Swift extract:

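let session = AVAudioSession.sharedInstance()

// Before capture: configure and activate the session.
try session.setCategory(
    .record,
    mode: .measurement,
    options: [.mixWithOthers]
)

try session.setActive(true, options: [])

// ... capture runs for up to about 4 seconds ...

// After capture: stop the engine and release audio resources.
if audioEngine.isRunning {
    audioEngine.stop()
}

audioEngine.inputNode.removeTap(onBus: 0)

recognitionRequest?.endAudio()
recognitionRequest = nil

recognitionTask?.cancel()
recognitionTask = nil

// Deactivate and notify other audio apps so they can resume.
try session.setActive(
    false,
    options: [.notifyOthersOnDeactivation]
)

// Restore a passive configuration.
try session.setCategory(.ambient, mode: .default, options: [])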
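To help narrow down what happens between deactivation and the failed resume, route changes and interruptions seen by our own session can be logged around the capture. The following is a minimal sketch under stated assumptions: `AudioSessionEventLogger` is a hypothetical class, not part of the app as described above, and it only reports events delivered to this app's session, not what the media app receives.

```swift
import AVFoundation

/// Hypothetical helper: logs route changes and interruptions for the
/// shared audio session while a capture is being tested.
final class AudioSessionEventLogger {
    private var tokens: [NSObjectProtocol] = []

    func start() {
        let center = NotificationCenter.default
        let session = AVAudioSession.sharedInstance()

        // Log the raw route change reason; the values map to
        // AVAudioSession.RouteChangeReason.
        tokens.append(center.addObserver(
            forName: AVAudioSession.routeChangeNotification,
            object: session,
            queue: .main
        ) { note in
            let raw = note.userInfo?[AVAudioSessionRouteChangeReasonKey] as? UInt
            print("route change, reason raw value:", raw.map(String.init) ?? "unknown")
        })

        // Log whether an interruption began or ended.
        tokens.append(center.addObserver(
            forName: AVAudioSession.interruptionNotification,
            object: session,
            queue: .main
        ) { note in
            guard
                let raw = note.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
                let type = AVAudioSession.InterruptionType(rawValue: raw)
            else {
                print("interruption, unknown type")
                return
            }
            print(type == .began ? "interruption began" : "interruption ended")
        })
    }

    func stop() {
        tokens.forEach { NotificationCenter.default.removeObserver($0) }
        tokens.removeAll()
    }
}
```

Starting the logger before `setActive(true)` and stopping it a few seconds after the final `setCategory` call gives a timeline that can be attached to a bug report.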

Primary question:

Is there a supported AVAudioSession / CarPlay / Speech configuration for short, user-initiated speech recognition in a CarPlay driving-task app that reliably allows previous iPhone-origin CarPlay media playback to resume after capture?

Additional question:

Is it expected that vehicle DAB+ / normal radio resumes correctly while iPhone-origin CarPlay media does not reliably resume after the same short recording session?

Any guidance on the supported approach would be appreciated.

Best regards, Guido
