Increased and Mismatched Audio Buffer Sizes on iOS 18 when Sound Recognition or Vocal Shortcuts Is Enabled

Description

As of iOS 18, AVAudioSession.setPreferredIOBufferDuration ignores the requested buffer duration when Sound Recognition or Vocal Shortcuts is enabled. This results in (1) much larger buffers and (2) mismatched input and output buffer sizes, which cause ‘glitchy’ audio and increased latency.

Additionally, when this issue occurs, AVAudioSession.setPreferredIOBufferDuration still returns ‘true’ and produces no error.
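
Because the call reports success, the only way to detect the problem at runtime is to compare the granted duration with the requested one after activating the session. The following is a minimal Swift sketch of that check, not code from the example project (which is Objective-C++). The helper names `bufferDurationWasHonored` and `configureAndVerifySession` and the 5% tolerance are illustrative assumptions. Note that in Swift the method throws instead of returning a Bool.

```swift
import Foundation
#if os(iOS)
import AVFoundation
#endif

/// Returns true when `actual` is within a relative `tolerance` of `requested`.
/// The 5% default is an arbitrary, illustrative threshold.
func bufferDurationWasHonored(requested: TimeInterval,
                              actual: TimeInterval,
                              tolerance: Double = 0.05) -> Bool {
    guard requested > 0 else { return false }
    return abs(actual - requested) / requested <= tolerance
}

#if os(iOS)
/// Hypothetical helper: requests a buffer duration, activates the session,
/// then reads back what the system actually granted.
func configureAndVerifySession(requested: TimeInterval) throws -> Bool {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .default, options: [])
    try session.setPreferredIOBufferDuration(requested) // succeeds even when ignored
    try session.setActive(true)

    // The preferred value is only a hint; ioBufferDuration is the real value.
    let actual = session.ioBufferDuration
    print("requested: \(requested) s, actual: \(actual) s, sample rate: \(session.sampleRate) Hz")
    return bufferDurationWasHonored(requested: requested, actual: actual)
}
#endif
```

With the values reported here, a request of 0.005805 s that yields 0.023220 s fails this check, so an app could at least log or surface the condition.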

Steps to Reproduce:

  1. Enable Vocal Shortcuts on a device running iOS 18. Enable at least one shortcut (e.g. Control Center).
  2. Open or clone the example project (https://github.com/cwalo/SoundRecognitionBug)
  3. Build and install the example project
  4. Attach a headset and launch the application
  5. Observe console logs showing:
    • a requested buffer size of 0.005805 s (256 samples @ 48k)
    • an actual buffer size of 0.023220 s (1104 samples @ 48k; this has consistently been the resulting buffer size in all of our tests)
  6. Quit the app and detach the headset. Enable mutesOutput in AudioSystem.mm (to avoid feedback)
  7. Launch the application
  8. Observe:
    • Same result as in step 5
    • Mismatched hardware buffer size of 1104 and recorded frame count of 1024
    • Mismatched playbackCount and recordCount
  9. Quit the app and disable Vocal Shortcuts
  10. Launch the app
  11. Observe IOBufferDuration matching the requested duration and matched buffer sizes (expected behavior)
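
When comparing the logged durations against frame counts, the conversion is simply frames = duration × sample rate. The helpers below are a small sketch of that arithmetic; the function names are hypothetical and are not part of the example project.

```swift
import Foundation

/// Converts a buffer duration in seconds to the nearest whole frame count.
func frameCount(forDuration duration: TimeInterval, sampleRate: Double) -> Int {
    return Int((duration * sampleRate).rounded())
}

/// Converts a frame count to a buffer duration in seconds.
func bufferDuration(forFrames frames: Int, sampleRate: Double) -> TimeInterval {
    return Double(frames) / sampleRate
}
```

The result depends on the sample rate used in the conversion: 256 frames is about 0.005333 s at 48 kHz and about 0.005805 s at 44.1 kHz. It is therefore worth logging AVAudioSession.sampleRate next to the buffer duration when comparing values.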

Expected results:

  • The requested IOBufferDuration is respected, or AVAudioSession.setPreferredIOBufferDuration returns false or produces an error
  • Input and output buffer sizes match

Device(s): iPhone 11 Pro, iPad Pro

OS: iOS 18.0.1

Environment: Xcode 16.1

FB: FB15715421

Related to: https://forums.developer.apple.com/forums/thread/765477
