I hit a very similar issue while building ambient-voice — a real-time speech-to-text macOS app using SpeechAnalyzer. AVAudioEngine.inputNode.installTap() worked fine with built-in mics but silently failed with Bluetooth devices (the tap callback never fired). The root cause is similar to yours: audio session resource conflicts. Our fix was switching from AVAudioEngine to AVCaptureSession. The captureOutput(_:didOutput:from:) delegate fires reliably regardless of audio device state or competing audio sessions. The tradeoff is you get CMSampleBuffer instead of AVAudioPCMBuffer, so you need a conversion step — but it is straightforward. For your FaceTime case specifically, AVCaptureSession with .mixWithOthers category option should let you capture mic input without conflicting with the active call audio session. We documented all the audio pitfalls we hit on macOS 26 in our forum post: https://developer.apple.com/forums/thread/819525 The project is open source: https://github.com/Marvinngg/ambient-voice
Topic:
Media Technologies
SubTopic:
General
Tags: