As I've mentioned before our app uses PTT Framework to record and send audio messages. In one of supported by app mode we are using WebRTC.org library for that purpose. Internally WebRTC.org library uses Voice-Processing I/O Unit (kAudioUnitSubType_VoiceProcessingIO subtype) to retrieve audio from mic. According to https://developer.apple.com/documentation/avfaudio/avaudiosession/mode-swift.struct/voicechat using Voice-Processing I/O Unit leads to implicit enabling .voiceChat AVAudioSession mode (i.e. it looks like it's not possible to use Voice-Processing I/O Unit without .voiceChat mode).
And problem is following: when user starts outgoing PTT, PTT Framework plays audio notification, but in case of enabled .voiceChat mode that sound is playing distorted or not playing at all.
Questions:
Is it known issue?
Is there any way to workaround it?
AVAudioSession
RSS for tagUse the AVAudioSession object to communicate to the system how you intend to use audio in your app.
Posts under AVAudioSession tag
90 Posts
Sort by:
Post
Replies
Boosts
Views
Activity
Hello! I'm use AVFoundation for preview video and audio from selected device, and I try use AVAudioEngine for preview audio in real-time, but I can't or I don't understand how select input device? I can hear only my microphone in real-time
So far, I'm using AVCaptureAudioPreviewOutput for in real-time hear audio, but I think has delay.
On iOS works easy with AVAudioEngine, but on macOS bruh...
I am developing an iOS app that needs to play spoken audio on demand from a server, while ducking the audio of background music from another app (e.g., SoundtrackYourBrand or Apple Music). This must work even when the app is in the background, and the server dictates when and what audio is played. Ideally, the message should be played within a minute of the server requesting it.
Current Attempt & Observations
I initially tried using Firebase Cloud Messaging (FCM) silent notifications to send a URL to an audio file, which the app would then play using AVPlayer.
This works consistently when the app is active, but in the background, it only works about 60% of the time.
In cases where it fails, iOS ducks the background music (e.g., from SoundtrackYourBrand) but never plays the spoken audio.
Interestingly, when I play the audio without enabling audio ducking, it seems to work 100% of the time from my limited testing, even in the background.
The app has background modes enabled for Audio, Background Fetch, and Remote Notifications.
Best Approach to Achieve This?
I’d like guidance on the best Apple-compliant approach to reliably play audio on command from the server, even when the app is in the background. Some possible paths:
Ensuring the app remains active in the background – Are there recommended ways to prevent the app from getting suspended, such as background tasks, a special background mode, or a persistent connection to the server?
Alternative triggering mechanisms – Would something like VoIP, Push-to-Talk, or another background service be better suited for this use case?
Built-in iOS speech synthesis (AVSpeechSynthesizer) – If playing external audio is unreliable, would generating speech dynamically from text be a more robust approach?
Streaming audio instead of sending a URL – Could continuous streaming from the server keep the app active and allow playback at the right moment?
I want to ensure the solution is reliable and works 100% of the time when needed. Any recommendations on the best approach for this would be greatly appreciated.
Thank you for your time and guidance.
We have application using PTT Framework to record audio messages when app is backgrounded. Right now we are using AVAudioRecorder for that purpose. And problem is one specific user has frequent issue - recorded audio contains only silence.
I've checked almost everything I can imagine but didn't find any possible reason of issue.
Conditions:
AVAudioRecorder uses following configuration:
[
AVEncoderAudioQualityKey: AVAudioQuality.low.rawValue,
AVFormatIDKey : kAudioFormatMPEG4AAC,
AVNumberOfChannelsKey: 1,
AVSampleRateKey: 16000.0
]
App waits both didBeginTransmitting and didActivate audioSession from PTChannelManager (audio session has playback category at that moment)
App does AVAudioSession category change to playAndRecord
App gets routeChangeNotification with categoryChange and category = playAndRecord
There is no any interruption notifications from AVAudioSession during recording
There is no any error notification from AVAudioRecorder
Any idea what exactly I do wrong? Is there anything else I should check?
Thanks in advance.
P.S. it looks like recording audio with AudioUnit has the same issue, but let's exclude it from question atm for simplicity.
Getting this error in iPhone Portrait Mode with notch.
Currrently using AVQueuePlayer to play more than 30 mp3 files one by one.
All constraint properties are correct but error occures only in Apple iPhone Portrait Mode with notch series. But same code works on same iPhone in Landscape mode.
**But I get this error: **
LoudnessManager.mm:709 unable to open stream for LoudnessManager plist
Type: Error | Timestamp: 2025-02-07 | Process: | Library: AudioToolbox | Subsystem: com.apple.coreaudio | Category: aqme | TID: 0x42754
LoudnessManager.mm:709 unable to open stream for LoudnessManager plist
LoudnessManager.mm:709 unable to open stream for LoudnessManager plist
Timestamp: 2025-02-07 | Library: AudioToolbox | Subsystem: com.apple.coreaudio | Category: aqme
Issue:
Under certain conditions, using CallKit does not automatically enable the microphone.
Steps to Reproduce:
1.Start an outgoing call, then the user manually mutes the audio.
2.Receive a native incoming call, end the current call, then answer the new incoming call.(This order is important.)
3.End the incoming call.
4.Start another outgoing call and observe the microphone; do not manually mute or unmute.
Actual Behavior:
The audio icon indicates that the audio is unmuted, but the microphone remains off, and the small yellow dot in the top status bar (which represents the microphone) does not appear.
Expected Behavior:
The microphone should be on, consistent with the audio icon display, and the small yellow dot should appear in the top status bar.
Device:
iPhone 16 pro & iPhone 15 pro, iOS 18.0+
Can it be reproduced using speakerbox(CallKit Demo)?
YES
So I'm using AVAudioEngine. When playing audio I become the 'now playing' app using MPNowPlayingInfoCenter/MPRemoteCommandCenter APIs.
When configuring MPRemoteCommandCenter I add a play/pause command target via -addTargetWithHandler on the togglePlayPauseCommand property.
Now I also have a play/pause button in my app's UI. When I pause playback from my app's UI (which means I'm the active app, I'm in the foreground), what I do is this:
-I pause the AVAudioPlayerNode I'm using with AVAudioEngine.
I do not, stop, reset, etc. the AVAudioEngine. I only pause the player node. My thought process here is that the user just pressed pause and it is very likely that he will hit 'play' to resume playback in the near future because
My app is in the foreground and the user just hit the pause button.
Now if my app moves to the background and if I receive a memory warning I presume it'd make sense to tear down the engine or pause it. Perhaps I'm wrong about this?
So when I initially hit the play button from my app's UI I also activate my AVAudioSession. I do this in high priority NSOperation since the documentation warns that "we recommend that applications not activate their session from a thread where a long blocking operation will be problematic."
So now I'm playing and I hit pause from my app's UI. Then I quickly bring up the "Now Playing" center and I see I'm the "Now Playing" app but the play-pause button is showing the pause icon instead of the play icon but I'm in the pause state. I do set MPNowPlayingInfoCenter's playbackState to MPNowPlayingPlaybackStatePaused when I pause. Not surprisingly this doesn't work. The documentation states this is for macOS only.
So the only way to get MPRemoteCommandCenter to show the "play" image for the play-pause button is to deactivate my AVAudioSession when I pause playback? Since I change the active state of my audio session in a NSOperation because documentation recommends "we recommend that applications not activate their session from a thread where a long blocking operation will be problematic." the play-pause toggle in the remote command center won't immediately update since I'm doing it on another thread.
IMO it feels kind of inappropriate for a play-pause button to wait on a NSOperation activating the audio session before updating its UI when I already know my play/paused state, it should update right away like the button in my app does. Wouldn't it be nicer to just use MPNowPlayingInfoCenter's playbackState property on iOS too? If I'm no the longer the now playing app/active audio session it doesn't matter since I'm not in the now playing UI, just ignore it?
Also is it recommended that I deactivate my audio session explicitly every time the user pauses audio in my app (when I'm in the foreground)?
Also when I do deactivate the audio session I get an error: AVAudioSessionErrorCodeIsBusy (but the button in the now playing center updates to the proper image). I do this :
-(void)pause
{
[self.playerNode pause];
[self runOperationToDeactivateAudioSession];
// This does nothing on iOS:
MPNowPlayingInfoCenter *nowPlayingCenter = [MPNowPlayingInfoCenter defaultCenter];
nowPlayingCenter.playbackState = MPNowPlayingPlaybackStatePaused;
}
So in -runOperationToDeactivateAudioSession I get the AVAudioSessionErrorCodeIsBusy. According to the documentation
Starting in iOS 8, if the session has running I/Os at the time that deactivation is requested, the session will be deactivated, but the method will return NO and populate the NSError with the code property set to AVAudioSessionErrorCodeIsBusy to indicate the misuse of the API.
So pausing the player node when pausing isn't enough to meet the deactivation criteria. I guess I have to pause or stop the audio engine. I could probably wait until I receive a scene went to background notification or something before deactivating my audio session (which is async, so the button may not update to the correct image in time). This seems like a lot of code to have to write to get a play-pause toggle to update, especially in iPad-multi window scene environment.
What's the recommended approach?
Should I pause the AudioEngine instead of the player node always?
Should I always explicitly deactivate my audio session when the user pauses playback from my app's UI even if I'm in the foreground?
I personally like the idea of just being able to set
[MPNowPlayingInfoCenter defaultCenter].playbackState = MPNowPlayingPlaybackStatePaused;
But maybe that's because that would just make things easier on me. This does feels overcomplicated though. If anyone can share some tips on how I should handle this, I'd appreciate it.
Hi all!
I have been experiencing some issues when using the AVAudioEngine to play audio and record input while doing a voice chat (through the PTT Interface).
I noticed if I connect any players to the AudioGraph OR call start that the audio session becomes active (this is on iOS).
I don't see anything in the docs or the header files in the AVFoundation, but is it possible that calling the stop method on an engine deactivates the audio session too?
In a normal app this behavior seems logical, but when using PTT all activation and deactivation of the audio session must go through the framework and its delegate methods.
The issue I am debugging is that when the engine with the input node tapped gets stopped, and there is a gap between the input and when the server replies with inbound audio to be played and something seems to be getting the hardware/audio session into a jammed state.
Thanks for any feedback and/or confirmation on this behavior!
Hi I'm new to the forum,
I'm planning an app just for Apple watch, I would like to use bluetooth audio in background, how can I do it?
The messages I send via bluetooth stop as soon as the watch display turns off.
Thank you!
Nax
We have a Push To Talk application which allow user to record video and audio.
When user is recording a video using AVCaptureSession and receive's an Push To Talk call, from moment the Push To Talk call is received the audio in the video which is being captured is stopped while the video capture is still in progress.
Here after the PTT call is completed, we have tried restarting the audio session, there are no errors that are getting printed but we still don't see the audio getting restarted in video capture.
We have also tried to add a new input for AVCaptureSession we are receiving error that is resulting in video capture stopping, error mentioned below:
[OS-PLT] [CameraManager] Movie file finished with error: Error Domain=AVFoundationErrorDomain Code=-11818 "Recording Stopped" UserInfo={AVErrorRecordingSuccessfullyFinishedKey=true, NSLocalizedDescription=Recording Stopped, NSLocalizedRecoverySuggestion=Stop any other actions using the recording device and try again., AVErrorRecordingFailureDomainKey=1, NSUnderlyingError=0x3026bff60 {Error Domain=NSOSStatusErrorDomain Code=-16414 "(null)"}}, success
We have also raised a Feedback Ticket on same: https://feedbackassistant.apple.com/feedback/16050598
Hi all,
I have been quite stumped on this behavior for a little bit now, so thought it best to share here and see if someone more experience with AVAudioEngine / AVAudioSession can weigh in.
Right now I have a AVAudioEngine that I am using to perform some voice chat with and give buffers to play. This works perfectly until route changes start to occur, which causes the AVAudioEngine to reset itself, which then causes all players attached to this engine to be stopped.
Once a AVPlayerNode gets stopped due to this (but also any other time), all samples that were scheduled to be played then get purged. Where this becomes confusing for me is the completion handler gets called every time regardless of the sound actually being played.
Is there a reliable way to know if a sample needs to be rescheduled after a player has been reset?
I am not quite sure in my case what my observer of AVAudioEngineConfigurationChange needs to be doing, as this engine only handles output. All input is through a separate engine for simplicity.
Currently I am storing a queue of samples as they get sent to the AVPlayerNode for playback, and after that completion checking if the player isPlaying or not. If it's playing I assume that the sound actually was played- and if not then I leave it in the queue and assume that an observer on the route change or the configuration change will realize there are samples in the queue and reset them
Thanks for any feedback!
It’s been established that generally speaking background apps cannot record audio while the foreground app is already reading audio data from the microphone, but are there exceptions? For instance, is there an exception for certain Apple apps?
If so, and there’s a special exception that most programmers don’t know about but some Apple’s engineers do and perhaps some hackers do as well, wouldn’t the mechanism that allows that eventually be exploited?
I'd like to know:
Let's say there's a backgrounded app which has microphone access, such as Signal or SoundHound or Shazam. It's established that these apps are allowed to record audio in the user's environment even after being backgrounded, seemingly for as long as they want and even upload that sound data.
But can they ALSO continue recording even while another app that is in the foreground is using the microphone, such as the Phone app or Signal?
Hi,
I'd like to develop an app which runs speech recognition even after going into background. I know I can accomplish this using audio background mode and the process the audio but I am not sure if this workaround would get accepted into App Store because of the processing limitations while in the background.
How can I accomplish this while still being compliant with Apples privacy policy and other restrictions?
Thanks,
Marek
Hi all, I have spent a lot of time reading the tech note and watching the WDDC video that introduce the PTTFramework on iOS. I currently have a custom setup where I am using AVAudioEngine to schedule and play buffers that are being streamed through a call.
I am looking to use the PTTFramework to allow a user to trigger this push to talk behavior from the lock screen and the various places with the system UI it provides.
However I am unsure what the correct behavior is regarding the handling of the audio session. Right now I am using .playback when there is no active voice transmission so that devices such as AirPods can be in AD2P mode where applicable, and then transitioning to .playbackAndRecord category only when the mic input should become active. Following this change in my AVAudioEngine manager I am then manually activating and deactivating the audio session manually when the engine is either playing/recording or idle.
In the documentation it states that you should not attempt to activate or deactivate your audio session directly, but allow the framework to handle it.
Does that mean that I need to either call the request to transmit delegate function or set an active participant on the channel manager first, and then wait for the didBecomeActive delegate method to trigger before I actually attempt to play or record any audio? (I am using the fullDuplex mode currently.) I noticed that that delegate method will only trigger if the audio session wasn't active before doing one of the above (setting active participant, requesting transmit).
Lastly, when using the PTTFramework it also mentions that we get support for PTT devices and I notice on the didBeginTransmittingFrom property we have a handsfreeButton case. Is there any documentation or resources for what is actually supported out of the box for this? I am currently working on handling a lot of the push to talk through bluetooth LE, and wanted to make sure there wasn't overlap with what the system provides.
Thank you!
Hello
We have an application that play some sound via the system sound APIs from the AudioToolbox framework.
AudioServicesCreateSystemSoundID(url as CFURL, &soundID)
AudioServicesPlaySystemSoundWithCompletion(soundID)
Our make sure that an active audio session is available before playing the system sound. But when the device is connected to a BluetoothA2DP device. The sound are played on through the device speaker and not through the bluetooth A2DP device.
Our AudioSesison is configured with the following categories
[.allowBluetooth, .defaultToSpeaker, .allowBluetoothA2DP]
Sound played from the AVAudioPlayer are played on the allowBluetoothA2DP device with similar code.
Is this a bug in the AudioToolbox framework?
I noticed the following behavior with CallKit when receiving a VolP push notification:
When the app is in the foreground and a CallKit incoming call banner appears, pressing the answer button directly causes the speaker indicator in the CallKit interface to turn on. However, the audio is not actually activated (the iPhone's orange microphone indicator does not light up).
In the same foreground scenario, if I expand the CallKit banner before answering the call, the speaker indicator does not turn on, but the orange microphone indicator does light up, and audio works as expected.
When the app is in the background or not running, the incoming call banner works as expected: I can answer the call directly without expanding the banner, and the speaker does not turn on automatically. The orange microphone indicator lights up as it should.
Why is there a difference in behavior between answering directly from the banner versus expanding it first when the app is in the foreground? Is there a way to ensure consistent audio activation behavior across these scenarios?
I tried reconfiguring the audio when answering a call, but an error occurred during setActive, preventing the configuration from succeeding.
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setActive(false)
try audioSession.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
try audioSession.setActive(true, options: [])
} catch {
print("Failed to activate audio session: \(error)")
}
action.fulfill()
}
Error Domain=NSOSStatusErrorDomain Code=561017449 "Session activation failed" UserInfo={NSLocalizedDescription=Session activation failed}
I noticed the following behavior with CallKit when receiving a VolP push notification:
When the app is in the foreground and a CallKit incoming call banner appears, pressing the answer button directly causes the speaker indicator in the CallKit interface to turn on. However, the audio is not actually activated (the iPhone's orange microphone indicator does not light up).
In the same foreground scenario, if I expand the CallKit banner before answering the call, the speaker indicator does not turn on, but the orange microphone indicator does light up, and audio works as expected.
When the app is in the background or not running, the incoming call banner works as expected: I can answer the call directly without expanding the banner, and the speaker does not turn on automatically. The orange microphone indicator lights up as it should.
Why is there a difference in behavior between answering directly from the banner versus expanding it first when the app is in the foreground? Is there a way to ensure consistent audio activation behavior across these scenarios?
// Here addObserver for routeChangeNotification
func testAudioRoute() {
// My app is an VoIP app, so I need to set "playAndRecord" and "allowBluetooth"
try? AVAudioSession.sharedInstance().setCategory(.playAndRecord, options: [.duckOthers, .allowBluetooth, .allowBluetoothA2DP])
NotificationCenter.default.addObserver(self, selector: #selector(currentRouteChanged(noti:)), name: AVAudioSession.routeChangeNotification, object: nil)
}
// Print the "availableInputs" once got a notification
@objc func currentRouteChanged(noti: Notification) {
let availableInputs = AVAudioSession.sharedInstance().availableInputs?.compactMap({ $0.portType }) ?? []
let currentRouteInputs = AVAudioSession.sharedInstance().currentRoute.inputs.compactMap({ $0.portType })
let currentRouteOutputs = AVAudioSession.sharedInstance().currentRoute.outputs.compactMap({ $0.portType })
print("willtest: \navailableInputs=\(availableInputs), \ncurrentRouteInputs=\(currentRouteInputs), \ncurrentRouteOutputs=\(currentRouteOutputs)")
/*
When BT (Airpods pro 2) CONNECTTED: it will print like below when notification comes, this is correct.
----------------------------------------------------------
willtest:
availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
currentRouteInputs=[],
currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: BluetoothA2DPOutput)]
----------------------------------------------------------
When BT (Airpods pro 2) DISCONNECTTED: it will print like below when notification comes, this is wrong.
----------------------------------------------------------
availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
currentRouteInputs=[],
currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: Speaker)]
*/
}
So my question here is:
Why does the "availableInputs" still contain the "C.AVAudioSessionPort(_rawValue: BluetoothHFP)" item even though I have already disconnected the BT device? (Put AirPods in the case.)
BTW, if I tap the "Manual" button once I disconnected the BT, it also prints the "wrong" value for "availableInputs", and it will become normal after about 3~4 seconds.
Hi all,
I am working on an app where I have live prompts playing, in addition to a voice channel that sometimes becomes active. Right now I am using two different AVAudioSession Configurations so what we only switch to a mic enabled mode when we actually need input from the mic. These are defined below.
When just using the device hardware, everything works as expected and the modes change and the playback continues as needed. However when using bluetooth devices such as AirPods where the switch from AD2P to HFP is needed, I am getting a AVAudioEngineConfigurationChange notification. In response I am tearing down the engine and creating a new one with the same 2 player nodes. This does work fine and there are no crashes, except all the audio I have scheduled on a player node has now been cleared. All the completion blocks marked with ".dataPlayedBack" return the second this event happens, and leaves me in a state where I now have a valid engine setup again but have no idea what actually played, or was errantly marked as such.
Is this the expected behavior when getting a configuration change notification?
Adding some information below to my audio graph for context:
All my parts of the graph, I disconnect when getting this event and do the same to the new engine
private var inputEngine: AVAudioEngine
private var audioEngine: AVAudioEngine
private let voicePlayerNode: AVAudioPlayerNode
private let promptPlayerNode: AVAudioPlayerNode
audioEngine.attach(voicePlayerNode)
audioEngine.attach(promptPlayerNode)
audioEngine.connect(
voicePlayerNode,
to: audioEngine.mainMixerNode,
format: voiceNodeFormat
)
audioEngine.connect(
promptPlayerNode,
to: audioEngine.mainMixerNode,
format: nil
)
An example of how I am scheduling playback, and where that completion is firing even if it didn't actually play.
private func scheduleVoicePlayback(_ id: AudioPlaybackSample.Id, buffer: AVAudioPCMBuffer) async throws {
guard !voicePlayerQueue.samples.contains(where: { $0 == id }) else {
return
}
seprateQueue.append(buffer)
if !isVoicePlaying {
activateAudioSession()
}
voicePlayerQueue.samples.append(id)
if !voicePlayerNode.isPlaying {
voicePlayerNode.play()
}
if let convertedBuffer = buffer.convert(to: voiceNodeFormat) {
await voicePlayerNode.scheduleBuffer(convertedBuffer, completionCallbackType: .dataPlayedBack)
} else {
throw AudioPlaybackError.failedToConvert
}
voiceSampleHasBeenPlayed(id)
}
And lastly my audio session configuration if its useful.
extension AVAudioSession {
static func setDefaultCategory() {
do {
try sharedInstance().setCategory(
.playback,
options: [
.duckOthers, .interruptSpokenAudioAndMixWithOthers
]
)
} catch {
print("Failed to set default category? \(error.localizedDescription)")
}
}
static func setVoiceChatCategory() {
do {
try sharedInstance().setCategory(
.playAndRecord,
options: [
.defaultToSpeaker,
.allowBluetooth,
.allowBluetoothA2DP,
.duckOthers,
.interruptSpokenAudioAndMixWithOthers
]
)
} catch {
print("Failed to set category? \(error.localizedDescription)")
}
}
}