So I'm using AVAudioEngine. When playing audio I become the 'now playing' app using MPNowPlayingInfoCenter/MPRemoteCommandCenter APIs.
When configuring MPRemoteCommandCenter I add a play/pause command target via -addTargetWithHandler on the togglePlayPauseCommand property.
Now I also have a play/pause button in my app's UI. When I pause playback from my app's UI (which means I'm the active app, I'm in the foreground), what I do is this:
-I pause the AVAudioPlayerNode I'm using with AVAudioEngine.
I do not, stop, reset, etc. the AVAudioEngine. I only pause the player node. My thought process here is that the user just pressed pause and it is very likely that he will hit 'play' to resume playback in the near future because
My app is in the foreground and the user just hit the pause button.
Now if my app moves to the background and if I receive a memory warning I presume it'd make sense to tear down the engine or pause it. Perhaps I'm wrong about this?
So when I initially hit the play button from my app's UI I also activate my AVAudioSession. I do this in high priority NSOperation since the documentation warns that "we recommend that applications not activate their session from a thread where a long blocking operation will be problematic."
So now I'm playing and I hit pause from my app's UI. Then I quickly bring up the "Now Playing" center and I see I'm the "Now Playing" app but the play-pause button is showing the pause icon instead of the play icon but I'm in the pause state. I do set MPNowPlayingInfoCenter's playbackState to MPNowPlayingPlaybackStatePaused when I pause. Not surprisingly this doesn't work. The documentation states this is for macOS only.
So the only way to get MPRemoteCommandCenter to show the "play" image for the play-pause button is to deactivate my AVAudioSession when I pause playback? Since I change the active state of my audio session in a NSOperation because documentation recommends "we recommend that applications not activate their session from a thread where a long blocking operation will be problematic." the play-pause toggle in the remote command center won't immediately update since I'm doing it on another thread.
IMO it feels kind of inappropriate for a play-pause button to wait on a NSOperation activating the audio session before updating its UI when I already know my play/paused state, it should update right away like the button in my app does. Wouldn't it be nicer to just use MPNowPlayingInfoCenter's playbackState property on iOS too? If I'm no the longer the now playing app/active audio session it doesn't matter since I'm not in the now playing UI, just ignore it?
Also is it recommended that I deactivate my audio session explicitly every time the user pauses audio in my app (when I'm in the foreground)?
Also when I do deactivate the audio session I get an error: AVAudioSessionErrorCodeIsBusy (but the button in the now playing center updates to the proper image). I do this :
-(void)pause
{
[self.playerNode pause];
[self runOperationToDeactivateAudioSession];
// This does nothing on iOS:
MPNowPlayingInfoCenter *nowPlayingCenter = [MPNowPlayingInfoCenter defaultCenter];
nowPlayingCenter.playbackState = MPNowPlayingPlaybackStatePaused;
}
So in -runOperationToDeactivateAudioSession I get the AVAudioSessionErrorCodeIsBusy. According to the documentation
Starting in iOS 8, if the session has running I/Os at the time that deactivation is requested, the session will be deactivated, but the method will return NO and populate the NSError with the code property set to AVAudioSessionErrorCodeIsBusy to indicate the misuse of the API.
So pausing the player node when pausing isn't enough to meet the deactivation criteria. I guess I have to pause or stop the audio engine. I could probably wait until I receive a scene went to background notification or something before deactivating my audio session (which is async, so the button may not update to the correct image in time). This seems like a lot of code to have to write to get a play-pause toggle to update, especially in iPad-multi window scene environment.
What's the recommended approach?
Should I pause the AudioEngine instead of the player node always?
Should I always explicitly deactivate my audio session when the user pauses playback from my app's UI even if I'm in the foreground?
I personally like the idea of just being able to set
[MPNowPlayingInfoCenter defaultCenter].playbackState = MPNowPlayingPlaybackStatePaused;
But maybe that's because that would just make things easier on me. This does feels overcomplicated though. If anyone can share some tips on how I should handle this, I'd appreciate it.
AVAudioSession
RSS for tagUse the AVAudioSession object to communicate to the system how you intend to use audio in your app.
Posts under AVAudioSession tag
75 Posts
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
Hi I'm new to the forum,
I'm planning an app just for Apple watch, I would like to use bluetooth audio in background, how can I do it?
The messages I send via bluetooth stop as soon as the watch display turns off.
Thank you!
Nax
Hi all, I have spent a lot of time reading the tech note and watching the WDDC video that introduce the PTTFramework on iOS. I currently have a custom setup where I am using AVAudioEngine to schedule and play buffers that are being streamed through a call.
I am looking to use the PTTFramework to allow a user to trigger this push to talk behavior from the lock screen and the various places with the system UI it provides.
However I am unsure what the correct behavior is regarding the handling of the audio session. Right now I am using .playback when there is no active voice transmission so that devices such as AirPods can be in AD2P mode where applicable, and then transitioning to .playbackAndRecord category only when the mic input should become active. Following this change in my AVAudioEngine manager I am then manually activating and deactivating the audio session manually when the engine is either playing/recording or idle.
In the documentation it states that you should not attempt to activate or deactivate your audio session directly, but allow the framework to handle it.
Does that mean that I need to either call the request to transmit delegate function or set an active participant on the channel manager first, and then wait for the didBecomeActive delegate method to trigger before I actually attempt to play or record any audio? (I am using the fullDuplex mode currently.) I noticed that that delegate method will only trigger if the audio session wasn't active before doing one of the above (setting active participant, requesting transmit).
Lastly, when using the PTTFramework it also mentions that we get support for PTT devices and I notice on the didBeginTransmittingFrom property we have a handsfreeButton case. Is there any documentation or resources for what is actually supported out of the box for this? I am currently working on handling a lot of the push to talk through bluetooth LE, and wanted to make sure there wasn't overlap with what the system provides.
Thank you!
Hi all!
I have been experiencing some issues when using the AVAudioEngine to play audio and record input while doing a voice chat (through the PTT Interface).
I noticed if I connect any players to the AudioGraph OR call start that the audio session becomes active (this is on iOS).
I don't see anything in the docs or the header files in the AVFoundation, but is it possible that calling the stop method on an engine deactivates the audio session too?
In a normal app this behavior seems logical, but when using PTT all activation and deactivation of the audio session must go through the framework and its delegate methods.
The issue I am debugging is that when the engine with the input node tapped gets stopped, and there is a gap between the input and when the server replies with inbound audio to be played and something seems to be getting the hardware/audio session into a jammed state.
Thanks for any feedback and/or confirmation on this behavior!
We have a Push To Talk application which allow user to record video and audio.
When user is recording a video using AVCaptureSession and receive's an Push To Talk call, from moment the Push To Talk call is received the audio in the video which is being captured is stopped while the video capture is still in progress.
Here after the PTT call is completed, we have tried restarting the audio session, there are no errors that are getting printed but we still don't see the audio getting restarted in video capture.
We have also tried to add a new input for AVCaptureSession we are receiving error that is resulting in video capture stopping, error mentioned below:
[OS-PLT] [CameraManager] Movie file finished with error: Error Domain=AVFoundationErrorDomain Code=-11818 "Recording Stopped" UserInfo={AVErrorRecordingSuccessfullyFinishedKey=true, NSLocalizedDescription=Recording Stopped, NSLocalizedRecoverySuggestion=Stop any other actions using the recording device and try again., AVErrorRecordingFailureDomainKey=1, NSUnderlyingError=0x3026bff60 {Error Domain=NSOSStatusErrorDomain Code=-16414 "(null)"}}, success
We have also raised a Feedback Ticket on same: https://feedbackassistant.apple.com/feedback/16050598
It’s been established that generally speaking background apps cannot record audio while the foreground app is already reading audio data from the microphone, but are there exceptions? For instance, is there an exception for certain Apple apps?
If so, and there’s a special exception that most programmers don’t know about but some Apple’s engineers do and perhaps some hackers do as well, wouldn’t the mechanism that allows that eventually be exploited?
Topic:
Privacy & Security
SubTopic:
General
Tags:
AudioToolbox
Audio
AVAudioSession
Background Tasks
I'd like to know:
Let's say there's a backgrounded app which has microphone access, such as Signal or SoundHound or Shazam. It's established that these apps are allowed to record audio in the user's environment even after being backgrounded, seemingly for as long as they want and even upload that sound data.
But can they ALSO continue recording even while another app that is in the foreground is using the microphone, such as the Phone app or Signal?
Topic:
Privacy & Security
SubTopic:
General
Tags:
AudioToolbox
AudioUnit
Core Audio Kit
AVAudioSession
Description
As of iOS 18, AVAudioSession.setPreferredIOBufferDuration ignores the requested buffer size when Sound Recognition or Vocal Shortcuts is enabled. This results in 1) much larger buffer sizes and 2) mismatched buffer sizes between input and output buffers, which causes ‘glitchy’ audio and increased latency.
Additionally, when this issue occurs AVAudioSession.setPreferredIOBufferDuration continues to return ‘true’ and no error is produced.
Steps to Reproduce:
Enable Vocal Shortcuts on a device running iOS 18. Enable at least one shortcut (e.g. Control Center).
Open or clone the example project (https://github.com/cwalo/SoundRecognitionBug)
Build and install the example project
Attach a headset and launch the application
Observe console logs showing
a requested buffer size of 0.005805 (256 samples @ 48k)
an actual buffer size of 0.023220 (1104 samples @48k - this is regularly the resulting buffer size in all of our tests)
Quit the app and detach the headset. Enable mutesOutput in AudioSystem.mm (to avoid feedback)
Launch the application
Observe
Same result from step 4
Mismatched hardware buffer size of 1104 and recorded frame count of 1024
Mismatched playbackCount and recordCount
Quit the app and disable vocal shortcuts
Launch the app
Observe IOBufferDuration matching the requested duration and matched buffer sizes (expected behavior)
Expected results:
Requested IOBufferDuration is respected or AVAudioSession returns false or error is produced
Input and output buffer sizes match
Device(s): iPhone 11 Pro, iPad Pro
OS: iOS 18.0.1
Environment: Xcode 16.1
FB: FB15715421
Related to: https://forums.developer.apple.com/forums/thread/765477
I'm using AVAudioEngine to play AVAudioPCMBuffers. I'd like to synchronize some events with the playback. For example if the audio's frame position is >= some point && less than some point trigger some code.
So I'm looking at - (void)installTapOnBus:(AVAudioNodeBus)bus bufferSize:(AVAudioFrameCount)bufferSize format:(AVAudioFormat * __nullable)format block:(AVAudioNodeTapBlock)tapBlock;
Now I have frame positions calculated (predetermined before audio is scheduled I already made all necessary computations) . So I just need to fire code at certain points during playback:
[playerNode installTapOnBus:bus
bufferSize:bufferSize
format:format
block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
//Inspect current audio here and fire...
}];
[playerNode scheduleBuffer:fullbuffer
atTime:startTime
options:0
completionCallbackType:AVAudioPlayerNodeCompletionDataPlayedBack
completionHandler:^(AVAudioPlayerNodeCompletionCallbackType callbackType)
{
// some code is here, not important to this question.
}];
The problem I'm having is figuring out at what point in full buffer I'm at within the tap block. The tap block passes chunks (not the full audio buffer). I tried using the when parameter of the block to calculate the frame position relative to the entire audio but have be unsuccessful so far. I'm assuming the when parameter is relative to the buffer passed in the tap block (not my entire audio buffer I scheduled).
Not installing a tap and just using a timer before scheduling my fullBuffer has given me good results but I'd rather avoid using a timer if possible and use sample time.
Topic:
Media Technologies
SubTopic:
Audio
Tags:
AVAudioNode
AVAudioSession
AVAudioEngine
AVFoundation
Hi,
I'd like to develop an app which runs speech recognition even after going into background. I know I can accomplish this using audio background mode and the process the audio but I am not sure if this workaround would get accepted into App Store because of the processing limitations while in the background.
How can I accomplish this while still being compliant with Apples privacy policy and other restrictions?
Thanks,
Marek
// Here addObserver for routeChangeNotification
func testAudioRoute() {
// My app is an VoIP app, so I need to set "playAndRecord" and "allowBluetooth"
try? AVAudioSession.sharedInstance().setCategory(.playAndRecord, options: [.duckOthers, .allowBluetooth, .allowBluetoothA2DP])
NotificationCenter.default.addObserver(self, selector: #selector(currentRouteChanged(noti:)), name: AVAudioSession.routeChangeNotification, object: nil)
}
// Print the "availableInputs" once got a notification
@objc func currentRouteChanged(noti: Notification) {
let availableInputs = AVAudioSession.sharedInstance().availableInputs?.compactMap({ $0.portType }) ?? []
let currentRouteInputs = AVAudioSession.sharedInstance().currentRoute.inputs.compactMap({ $0.portType })
let currentRouteOutputs = AVAudioSession.sharedInstance().currentRoute.outputs.compactMap({ $0.portType })
print("willtest: \navailableInputs=\(availableInputs), \ncurrentRouteInputs=\(currentRouteInputs), \ncurrentRouteOutputs=\(currentRouteOutputs)")
/*
When BT (Airpods pro 2) CONNECTTED: it will print like below when notification comes, this is correct.
----------------------------------------------------------
willtest:
availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
currentRouteInputs=[],
currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: BluetoothA2DPOutput)]
----------------------------------------------------------
When BT (Airpods pro 2) DISCONNECTTED: it will print like below when notification comes, this is wrong.
----------------------------------------------------------
availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
currentRouteInputs=[],
currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: Speaker)]
*/
}
So my question here is:
Why does the "availableInputs" still contain the "C.AVAudioSessionPort(_rawValue: BluetoothHFP)" item even though I have already disconnected the BT device? (Put AirPods in the case.)
BTW, if I tap the "Manual" button once I disconnected the BT, it also prints the "wrong" value for "availableInputs", and it will become normal after about 3~4 seconds.
Hi all,
I am working on an app where I have live prompts playing, in addition to a voice channel that sometimes becomes active. Right now I am using two different AVAudioSession Configurations so what we only switch to a mic enabled mode when we actually need input from the mic. These are defined below.
When just using the device hardware, everything works as expected and the modes change and the playback continues as needed. However when using bluetooth devices such as AirPods where the switch from AD2P to HFP is needed, I am getting a AVAudioEngineConfigurationChange notification. In response I am tearing down the engine and creating a new one with the same 2 player nodes. This does work fine and there are no crashes, except all the audio I have scheduled on a player node has now been cleared. All the completion blocks marked with ".dataPlayedBack" return the second this event happens, and leaves me in a state where I now have a valid engine setup again but have no idea what actually played, or was errantly marked as such.
Is this the expected behavior when getting a configuration change notification?
Adding some information below to my audio graph for context:
All my parts of the graph, I disconnect when getting this event and do the same to the new engine
private var inputEngine: AVAudioEngine
private var audioEngine: AVAudioEngine
private let voicePlayerNode: AVAudioPlayerNode
private let promptPlayerNode: AVAudioPlayerNode
audioEngine.attach(voicePlayerNode)
audioEngine.attach(promptPlayerNode)
audioEngine.connect(
voicePlayerNode,
to: audioEngine.mainMixerNode,
format: voiceNodeFormat
)
audioEngine.connect(
promptPlayerNode,
to: audioEngine.mainMixerNode,
format: nil
)
An example of how I am scheduling playback, and where that completion is firing even if it didn't actually play.
private func scheduleVoicePlayback(_ id: AudioPlaybackSample.Id, buffer: AVAudioPCMBuffer) async throws {
guard !voicePlayerQueue.samples.contains(where: { $0 == id }) else {
return
}
seprateQueue.append(buffer)
if !isVoicePlaying {
activateAudioSession()
}
voicePlayerQueue.samples.append(id)
if !voicePlayerNode.isPlaying {
voicePlayerNode.play()
}
if let convertedBuffer = buffer.convert(to: voiceNodeFormat) {
await voicePlayerNode.scheduleBuffer(convertedBuffer, completionCallbackType: .dataPlayedBack)
} else {
throw AudioPlaybackError.failedToConvert
}
voiceSampleHasBeenPlayed(id)
}
And lastly my audio session configuration if its useful.
extension AVAudioSession {
static func setDefaultCategory() {
do {
try sharedInstance().setCategory(
.playback,
options: [
.duckOthers, .interruptSpokenAudioAndMixWithOthers
]
)
} catch {
print("Failed to set default category? \(error.localizedDescription)")
}
}
static func setVoiceChatCategory() {
do {
try sharedInstance().setCategory(
.playAndRecord,
options: [
.defaultToSpeaker,
.allowBluetooth,
.allowBluetoothA2DP,
.duckOthers,
.interruptSpokenAudioAndMixWithOthers
]
)
} catch {
print("Failed to set category? \(error.localizedDescription)")
}
}
}
Hello
We have an application that play some sound via the system sound APIs from the AudioToolbox framework.
AudioServicesCreateSystemSoundID(url as CFURL, &soundID)
AudioServicesPlaySystemSoundWithCompletion(soundID)
Our make sure that an active audio session is available before playing the system sound. But when the device is connected to a BluetoothA2DP device. The sound are played on through the device speaker and not through the bluetooth A2DP device.
Our AudioSesison is configured with the following categories
[.allowBluetooth, .defaultToSpeaker, .allowBluetoothA2DP]
Sound played from the AVAudioPlayer are played on the allowBluetoothA2DP device with similar code.
Is this a bug in the AudioToolbox framework?
I noticed the following behavior with CallKit when receiving a VolP push notification:
When the app is in the foreground and a CallKit incoming call banner appears, pressing the answer button directly causes the speaker indicator in the CallKit interface to turn on. However, the audio is not actually activated (the iPhone's orange microphone indicator does not light up).
In the same foreground scenario, if I expand the CallKit banner before answering the call, the speaker indicator does not turn on, but the orange microphone indicator does light up, and audio works as expected.
When the app is in the background or not running, the incoming call banner works as expected: I can answer the call directly without expanding the banner, and the speaker does not turn on automatically. The orange microphone indicator lights up as it should.
Why is there a difference in behavior between answering directly from the banner versus expanding it first when the app is in the foreground? Is there a way to ensure consistent audio activation behavior across these scenarios?
I tried reconfiguring the audio when answering a call, but an error occurred during setActive, preventing the configuration from succeeding.
let audioSession = AVAudioSession.sharedInstance()
do {
try audioSession.setActive(false)
try audioSession.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker])
try audioSession.setActive(true, options: [])
} catch {
print("Failed to activate audio session: \(error)")
}
action.fulfill()
}
Error Domain=NSOSStatusErrorDomain Code=561017449 "Session activation failed" UserInfo={NSLocalizedDescription=Session activation failed}
I noticed the following behavior with CallKit when receiving a VolP push notification:
When the app is in the foreground and a CallKit incoming call banner appears, pressing the answer button directly causes the speaker indicator in the CallKit interface to turn on. However, the audio is not actually activated (the iPhone's orange microphone indicator does not light up).
In the same foreground scenario, if I expand the CallKit banner before answering the call, the speaker indicator does not turn on, but the orange microphone indicator does light up, and audio works as expected.
When the app is in the background or not running, the incoming call banner works as expected: I can answer the call directly without expanding the banner, and the speaker does not turn on automatically. The orange microphone indicator lights up as it should.
Why is there a difference in behavior between answering directly from the banner versus expanding it first when the app is in the foreground? Is there a way to ensure consistent audio activation behavior across these scenarios?