I am trying to get access to raw audio samples from the mic. I've written a simple example application that writes the values to a text file.
Below is my sample application. All the input samples in the buffers delivered to the input tap are zero. What am I doing wrong?
I did add the Privacy - Microphone Usage Description key to my application target properties and I am allowing microphone access when the application launches. I do find it strange that I have to provide permission every time even though in Settings > Privacy, my application is listed as one of the applications allowed to access the microphone.
import AVFoundation

class AudioRecorder {
    private let audioEngine = AVAudioEngine()
    private var fileHandle: FileHandle?

    func startRecording() {
        let inputNode = audioEngine.inputNode
        let audioFormat: AVAudioFormat
        #if os(iOS)
        let hardwareSampleRate = AVAudioSession.sharedInstance().sampleRate
        audioFormat = AVAudioFormat(standardFormatWithSampleRate: hardwareSampleRate, channels: 1)!
        #elseif os(macOS)
        audioFormat = inputNode.inputFormat(forBus: 0) // Use input node's current format
        #endif
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: audioFormat) { [weak self] buffer, _ in
            // self is captured weakly, so use optional chaining instead of force-unwrapping
            self?.processAudioBuffer(buffer: buffer)
        }
        do {
            try audioEngine.start()
            print("Recording started with format: \(audioFormat)")
        } catch {
            print("Failed to start audio engine: \(error.localizedDescription)")
        }
    }

    func stopRecording() {
        audioEngine.inputNode.removeTap(onBus: 0)
        print("Recording stopped.")
    }

    private func setupTextFile() {
        let tempDir = FileManager.default.temporaryDirectory
        let textFileURL = tempDir.appendingPathComponent("audioData.txt")
        FileManager.default.createFile(atPath: textFileURL.path, contents: nil, attributes: nil)
        fileHandle = try? FileHandle(forWritingTo: textFileURL)
    }

    private func processAudioBuffer(buffer: AVAudioPCMBuffer) {
        guard let channelData = buffer.floatChannelData else { return }
        let channelSamples = channelData[0]
        let frameLength = Int(buffer.frameLength)
        var textData = ""
        var allZero = true
        for i in 0..<frameLength {
            let sample = channelSamples[i]
            if sample != 0 {
                allZero = false
            }
            textData += "\(sample)\n"
        }
        if allZero {
            print("Got \(frameLength) worth of audio data on \(buffer.stride) channels. All data is zero.")
        } else {
            print("Got \(frameLength) worth of audio data on \(buffer.stride) channels.")
        }
        // Write to file
        if let data = textData.data(using: .utf8) {
            fileHandle?.write(data)
        }
    }
}
Integrate music and other audio content into your apps.
Posts under Audio tag
It’s been established that generally speaking background apps cannot record audio while the foreground app is already reading audio data from the microphone, but are there exceptions? For instance, is there an exception for certain Apple apps?
If so, and there’s a special exception that most programmers don’t know about but some of Apple’s engineers do, and perhaps some hackers as well, wouldn’t the mechanism that allows it eventually be exploited?
In our app we have implemented an AVAssetResourceLoaderDelegate to handle encrypted downloaded files. We have it working on all iOS versions, but we are seeing issues on iOS 15 (15.8.3) with large files (> 1 GB). So far we have seen two cases: either the load method on the AVURLAsset fails early and throws an unknown error code, or it starts requesting more data than the device has available RAM. The CPU usage is almost always over 100%, even after pausing playback. The memory issue can happen even though the player has successfully started playback.
When running this on devices running iOS 16 and above we set isEntireLengthAvailableOnDemand to true on the AVAssetResourceLoadingContentInformationRequest. This seems to be key to solving the issue on those devices that support it. If we set the property to false we see the same memory issue as on iOS 15.
So we have a solution for iOS 16 and upwards but are at a loss for how to handle iOS 15. Is there something we have overlooked or is it in fact an issue with that iOS version?
Hi all, I have spent a lot of time reading the tech note and watching the WWDC video that introduces the PTTFramework on iOS. I currently have a custom setup where I am using AVAudioEngine to schedule and play buffers that are being streamed through a call.
I am looking to use the PTTFramework to allow a user to trigger this push-to-talk behavior from the lock screen and the various places within the system UI it provides.
However, I am unsure what the correct behavior is regarding the handling of the audio session. Right now I am using .playback when there is no active voice transmission, so that devices such as AirPods can be in A2DP mode where applicable, and then transitioning to the .playbackAndRecord category only when the mic input should become active. Following this change, my AVAudioEngine manager manually activates and deactivates the audio session depending on whether the engine is playing/recording or idle.
In the documentation it states that you should not attempt to activate or deactivate your audio session directly, but allow the framework to handle it.
Does that mean that I need to either call the request to transmit delegate function or set an active participant on the channel manager first, and then wait for the didBecomeActive delegate method to trigger before I actually attempt to play or record any audio? (I am using the fullDuplex mode currently.) I noticed that that delegate method will only trigger if the audio session wasn't active before doing one of the above (setting active participant, requesting transmit).
Lastly, the PTTFramework documentation also mentions that we get support for PTT devices, and I notice that the didBeginTransmittingFrom parameter has a handsfreeButton case. Is there any documentation or other resource describing what is actually supported out of the box for this? I am currently working on handling a lot of the push to talk through Bluetooth LE, and wanted to make sure there is no overlap with what the system provides.
Thank you!
I use an AVPlayer in a window view. I found that when I move the window to different positions, the default behavior is that the sound changes according to the window position. However, in some cases I don't need this default behavior, and I would like the sound not to change.
We are using Angular and HTML5 to develop our application, which plays videos hosted on S3. When played in a desktop browser the videos are adequately audible, but when played on an iPad their volume is too low to hear. I have tried setting video.volume = 1, but it does not work on iPad because this property is read-only on iOS devices.
I have tried using the JavaScript AudioContext. It worked on my local machine, but when the code is deployed to some hosted environments, it just does not work.
Did anyone face the same issue? Any help regarding it will be appreciated.
In the downloadable WWDC sample project "CreatingASpaceshipGame" there is an audio file named "WorkMusic.aiff", which is also mentioned in the video. Its info says it's PCM 4-channel Quadrophonic.
Where can I find further information on how this file was authored? Was it simply exported from Logic Pro with Quadrophonic Surround settings or did it have any other specific treatment?
Hi all,
I am working on an app where I have live prompts playing, in addition to a voice channel that sometimes becomes active. Right now I am using two different AVAudioSession configurations so that we only switch to a mic-enabled mode when we actually need input from the mic. These are defined below.
When just using the device hardware, everything works as expected: the modes change and playback continues as needed. However, when using Bluetooth devices such as AirPods, where a switch from A2DP to HFP is needed, I am getting an AVAudioEngineConfigurationChange notification. In response I am tearing down the engine and creating a new one with the same 2 player nodes. This works fine and there are no crashes, except that all the audio I have scheduled on a player node has now been cleared. All the completion blocks marked with ".dataPlayedBack" return the second this event happens, which leaves me in a state where I have a valid engine setup again but no idea what actually played and what was errantly marked as played.
Is this the expected behavior when getting a configuration change notification?
Adding some information below to my audio graph for context:
When I get this event I disconnect every part of the graph, and I set up the new engine in the same way.
private var inputEngine: AVAudioEngine
private var audioEngine: AVAudioEngine
private let voicePlayerNode: AVAudioPlayerNode
private let promptPlayerNode: AVAudioPlayerNode

audioEngine.connect(
    voicePlayerNode,
    to: audioEngine.mainMixerNode,
    format: voiceNodeFormat
)
audioEngine.connect(
    promptPlayerNode,
    to: audioEngine.mainMixerNode,
    format: nil
)
An example of how I am scheduling playback, and where that completion is firing even if it didn't actually play.
private func scheduleVoicePlayback(_ id: AudioPlaybackSample.Id, buffer: AVAudioPCMBuffer) async throws {
    guard !voicePlayerQueue.samples.contains(where: { $0 == id }) else {
        return
    }
    if !isVoicePlaying {
        // ...
    }
    if !voicePlayerNode.isPlaying {
        // ...
    }
    if let convertedBuffer = buffer.convert(to: voiceNodeFormat) {
        await voicePlayerNode.scheduleBuffer(convertedBuffer, completionCallbackType: .dataPlayedBack)
    } else {
        throw AudioPlaybackError.failedToConvert
    }
}
And lastly my audio session configuration, in case it's useful.
extension AVAudioSession {
    static func setDefaultCategory() {
        do {
            try sharedInstance().setCategory(
                // ...
                options: [
                    .duckOthers, .interruptSpokenAudioAndMixWithOthers
                ]
            )
        } catch {
            print("Failed to set default category? \(error.localizedDescription)")
        }
    }

    static func setVoiceChatCategory() {
        do {
            try sharedInstance().setCategory(
                // ...
                options: [
                    // ...
                ]
            )
        } catch {
            print("Failed to set category? \(error.localizedDescription)")
        }
    }
}
I’m looking to add DAW-like capabilities to my macOS music app, and AVAudioEngine seems like the right tool for the job.
However, I haven’t been able to find any documentation on how to save the user’s AVAudioEngine configuration—specifically the connections between nodes and the internal states of each node—to a file.
Does AVAudioEngine provide any API for saving and restoring this state, or does it need to be handled manually? If it’s manual, are there any sample "DAW" apps or resources that demonstrate how this can be implemented?
Any guidance would be greatly appreciated.
We have a special use case. We have two apps, App A (Electron) and App B (Swift). App B works completely fine when run independently, but when it is bundled with App A and shipped as a dmg, App B no longer prompts for microphone permission. What could be the issue? What is the right way to ship both apps together such that App B is hidden and launched only through App A? How can I figure out what changes after App B is bundled with App A? Even if I produce a dmg of App A and install it on the same system, App B doesn't ask for microphone permission anymore.
As of iOS 18, AVAudioSession.setPreferredIOBufferDuration ignores the requested buffer size when Sound Recognition or Vocal Shortcuts is enabled. This results in 1) much larger buffer sizes and 2) mismatched buffer sizes between input and output buffers, which causes ‘glitchy’ audio and increased latency.
Additionally, when this issue occurs AVAudioSession.setPreferredIOBufferDuration continues to return ‘true’ and no error is produced.
Steps to Reproduce:
1. Enable Vocal Shortcuts on a device running iOS 18. Enable at least one shortcut (e.g. Control Center).
2. Open or clone the example project (https://github.com/cwalo/SoundRecognitionBug)
3. Build and install the example project
4. Attach a headset and launch the application
   Observe console logs showing
   - a requested buffer size of 0.005805 (256 samples @ 48k)
   - an actual buffer size of 0.023220 (1104 samples @48k - this is regularly the resulting buffer size in all of our tests)
5. Quit the app and detach the headset. Enable mutesOutput in AudioSystem.mm (to avoid feedback)
6. Launch the application
   - Same result from step 4
   - Mismatched hardware buffer size of 1104 and recorded frame count of 1024
   - Mismatched playbackCount and recordCount
7. Quit the app and disable vocal shortcuts
8. Launch the app
   Observe IOBufferDuration matching the requested duration and matched buffer sizes (expected behavior)
Expected results:
- Requested IOBufferDuration is respected or AVAudioSession returns false or error is produced
- Input and output buffer sizes match
Device(s): iPhone 11 Pro, iPad Pro
OS: iOS 18.0.1
Environment: Xcode 16.1
FB: FB15715421
Related to: https://forums.developer.apple.com/forums/thread/765477
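The durations quoted in the report above can be cross-checked against frame counts using frames = duration × sample rate. The following is only an illustrative sketch: the helper names `frameCount(forDuration:sampleRate:)` and `bufferDuration(forFrames:sampleRate:)` are hypothetical and are not part of AVAudioSession. Taking the logged values at face value, 0.005805 s corresponds to 256 frames at 44.1 kHz, while 256 frames at 48 kHz would be about 0.005333 s, so it may be worth confirming which sample rate the session is actually running at when comparing requested and actual buffer sizes.

```swift
import Foundation

/// Hypothetical helper: frames covered by a buffer duration (in seconds) at a given sample rate.
func frameCount(forDuration duration: Double, sampleRate: Double) -> Int {
    Int((duration * sampleRate).rounded())
}

/// Hypothetical helper: buffer duration in seconds for a frame count at a given sample rate.
func bufferDuration(forFrames frames: Int, sampleRate: Double) -> Double {
    Double(frames) / sampleRate
}

// Evaluate the two durations quoted in the report at both common hardware rates.
for rate in [44_100.0, 48_000.0] {
    let requested = frameCount(forDuration: 0.005805, sampleRate: rate)
    let actual = frameCount(forDuration: 0.023220, sampleRate: rate)
    print("rate \(Int(rate)) Hz: requested ≈ \(requested) frames, actual ≈ \(actual) frames")
}

// Duration one would request for 256 frames at 48 kHz.
let preferred = bufferDuration(forFrames: 256, sampleRate: 48_000)
print(String(format: "%.6f", preferred))
```

Logging `AVAudioSession.sharedInstance().sampleRate` next to `ioBufferDuration` in the example project would show which of the two rates applies on a given route.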
I’m working on a memo app that records audio from the iPhone’s microphone (and other devices like MacBook or iPad) and processes it in 10-second chunks at a target sample rate of 16 kHz. However, I’ve encountered limitations with installTap in AVAudioEngine, which doesn’t natively support configuring a target sample rate on the mic input (the default being 44.1 kHz).
To address this, I tried using AVAudioMixerNode to downsample the mic input directly. Although everything seems correctly configured, no audio is recorded—just a flat signal with zero levels. There are no errors, and all permissions are granted, so it seems like an issue with downsampling rather than the mic setup itself.
To make progress, I implemented a workaround by tapping and resampling each chunk tapped using installTap (every 50ms in my case) with AVAudioConverter. While this works, it can introduce artifacts at the beginning and end of each chunk, likely due to separate processing instead of continuous downsampling.
Here are the key issues and questions I have:
1. Can we change the mic input sample rate directly using AVAudioSession or another native API in AVAudio? Setting up the desired sample rate initially would be ideal for my use case.
2. Are there alternatives to installTap for recording audio at a different sample rate or for continuously downsampling the live input without chunk-based artifacts?
This issue seems longstanding, as noted in a 2018 forum post:
Any guidance on configuring or processing mic input at a lower sample rate in real-time would be greatly appreciated. Thank you!
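For the chunk-by-chunk workaround described above, one thing that may be worth trying is keeping a single AVAudioConverter alive for the whole recording instead of creating one per chunk, so that the resampler's internal state carries over between tap callbacks. The sketch below assumes that setup; `StreamingResampler` is a hypothetical wrapper name, the 16 kHz mono Float32 output format is taken from the post's stated target, and whether this removes the boundary artifacts in a given app is not something the post confirms.

```swift
import AVFoundation

/// Hypothetical wrapper that reuses ONE AVAudioConverter across tap callbacks.
final class StreamingResampler {
    private let converter: AVAudioConverter
    let outputFormat: AVAudioFormat

    init?(inputFormat: AVAudioFormat, targetSampleRate: Double = 16_000) {
        guard let output = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                         sampleRate: targetSampleRate,
                                         channels: 1,
                                         interleaved: false),
              let converter = AVAudioConverter(from: inputFormat, to: output) else {
            return nil
        }
        self.outputFormat = output
        self.converter = converter
    }

    /// Converts one tapped chunk; returns nil if the conversion fails.
    func convert(_ input: AVAudioPCMBuffer) -> AVAudioPCMBuffer? {
        let ratio = outputFormat.sampleRate / input.format.sampleRate
        // Leave a little headroom above the nominal output size.
        let capacity = AVAudioFrameCount((Double(input.frameLength) * ratio).rounded(.up)) + 32
        guard let output = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: capacity) else {
            return nil
        }
        var supplied = false
        var error: NSError?
        let status = converter.convert(to: output, error: &error) { _, inputStatus in
            // Hand the chunk over once, then report that no more data is available for now.
            if supplied {
                inputStatus.pointee = .noDataNow
                return nil
            }
            supplied = true
            inputStatus.pointee = .haveData
            return input
        }
        return (status == .error || error != nil) ? nil : output
    }
}
```

In the tap block this would be used as `resampler.convert(buffer)`, appending each returned buffer to the 10-second accumulator; the first chunk may come back slightly shorter than the nominal length while the converter primes.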
Hi, I'm facing an issue with audio worklets in Safari. This issue is clearly an iOS bug (it doesn't occur on iPad or Mac).
Here's the minimal reproduction:
Go to https://googlechromelabs.github.io/web-audio-samples/audio-worklet/basic/hello-audio-worklet/
Press start
Audio will not be playing
Open YouTube on another tab and start any video
Audio from the worklet will start playing
Is this a known issue? Any plans to address that? Any workaround available?
Has anyone else noticed that there are two different volume levels now?
When I call for an advisor, the IVR (AI introduction) sounds fine.
When the advisor or any other caller speaks, it's so low I have to put the phone to my ear.
Hello, I've been working to implement PTT in the way recommended by the documentation. The main issue is that the Bluetooth methods are opaque, so I cannot solve for what I need. As a result I will have to resort to the hacky approaches that the PTT framework seems intended to replace (playing silent clips, playing custom notification sounds, having long-running background audio sessions).
I am testing with an Anker Soundcore Mini as well as AirPods Pro.
Here's the issue: there are 2 very different behaviours depending on whether I'm using a call/fullDuplex session or a halfDuplex session.
halfDuplex session
Anker mini
  Current behaviour
    long press activates Siri
    pressing again after Siri is active starts transmission
    long press activates Siri again
    pressing again after Siri is active stops transmission
    pause/play routes to the ongoing media session and plays music
  Expected behaviour
    play/pause should map to transmit/stopTransmit
    IF I have to use long press, it should at least not trigger Siri
AirPods Pro
  Current behaviour
    long press changes noise cancellation
    pause/play routes to the ongoing media session and plays music
  Expected behaviour
    play/pause should map to transmit/stopTransmit
call/fullDuplex session
Anker mini
  Current behaviour
    long press activates Siri
    pressing again after Siri is active starts transmission
    long press activates Siri again
    pressing again after Siri is active stops transmission
    pause/play routes to the ongoing media session and plays music
  Expected behaviour
    play/pause should map to transmit/stopTransmit
    IF I have to use long press, it should at least not trigger Siri
AirPods Pro
  Current behaviour
    long press changes noise cancellation
    pause/play maps to mute/unmute (even if media is playing)
  Expected behaviour
    This makes sense for call behaviour, I wish it worked this well for PTT
The intention here is to be able to fully interact with a channel hands-free. The current API seems to make that impossible. Is that by design? Reading all the docs suggests that transmit/stopTransmit is intended to be doable with just the play/pause buttons, but even Apple hardware does not seem to support that.
I am fairly new to working with AVFoundation etc. As far as I could research on my own, if I want to get metadata from let's say a .m4a audio file, I have to get the data and then create an AVAsset. My files are all on local servers and therefore I would not be able to just pass in the URL.
The extraction of the metadata works fine. However, those AVAssets create a huge overhead in storage consumption. To my knowledge, the Data instance of each audio file and its AVAsset should only live inside the function I call to extract the metadata, but they apparently persist in storage: I can clearly see that the app's file size increases by multiple gigabytes (equal to the size of the library I test with), even though the only data I purposefully save with SwiftData is the album artwork.
Is this normal behavior for AVAssets or am I missing some detail?
PS. If I forgot to mention something important, please ask. This is my first ever post, so I'm not too sure what is worth mentioning.
Thank you in advance!
I am setting up Mac minis as CI machines (using gitlab-runner) for my team. We develop mostly audio software, and some of our unit tests involve using audio inputs with AVAudioSession/AVAudioEngine.
These CI jobs trigger a microphone authorization pop-up on the Mac minis, asking for permission to give gitlab-runner access to the microphone. Once the authorization is given, subsequent jobs run fine.
My issue is that the Mac minis are updated on a regular basis with scripts, and since the path of the gitlab-runner binary, installed with Homebrew, changes with every version, the pop-up is triggered again every time gitlab-runner gets updated.
Since we have more and more CI runners, maintaining this manually is becoming impossible.
Is there a way to either deactivate this security check or script the authorization for a binary to access the microphone?
Thank you for your help!
We are trying to get audio volume from microphone.
We have 2 questions.
1. Can anyone tell me about AVAudioEngine.inputNode.volume?
We expected it to return 0 in silence and a Float value up to 1.0 depending on the volume, but it looks like 1.0 (the default value) is returned at all times.
In which cases does it return 0.5 or 0?
Sample code is below. Microphone works correctly.
// instance member
private var engine: AVAudioEngine!
private var node: AVAudioInputNode!
// start method
self.engine = .init()
self.node = engine.inputNode
try! engine.start()
// volume getter
2. What is the best practice to get audio volume from microphone?
Requirements are:
Without AVAudioRecorder. We use it for streaming audio.
It should withstand high-frequency access.
Testing info
device: iPhone XR
OS version: iOS 18
Best Regards.
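For the microphone-level question above: as far as the AVAudioMixing documentation describes it, `volume` is a gain setting with a default of 1.0 rather than a measurement, which would be consistent with the constant value reported. One common alternative is to compute a level from the samples of each tap buffer. The sketch below assumes Float samples in the range -1.0...1.0; `rmsLevel(_:)` and `decibels(fromRMS:floor:)` are hypothetical helper names, and the -100 dB floor is an arbitrary choice for illustration.

```swift
import Foundation

/// Root-mean-square level of a block of samples (assumed to lie in -1.0...1.0).
func rmsLevel(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// The same level in decibels relative to full scale, clamped to a floor for silence.
func decibels(fromRMS rms: Float, floor: Float = -100) -> Float {
    rms > 0 ? max(20 * log10(rms), floor) : floor
}

// Two synthetic blocks: digital silence and a constant half-scale signal.
let silence = [Float](repeating: 0, count: 1024)
let halfScale = [Float](repeating: 0.5, count: 1024)
print(rmsLevel(silence), decibels(fromRMS: rmsLevel(silence)))
print(rmsLevel(halfScale), decibels(fromRMS: rmsLevel(halfScale)))
```

Inside an `installTap` block the sample array could be built from `buffer.floatChannelData`, for example `Array(UnsafeBufferPointer(start: channelData[0], count: Int(buffer.frameLength)))`, and the result stored in a property that the UI or streaming code reads as often as it needs.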
The new iPhone 16 supports spatial audio recordings in the camera app when recording videos. Is it possible to also record spatial audio without video, and is it possible for 3rd party developers to do so? If so, how do I need to configure AVAudioSession and/or AVAudioEngine to record spatial audio in my audio recording app on iPhone 16?
Technical Issue Report for Maple Tale App - Audio Format Compatibility
Dear Apple Technical Support Team,
I hope this message finds you well. My name is [Your Name], and I am part of the development team behind the Maple Tale app. We have encountered an issue with audio format compatibility within our app that we believe requires your assistance.
The issue pertains to the audio formats supported by our app. Currently, our app only supports WAV and OGG formats, which has led to a limitation in user experience. We are looking to expand our support to include additional formats such as MP3 and AAC, which are widely used by our user base.
To provide a clear understanding of the issue, I have outlined the steps to reproduce the problem:
Launch the Maple Tale app.
Proceed with the game normally.
Upon picking up equipment within the game, a warning box pops up indicating the audio format compatibility issue.
This warning box appears due to the app's inability to process audio files in formats other than WAV and OGG. We understand that this can be a significant hindrance to the user experience, and we are eager to resolve this as quickly as possible.
We have reviewed the documentation available on the official Apple Developer website but are still seeking clarification on the best practices for supporting a wider range of audio formats within our app. We would greatly appreciate any official recommendations or guidelines that could assist us in this endeavor.
Additionally, we are considering updating our app to inform users about the current audio format requirements and provide guidance on how to optimize their audio files for the best performance within our app. If there are any official documents or resources that we should reference when crafting this update, please let us know.
We appreciate your time and assistance in this matter and look forward to your guidance on how to best implement audio format support on the iOS platform.
Thank you once again for your support.
Warm regards,