Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

Spatial Audio on iOS 18 don't work as inteneded
I’m facing a problem while trying to achieve spatial audio effects in my iOS 18 app. I have tried several approaches to get good 3D audio, but the effect never felt good enough or it didn’t work at all. Also what mostly troubles me is I noticed that AirPods I have doesn’t recognize my app as one having spatial audio (in audio settings it shows "Spatial Audio Not Playing"). So i guess my app doesn't use spatial audio potential. First approach uses AVAudioEnviromentNode with AVAudioEngine. Chaining position of player as well as changing listener’s doesn’t seem to change anything in how audio plays. Here's simple how i initialize AVAudioEngine import Foundation import AVFoundation class AudioManager: ObservableObject { // important class variables var audioEngine: AVAudioEngine! var environmentNode: AVAudioEnvironmentNode! var playerNode: AVAudioPlayerNode! var audioFile: AVAudioFile? ... //Sound set up func setupAudio() { do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } audioEngine = AVAudioEngine() environmentNode = AVAudioEnvironmentNode() playerNode = AVAudioPlayerNode() audioEngine.attach(environmentNode) audioEngine.attach(playerNode) audioEngine.connect(playerNode, to: environmentNode, format: nil) audioEngine.connect(environmentNode, to: audioEngine.mainMixerNode, format: nil) environmentNode.listenerPosition = AVAudio3DPoint(x: 0, y: 0, z: 0) environmentNode.listenerAngularOrientation = AVAudio3DAngularOrientation(yaw: 0, pitch: 0, roll: 0) environmentNode.distanceAttenuationParameters.referenceDistance = 1.0 environmentNode.distanceAttenuationParameters.maximumDistance = 100.0 environmentNode.distanceAttenuationParameters.rolloffFactor = 2.0 // example.mp3 is mono sound guard let audioURL = Bundle.main.url(forResource: "example", withExtension: "mp3") else { print("Audio file not found") return } do { audioFile = try AVAudioFile(forReading: audioURL) } catch { print("Failed to load audio file: \(error)") } } ... //Playing sound func playSpatialAudio(pan: Float ) { guard let audioFile = audioFile else { return } // left side playerNode.position = AVAudio3DPoint(x: pan, y: 0, z: 0) playerNode.scheduleFile(audioFile, at: nil, completionHandler: nil) do { try audioEngine.start() playerNode.play() } catch { print("Failed to start audio engine: \(error)") } ... } Second more complex approach using PHASE did better. I’ve made an exemplary app that allows players to move audio player in 3D space. I have added reverb, and sliders changing audio position up to 10 meters each direction from listener but audio seems to only really change left to right (x axis) - again I think it might be trouble with the app not being recognized as spatial. //Crucial class Variables: class PHASEAudioController: ObservableObject{ private var soundSourcePosition: simd_float4x4 = matrix_identity_float4x4 private var audioAsset: PHASESoundAsset! private let phaseEngine: PHASEEngine private let params = PHASEMixerParameters() private var soundSource: PHASESource private var phaseListener: PHASEListener! private var soundEventAsset: PHASESoundEventNodeAsset? // Initialization of PHASE init{ do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } // Init PHASE Engine phaseEngine = PHASEEngine(updateMode: .automatic) phaseEngine.defaultReverbPreset = .mediumHall phaseEngine.outputSpatializationMode = .automatic //nothing helps // Set listener position to (0,0,0) in World space let origin: simd_float4x4 = matrix_identity_float4x4 phaseListener = PHASEListener(engine: phaseEngine) phaseListener.transform = origin phaseListener.automaticHeadTrackingFlags = .orientation try! self.phaseEngine.rootObject.addChild(self.phaseListener) do{ try self.phaseEngine.start(); } catch { print("Could not start PHASE engine") } audioAsset = loadAudioAsset() // Create sound Source // Sphere soundSourcePosition.translate(z:3.0) let sphere = MDLMesh.newEllipsoid(withRadii: vector_float3(0.1,0.1,0.1), radialSegments: 14, verticalSegments: 14, geometryType: MDLGeometryType.triangles, inwardNormals: false, hemisphere: false, allocator: nil) let shape = PHASEShape(engine: phaseEngine, mesh: sphere) soundSource = PHASESource(engine: phaseEngine, shapes: [shape]) soundSource.transform = soundSourcePosition print(soundSourcePosition) do { try phaseEngine.rootObject.addChild(soundSource) } catch { print ("Failed to add a child object to the scene.") } let simpleModel = PHASEGeometricSpreadingDistanceModelParameters() simpleModel.rolloffFactor = rolloffFactor soundPipeline.distanceModelParameters = simpleModel let samplerNode = PHASESamplerNodeDefinition( soundAssetIdentifier: audioAsset.identifier, mixerDefinition: soundPipeline, identifier: audioAsset.identifier + "_SamplerNode") samplerNode.playbackMode = .looping do {soundEventAsset = try phaseEngine.assetRegistry.registerSoundEventAsset( rootNode: samplerNode, identifier: audioAsset.identifier + "_SoundEventAsset") } catch { print("Failed to register a sound event asset.") soundEventAsset = nil } } //Playing sound func playSound(){ // Fire new sound event with currently set properties guard let soundEventAsset else { return } params.addSpatialMixerParameters( identifier: soundPipeline.identifier, source: soundSource, listener: phaseListener) let soundEvent = try! PHASESoundEvent(engine: phaseEngine, assetIdentifier: soundEventAsset.identifier, mixerParameters: params) soundEvent.start(completion: nil) } ... } Also worth mentioning might be that I only own personal team account
3
0
766
7h
Music in iOS 26.2
I’m running the iOS 26.2 Public Beta update and my album artwork is missing from the music app (I’m not using Apple Music). I use google to get my album artwork. Do I need to wait for a new update?
1
0
97
2d
Correct way for an Audio Unit v3 to return fewer than requested number of samples given a buffer
I have an AUv3 plugin which uses an FFT - which requires n samples before it can produce any output - so, depending on the relation between the host's buffer size and the FFT window size, it may receive a several buffers of samples, producing no output, and then dumping out what it has once a sufficient number of samples have been received. This means that output is produced in fits and starts, in batches that match the FFT size (modulo oversampling) - e.g. if being fed buffers of 256 samples with an fft size of 1024, the output buffer sizes will be 0 for the first 3 buffers, and upon the fourth, the first 256 processed samples are returned and the remaining 768 cached; the next three buffers will return the remaining cached samples while processing and buffering subsequent ones, and so forth. The internal mechanics of that I have solved, caching output if the current output buffer is too small, and so forth - so it all works as advertised, and the plugin reports its latency correctly. And when run as an app in demo-mode, playback works as expected. In the plugin's render block, it captures the number of frames written, and if it is less than the number of frames passed in, adjusts the mDataByteSize of the output buffers to match the actual quantity of data being returned: unsigned int framesWritten = (unsigned int) processHelper->processWithEvents(inAudioBufferList, outAudioBufferList, timestamp, frameCount, realtimeEventListHead); if (framesWritten < frameCount) { for (UInt32 i = 0; i < outAudioBufferList->mNumberBuffers; ++i) { outAudioBufferList->mBuffers[i].mDataByteSize = framesWritten * 4; // assume 4 byte floats } } However, there are a couple of serious issues: auval -v fails it with - Render Test at 64 frames, sample rate: 22050 Hz ERROR: Output Buffer Size does not match requested When connected to Logic Pro, it appears that mDataByteSize is ignored, and the entire allocated buffer is read - audio has sections of silence snipped into it which corresponds the number of empty buffers being returned If I set Logic's buffer size to 1024 and use a 1024 sample FFT window, the plugin works correctly - but of course a plugin cannot dictate buffer size, and `1024 is too small a window size to be useful for anything but filtering very high frequencies This seems like it has to be a solvable problem, and most likely the issue is in how my code reports the number of usable samples in the returned buffer. So, what is the correct way for a plugin to report that it has no samples to return, but will, uh, real soon now? I know I could convert this plugin to be one that does offline rendering of the entire input, but this is real-time processing, just with a fixed amount of latency, so that should not be necessary.
0
0
140
3d
Frequent crashes related to com.apple.coreaudio.AQClient thread
I'm encountering numerous crashes involving the com.apple.coreaudio.AQClient thread on our application. The crash details are as follows: #10 com.apple.coreaudio.AQClient SIGSEGV SEGV_ACCERR 0 libobjc.A.dylib _objc_msgSend + 44 1 AudioToolbox ClientMessageHandler::PropertyChanged(unsigned int) + 872 2 AudioToolbox ClientAudioQueue::FetchAndDeliverPendingCallbacks(unsigned int) + 924 3 AudioToolbox __XCallbackNotificationsAvailable + 212 4 libAudioToolboxUtility.dylib _mshMIGPerform + 260 5 CoreFoundation ___CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 56 6 CoreFoundation ___CFRunLoopDoSource1 + 596 7 CoreFoundation ___CFRunLoopRun + 2392 8 CoreFoundation _CFRunLoopRunSpecific + 572 9 AudioToolbox CADeprecated::GenericRunLoopThread::Entry(void*) + 156 10 libAudioToolboxUtility.dylib CADeprecated::CAPThread::Entry(CADeprecated::CAPThread*) + 88 11 libsystem_pthread.dylib __pthread_start + 116 All these crashes occur on system versions below iOS/iPadOS 17, primarily when the device's available RAM is low. What steps can I take to resolve this issue? Any insights would be greatly appreciated!
0
0
144
3d
watchOS longFormAudio cannot de active
My workout watch app supports audio playback during exercise sessions. When users carry both Apple Watch, iPhone, and AirPods, with AirPods connected to the iPhone, I want to route audio from Apple Watch to AirPods for playback. I've implemented this functionality using the following code. try? session.setCategory(.playback, mode: .default, policy: .longFormAudio, options: []) try await session.activate() When users are playing music on iPhone and trigger my code in the watch app, Apple Watch correctly guides users to select AirPods, pauses the iPhone's music, and plays my audio. However, when playback finishes and I end the session using the code below: try session.setActive(false, options:[.notifyOthersOnDeactivation]) the iPhone doesn't automatically resume the previously interrupted music playback—it requires manual intervention. Is this expected behavior, or am I missing other important steps in my code?
0
0
99
4d
FaceTime Screen-Share Audio and Video Experience
FaceTime’s screen-share audio balance is insanely absurd right now. Whenever I share media, the system audio that gets sent through FaceTime is a tiny whisper even at full volume (or even when connected to my speaker or headphones). The moment anyone on the call makes any noise at all, the shared audio ducks so hard it disappears, while the voice (or rustling or air conditioning noise) spikes to painful levels. It’s impossible to watch or listen to anything together. Also, the feature where FaceTime would shrink to a square during screen-sharing has been completely removed. That was a good feature and I'm really confused why it's gone. Now, the FaceTime window stays as a long rectangle that covers part of the content I'm trying to share (unless I do full screen tile, but then I can't pull up any other windows during the call) and can't be made smaller than about a third of the screen. You can't resize the window or adjust its dimensions, so it ends up blocking the actual media you're trying to watch. Here are some feature requests/fixes that would greatly improve the FaceTime screen-share experience: Option to adjust the shared media volume independently of call audio. Disable/toggle the extreme automatic audio docking while screen-sharing Reintroduce the minimized “floating square” mode or allow full manual resizing and repositioning of the FaceTime window during screen-share sessions. Overall, this setup makes FaceTime screen-sharing basically unusable. The audio balance is so inconsistent that it’s easier to switch to Zoom or Google Meet, which both handle shared sound correctly and let you move the call window out of the way. Until these issues are fixed, there’s no practical reason to use FaceTime for shared viewing at all.
1
0
157
5d
Core Audio Tap: per-device attenuation vs. number of stereo output pairs — how to get unattenuated “raw” app streams?
Hi all, I’ve implemented the new Core Audio Tap API (AudioHardwareCreateProcessTap with CATapDescription) and I’m seeing consistent level attenuation that scales with the number of stereo output pairs exposed by the target device. What I observe Device with 4 stereo pairs (8 outs) → tap shows −12.04 dB relative to source. True 2-ch devices (built-in speakers, AirPods) → ~0 dB attenuation. The attenuation appears regardless of whether I: Create a global (default-output) tap via initStereoGlobalTapButExcludeProcesses: Or create a per-process/per-device tap via initWithProcesses:andDeviceUID:withStream: Additionally, the routing choice inside the sending app matters: App output to “System/Default Output” → I often see no attenuation. App output directly to a multi-out interface (e.g., RME Fireface) → I see the pair-count-scaled attenuation. I can query Core Audio for the number of output channels/pairs and gain-compensate (+20·log10(N_pairs) dB) and that matches my measurements for many cases. However, this compensation is not universally correct because it seems to depend on where each process routes its audio (Default Output vs. direct device), even when those processes are included in the same tap aggregate. Question Is there a supported way to obtain the raw, unattenuated streams for all processes through the Tap API—i.e., to bypass this automatic headroom/attenuation behavior entirely? If this attenuation is expected by design: Is there a documented rule for when it applies (global vs. device taps, per-process taps, stream selection, etc.)? Is there a property/flag to disable it, or a reliable, official method to compute the exact compensation (beyond counting stereo pairs)? Any guidance on ensuring consistent levels when multiple processes route differently (Default Output vs. direct device) but are captured by the same tap? Environment API: AudioHardwareCreateProcessTap + CATapDescription Devices: built-in output (2-ch), RME Fireface (8+ outs / 4+ stereo pairs) Behavior reproducible with both global and per-process/per-device tap descriptions. Attenuation example: 4 stereo pairs → −12.04 dB observed. Happy to provide a minimal sample, measurements, and device logs. Thanks! — David
0
0
57
5d
SpeechTranscriber not supported
I've tried SpeechTranscriber with a lot of my devices (from iPhone 12 series ~ iPhone 17 series) without issues. However, SpeechTranscriber.isAvailable value is false for my iPhone 11 Pro. https://developer.apple.com/documentation/speech/speechtranscriber/isavailable I'am curious why the iPhone 11 Pro device is not supported. Are all iPhone 11 series not supported intentionally? Or is there any problem with my specific device? I've also checked the supportedLocales, and the value is an empty array. https://developer.apple.com/documentation/speech/speechtranscriber/supportedlocales
0
0
47
5d
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...
6
0
397
6d
AVAudioSessionCategoryPlayback is not allowed while CallKit call is active
We require assistance in resolving a critical audio design conflict within our Push-to-Talk (PTT) application. Our current volume amplification strategy—which relies on applying a GAIN factor to PCM samples in conjunction with setting the AVAudioSession category to Playback—is working successfully when PTT is used independently. However, upon integrating and reporting the same PTT call through the CallKit framework, this amplification effect is lost. The CallKit integration appears to be forcing a different, non-amplifying audio session category or configuration, negatively impacting the user's perceived call volume. We need guidance on how to maintain the AVAudioSessionCategoryPlayback setting, or an equivalent high-volume configuration, while operating under the control of CallKit.
3
0
197
1w
Not able to write AAC audio with 96 kHz sample rate using AVAudioRecorder or Extended audio file services
Not able to record audio in AAC format with 96 kHz sample rate using AVAudioRecorder or Extended Audio File services with 96 kHz input audio from input device. The audio recording settings used are let settings: [String: Any] = [ AVFormatIDKey: Int(kAudioFormatMPEG4AAC), AVSampleRateKey: sampleRate AVNumberOfChannelsKey: 1 AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue ] When tried using AVAudioEngine using AVAudioFile, AVAudioFile(forWriting: fileURL, // file extension .m4a settings: fileSettings, commonFormat: AVAudioCommonFormat.pcmFormatFloat32, interleaved: interleaved) else { return } got error CodecConverterFactory.cpp:977 unable to select compatible encoder sample rate AudioConverter.cpp:1017 Failed to create a new in process converter -> from 1 ch, 96000 Hz, Float32 to 1 ch, 96000 Hz, aac (0x00000000) 0 bits/channel, 0 bytes/packet, 0 frames/packet, 0 bytes/frame, with status 1718449215
0
0
48
1w
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
5
0
391
1w
Hosting x86 Audio Units on Silicon Mac
My app encountered problems when trying to open an x86 audioUnit v2 on a Silicon Mac (although Rosetta is installed). There seems to be a XPC connection issue with the AUHostingService that I don't know how to fix. I observed other host apps opening the same plugins without problem, so there is probably something wrong or incompatible in my codes. I noticed that: The issue occurs whether or not the app is sandboxed. The issue does no longer occur when the app itself runs under Rosetta. There is no error reported by CoreAudio during allocation and initialization of the audio unit. The first notified errors appears when the unit calls AudioUnitRender from the rendering callback. With most x86 plugins, the error is on first call: kAudioUnitErr_RenderTimeout and on any subsequent call: kAudioComponentErr_InstanceInvalidated On the UI side, when the Cocoa View is loaded, it appears shortly, then disappears immediately leaving its superview empty. With another x86 plugin, the Cocoa View is loaded normally, but CoreAudio still emits kAudioUnitErr_NoConnection from AudioUnitRender, whether the view has been loaded or not, and the plugin produces no sound. I also find these messages in the console (printed in that order): CLIENT ERROR: RemoteAUv2ViewController does not override - and thus cannot react to catastrophic errors beyond logging them AUAudioUnit_XPC.mm:641 Crashed AU possible component description: aumu/Helm/Tyte My app uses the AUv2 API and I suspect that working with the AUv3 API would spare me these problems. However, considering how my audio system is built (audio units are wrapped into C++ classes and most connections between units are managed on the fly from the rendering callback), it would be a lot of work to convert, and I’m even not sure that all I do with the AUv2 API would be possible with the AUv3 API. I could possibly find an intermediate solution, but in the immediate future I'm looking for the simplest and fastest possible fix. If I cannot find better, I see two fallback options: In this part of the doc: “Beginning with macOS 11, the system loads audio units into a separate process that depends on the architecture or host preference”, does “host preference” means that it would be possible to disable the “out of process” behavior, for example from the app entitlements or info.plist? Otherwise, as a last resort, I could completely disable the use of x86 audioUnits when my app runs under ARM64, for at least making things cleaner. But the Audio Component API doesn’t give any info about the plugin architecture, how could I found it? Any tip or idea about this issue will be much appreciated. Thanks in advance!
2
0
238
1w
AVSpeechSynthesizer pulls words out of thin air.
Hi, I'm working on a project that uses the AVSpeechSynthesizer and AVSpeechUtterance. I discovered by chance that the AVSpeechSynthesizer automatically completes some words instead of just outputting what it's supposed to. These are abbreviations for days of the week or months, but not all of them. I don't want either of them automatically completed, but only the specified text. The completion transcends languages. I have written a short example program for demonstration purposes. import SwiftUI import AVFoundation import Foundation let synthesizer: AVSpeechSynthesizer = AVSpeechSynthesizer() struct ContentView: View { var body: some View { VStack { Button { utter("mon") } label: { Text("mon") } .buttonStyle(.borderedProminent) Button { utter("tue") } label: { Text("tue") } .buttonStyle(.borderedProminent) Button { utter("thu") } label: { Text("thu") } .buttonStyle(.borderedProminent) Button { utter("feb") } label: { Text("feb") } .buttonStyle(.borderedProminent) Button { utter("feb", lang: "de-DE") } label: { Text("feb DE") } .buttonStyle(.borderedProminent) Button { utter("wed") } label: { Text("wed") } .buttonStyle(.borderedProminent) } .padding() } private func utter(_ text: String, lang: String = "en-US") { let utterance = AVSpeechUtterance(string: text) let voice = AVSpeechSynthesisVoice(language: lang) utterance.voice = voice synthesizer.speak(utterance) } } #Preview { ContentView() } Thank you Christian
1
0
166
1w
No mic capture on iOS 18.5
Hello! We stumbled upon a problem with our karaoke app where user on iPhone 16e/iOS 18.5 has problem with mic capture, other users cannot hear him. The mic capture is working fine on 17.5, 16.8. Maybe there is something else we need when configuring AVAudioSession for iOS 18.5? Currently it's set up like this: override func viewDidLoad() { super.viewDidLoad() UIApplication.shared.isIdleTimerDisabled = true mRoomId = appDelegate.getRoomId() let audioSession = AVAudioSession.sharedInstance() try! audioSession.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker]) try! audioSession.setPreferredSampleRate(48000) try! audioSession.setActive(true, options: []) }
1
0
227
1w
Video Audio + Speech To Text
Hello, I am wondering if it is possible to have audio from my AirPods be sent to my speech to text service and at the same time have the built in mic audio input be sent to recording a video? I ask because I want my users to be able to say "CAPTURE" and I start recording a video (with audio from the built in mic) and then when the user says "STOP" I stop the recording.
0
0
73
2w
AVAudioEngine installTap stops working after phone call interruption on iPhone 16e
Environment Device: iPhone 16e iOS Version: 18.4.1 - 18.7.1 Framework: AVFoundation (AVAudioEngine) Problem Summary On iPhone 16e (iOS 18.4.1-18.7.1), the installTap callback stops being invoked after resuming from a phone call interruption. This issue is specific to phone call interruptions and does not occur on iPhone 14, iPhone SE 3, or earlier devices. Expected Behavior After a phone call interruption ends and audioEngine.start() is called, the previously installed tap should continue receiving audio buffers. Actual Behavior After resuming from phone call interruption: Tap callback is no longer invoked No audio data is captured No errors are thrown Engine appears to be running normally Note: Normal pause/resume (without phone call interruption) works correctly. Steps to Reproduce Start audio recording on iPhone 16e Receive or make a phone call (triggers AVAudioSession interruption) End the phone call Resume recording with audioEngine.start() Result: Tap callback is not invoked Tested devices: iPhone 16e (iOS 18.4.1-18.7.1): Issue reproduces ✗ iPhone 14 (iOS 18.x): Works correctly ✓ iPhone SE 3 (iOS 18.x): Works correctly ✓ Code Initial Setup (Works) let inputNode = audioEngine.inputNode inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in self.processAudioBuffer(buffer, at: time) } audioEngine.prepare() try audioEngine.start() Interruption Handling NotificationCenter.default.addObserver( forName: AVAudioSession.interruptionNotification, object: AVAudioSession.sharedInstance(), queue: nil ) { notification in guard let userInfo = notification.userInfo, let typeValue = userInfo[AVAudioSessionInterruptionTypeKey] as? UInt, let type = AVAudioSession.InterruptionType(rawValue: typeValue) else { return } if type == .began { self.audioEngine.pause() } else if type == .ended { try? self.audioSession.setActive(true) try? self.audioEngine.start() // Tap callback doesn't work after this on iPhone 16e } } Workaround Full engine restart is required on iPhone 16e: func resumeAfterInterruption() { audioEngine.stop() inputNode.removeTap(onBus: 0) inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil) { buffer, time in self.processAudioBuffer(buffer, at: time) } audioEngine.prepare() try audioSession.setActive(true) try audioEngine.start() } This works but adds latency and complexity compared to simple resume. Questions Is this expected behavior on iPhone 16e? What is the recommended way to handle phone call interruptions? Why does this only affect iPhone 16e and not iPhone 14 or SE 3? Any guidance would be appreciated!
0
0
91
2w
No audio in screen recordings when using AVAudioEngine Voice Processing
Hello, We are developing a real-time speech recognition application and are utilizing AVAudioEngine with voice processing enabled on the input node. However, we have observed that enabling this mode interferes with the built-in iOS screen recording feature - specifically, the recorded video does not capture any audio when this mode is active. Since we want users to be able to record their experience within our app, this issue significantly impacts our functionality. Is there a known workaround or recommended approach to ensure that both voice processing and screen recording can function simultaneously? Any guidance would be greatly appreciated. Thank you!
2
1
328
2w