Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

watchOS longFormAudio cannot de active
My workout watch app supports audio playback during exercise sessions. When users carry both Apple Watch, iPhone, and AirPods, with AirPods connected to the iPhone, I want to route audio from Apple Watch to AirPods for playback. I've implemented this functionality using the following code. try? session.setCategory(.playback, mode: .default, policy: .longFormAudio, options: []) try await session.activate() When users are playing music on iPhone and trigger my code in the watch app, Apple Watch correctly guides users to select AirPods, pauses the iPhone's music, and plays my audio. However, when playback finishes and I end the session using the code below: try session.setActive(false, options:[.notifyOthersOnDeactivation]) the iPhone doesn't automatically resume the previously interrupted music playback—it requires manual intervention. Is this expected behavior, or am I missing other important steps in my code?
1
0
350
Nov ’25
Spatial Audio on iOS 18 don't work as inteneded
I’m facing a problem while trying to achieve spatial audio effects in my iOS 18 app. I have tried several approaches to get good 3D audio, but the effect never felt good enough or it didn’t work at all. Also what mostly troubles me is I noticed that AirPods I have doesn’t recognize my app as one having spatial audio (in audio settings it shows "Spatial Audio Not Playing"). So i guess my app doesn't use spatial audio potential. First approach uses AVAudioEnviromentNode with AVAudioEngine. Chaining position of player as well as changing listener’s doesn’t seem to change anything in how audio plays. Here's simple how i initialize AVAudioEngine import Foundation import AVFoundation class AudioManager: ObservableObject { // important class variables var audioEngine: AVAudioEngine! var environmentNode: AVAudioEnvironmentNode! var playerNode: AVAudioPlayerNode! var audioFile: AVAudioFile? ... //Sound set up func setupAudio() { do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } audioEngine = AVAudioEngine() environmentNode = AVAudioEnvironmentNode() playerNode = AVAudioPlayerNode() audioEngine.attach(environmentNode) audioEngine.attach(playerNode) audioEngine.connect(playerNode, to: environmentNode, format: nil) audioEngine.connect(environmentNode, to: audioEngine.mainMixerNode, format: nil) environmentNode.listenerPosition = AVAudio3DPoint(x: 0, y: 0, z: 0) environmentNode.listenerAngularOrientation = AVAudio3DAngularOrientation(yaw: 0, pitch: 0, roll: 0) environmentNode.distanceAttenuationParameters.referenceDistance = 1.0 environmentNode.distanceAttenuationParameters.maximumDistance = 100.0 environmentNode.distanceAttenuationParameters.rolloffFactor = 2.0 // example.mp3 is mono sound guard let audioURL = Bundle.main.url(forResource: "example", withExtension: "mp3") else { print("Audio file not found") return } do { audioFile = try AVAudioFile(forReading: audioURL) } catch { print("Failed to load audio file: \(error)") } } ... //Playing sound func playSpatialAudio(pan: Float ) { guard let audioFile = audioFile else { return } // left side playerNode.position = AVAudio3DPoint(x: pan, y: 0, z: 0) playerNode.scheduleFile(audioFile, at: nil, completionHandler: nil) do { try audioEngine.start() playerNode.play() } catch { print("Failed to start audio engine: \(error)") } ... } Second more complex approach using PHASE did better. I’ve made an exemplary app that allows players to move audio player in 3D space. I have added reverb, and sliders changing audio position up to 10 meters each direction from listener but audio seems to only really change left to right (x axis) - again I think it might be trouble with the app not being recognized as spatial. //Crucial class Variables: class PHASEAudioController: ObservableObject{ private var soundSourcePosition: simd_float4x4 = matrix_identity_float4x4 private var audioAsset: PHASESoundAsset! private let phaseEngine: PHASEEngine private let params = PHASEMixerParameters() private var soundSource: PHASESource private var phaseListener: PHASEListener! private var soundEventAsset: PHASESoundEventNodeAsset? // Initialization of PHASE init{ do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } // Init PHASE Engine phaseEngine = PHASEEngine(updateMode: .automatic) phaseEngine.defaultReverbPreset = .mediumHall phaseEngine.outputSpatializationMode = .automatic //nothing helps // Set listener position to (0,0,0) in World space let origin: simd_float4x4 = matrix_identity_float4x4 phaseListener = PHASEListener(engine: phaseEngine) phaseListener.transform = origin phaseListener.automaticHeadTrackingFlags = .orientation try! self.phaseEngine.rootObject.addChild(self.phaseListener) do{ try self.phaseEngine.start(); } catch { print("Could not start PHASE engine") } audioAsset = loadAudioAsset() // Create sound Source // Sphere soundSourcePosition.translate(z:3.0) let sphere = MDLMesh.newEllipsoid(withRadii: vector_float3(0.1,0.1,0.1), radialSegments: 14, verticalSegments: 14, geometryType: MDLGeometryType.triangles, inwardNormals: false, hemisphere: false, allocator: nil) let shape = PHASEShape(engine: phaseEngine, mesh: sphere) soundSource = PHASESource(engine: phaseEngine, shapes: [shape]) soundSource.transform = soundSourcePosition print(soundSourcePosition) do { try phaseEngine.rootObject.addChild(soundSource) } catch { print ("Failed to add a child object to the scene.") } let simpleModel = PHASEGeometricSpreadingDistanceModelParameters() simpleModel.rolloffFactor = rolloffFactor soundPipeline.distanceModelParameters = simpleModel let samplerNode = PHASESamplerNodeDefinition( soundAssetIdentifier: audioAsset.identifier, mixerDefinition: soundPipeline, identifier: audioAsset.identifier + "_SamplerNode") samplerNode.playbackMode = .looping do {soundEventAsset = try phaseEngine.assetRegistry.registerSoundEventAsset( rootNode: samplerNode, identifier: audioAsset.identifier + "_SoundEventAsset") } catch { print("Failed to register a sound event asset.") soundEventAsset = nil } } //Playing sound func playSound(){ // Fire new sound event with currently set properties guard let soundEventAsset else { return } params.addSpatialMixerParameters( identifier: soundPipeline.identifier, source: soundSource, listener: phaseListener) let soundEvent = try! PHASESoundEvent(engine: phaseEngine, assetIdentifier: soundEventAsset.identifier, mixerParameters: params) soundEvent.start(completion: nil) } ... } Also worth mentioning might be that I only own personal team account
4
0
1.2k
Nov ’25
Correct way for an Audio Unit v3 to return fewer than requested number of samples given a buffer
I have an AUv3 plugin which uses an FFT - which requires n samples before it can produce any output - so, depending on the relation between the host's buffer size and the FFT window size, it may receive a several buffers of samples, producing no output, and then dumping out what it has once a sufficient number of samples have been received. This means that output is produced in fits and starts, in batches that match the FFT size (modulo oversampling) - e.g. if being fed buffers of 256 samples with an fft size of 1024, the output buffer sizes will be 0 for the first 3 buffers, and upon the fourth, the first 256 processed samples are returned and the remaining 768 cached; the next three buffers will return the remaining cached samples while processing and buffering subsequent ones, and so forth. The internal mechanics of that I have solved, caching output if the current output buffer is too small, and so forth - so it all works as advertised, and the plugin reports its latency correctly. And when run as an app in demo-mode, playback works as expected. In the plugin's render block, it captures the number of frames written, and if it is less than the number of frames passed in, adjusts the mDataByteSize of the output buffers to match the actual quantity of data being returned: unsigned int framesWritten = (unsigned int) processHelper->processWithEvents(inAudioBufferList, outAudioBufferList, timestamp, frameCount, realtimeEventListHead); if (framesWritten < frameCount) { for (UInt32 i = 0; i < outAudioBufferList->mNumberBuffers; ++i) { outAudioBufferList->mBuffers[i].mDataByteSize = framesWritten * 4; // assume 4 byte floats } } However, there are a couple of serious issues: auval -v fails it with - Render Test at 64 frames, sample rate: 22050 Hz ERROR: Output Buffer Size does not match requested When connected to Logic Pro, it appears that mDataByteSize is ignored, and the entire allocated buffer is read - audio has sections of silence snipped into it which corresponds the number of empty buffers being returned If I set Logic's buffer size to 1024 and use a 1024 sample FFT window, the plugin works correctly - but of course a plugin cannot dictate buffer size, and `1024 is too small a window size to be useful for anything but filtering very high frequencies This seems like it has to be a solvable problem, and most likely the issue is in how my code reports the number of usable samples in the returned buffer. So, what is the correct way for a plugin to report that it has no samples to return, but will, uh, real soon now? I know I could convert this plugin to be one that does offline rendering of the entire input, but this is real-time processing, just with a fixed amount of latency, so that should not be necessary.
0
0
517
Nov ’25
SpeechTranscriber supported Devices
I have the new iOS 26 SpeechTranscriber working in my application. The issue I am facing is how to determine if the device I am running on supports SpeechTranscriber. I was able to create code that tests if the device supports transcription but it takes a bit of time to run and thus the results are not available when the app launches. What I am looking for is a list of what iOS 26 devices it doesn't run on. I think its safe to assume any new devices will support it so if we can just have a list of what devices that can run iOS 26 and not able to do transcription it would be much faster for the app. I have determined it doesn't work on a SE 2nd Gen, it works on iPhone 12, SE 3rd Gen, iPhone 14 Pro, 15 Pro. As the SpeechTranscriber doesn't work in the simulator I can't determine that way. I have checked the docs and it doesn't list the devices it doesn't work on.
1
0
573
Nov ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
0
0
586
Jul ’25
Mixing ScreenCaptureKit audio with microphone audio
Hi, I'm new to AVAudioEngine(and macOS programming in general). I'm trying to mix microphone audio with ScreenCaptureKit audio using AVAudioEngine without playing it back. I've created a AVAudioPlayerNode and scheduling buffers in my SCStream handler: playerNode.scheduleBuffer(samples) and have connected the playerNode to the mainMixerNode. audioEngine.connect(audioEngine.inputNode, to: audioEngine.mainMixerNode, format: micFormat) audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format) The problem is that mainMixerNode plays the audio to the speaker creating a feedback loop. How can I prevent the mixer output from being played back. Also: Is this the best way of mixing microphone input with some other input? I ran into AVAudioEngine's manual rendering mode, which seems like the way to go for mixing audio without playing it back. However, I couldn't figure out how to connect microphone input to the AVAudioEngine in manual rendering mode?
1
0
1.2k
Mar ’26
What is the best approach to multi-channel, per-channel volume control.
I've got a setup using AVAudioEngine with several tone generator nodes, each with a chain of processing nodes, the chains then mixed into the main output. Generator ➡️ Effect ➡️... ➡️ .mainMixerNode ➡️ .outputNode). Generator ➡️ Effect ➡️... ⤴️ ... Generator ➡️ Effect ➡️... ⤴️ The user should be able to mute any chain individually. I've found several potential approaches to muting, but not terribly happy with any of them. Adjust the amplitudes directly in my tone generators. Issue: Consumes CPU even when completely muted. 4 generators adds ~15% cpu, even when all chains are muted. Detach/attach chains that are muted/unmuted. Issue: Causes loud clicking/popping sounds whenever muted/unmuted. Fade mixer output volume while detaching/attaching a chain (just cutting the volume immediately to 0 doesn't get rid of the clicking/popping). Issue: Causes all channels to fade during the transition, so not ideal. The rest of these ideas are variations on making volume control+detatch/attach work for individual chains, since approach #3 worked well. Add an AVAudioMixer to the end of each chain (just for volume control). Issue: Only the mixer on the final chain functions -- the others block all output. Not sure what's going on there. Use matrix mixer (for multi-input volume control). Plus detach/attach to reduce CPU if necessary. Not yet attempted, due to perceived complexity and reports of fragility in order of wiring in. A bunch of effort before I even know if it's going to work. Develop my own fader node to put on the end of each channel. Unlike the tone generator (simple AVSourceNode), developing an effect node seems complex and time consuming. Might not even fix CPU use. I'm not completely averse to the learning curve of either 5 or 6, but would rather get some guidance on best approach before diving in. They both seem likely to take more effort than I'd like for the simple behavior I'm trying to achieve.
0
0
451
Jul ’25
Logic Pro - discover channel upstream latency
Hello everyone, I've written an audio unit plugin that needs to be aware of any upstream latency caused by heavy plugins before it on the channel. Is there any way to query this? I know that Logic applies PDC at the channel's output (summing point), but I need to know what the accumulated latency is at the point the audio enters my plugin. Thanks!
0
0
359
Jan ’26
ShazamKit Background Operation Broken on iOS 18 - SHManagedSession Stops Working After ~20 Seconds
Your draft looks great! Here's a refined version with the iOS 17 comparison emphasized and slightly better flow: Hi Apple Engineers and fellow developers, I'm experiencing a critical regression with ShazamKit's background operation on iOS 18. ShazamKit's SHManagedSession stops identifying songs in the background after approximately 20 seconds on iOS 18, while the exact same code works perfectly on iOS 17. The behavior is consistent: the app works perfectly in the foreground, but when backgrounded or device is locked, it initially works for about 20 seconds then stops identifying new songs. The microphone indicator remains active suggesting audio access is maintained, but ShazamKit doesn't send identified songs in the background until you open the app again. Detection immediately resumes when bringing the app to foreground. My technical setup uses SHManagedSession for continuous matching with background modes properly configured in Info.plist including audio mode, and Background App Refresh enabled. I've tested this on physical devices running iOS 18.0 through 18.5 with the same results across all versions. The exact same code running on iOS 17 devices works flawlessly in the background. To reproduce: initialize SHManagedSession and start matching, begin song identification in foreground, background the app or lock device, play different songs which are initially detected for about 20 seconds, then after the timeout period new songs are no longer identified until you bring the app to foreground. This regression has impacted my production app as users who rely on continuous background music identification are experiencing a broken feature. I submitted this as Feedback ID FB15255903 last September with no solution so far. I've created a minimal demo project that reproduces this issue: https://github.com/tfmart/ShazamKitBackground Has anyone else experienced this ShazamKit background regression on iOS 18? Are there any known workarounds or alternative approaches? Given the time this issue has persisted, could we please get acknowledgment of this regression, expected timeline for a fix, or any recommended workarounds? Testing environment is Xcode 16.0+ on iOS 18.0-18.5 across multiple physical device models. Any guidance would be greatly appreciated.
1
0
417
Jan ’26
Hybrid Wired-to-Wireless Audio Mode Using AirPods Charging Case
Many Apple users own both Bluetooth earphones (AirPods) and traditional wired earphones. While Bluetooth audio provides freedom of movement, some users still prefer wired earphones for comfort, sound profile, or personal preference. However, plugging wired earphones directly into an iPhone can feel restrictive and inconvenient during daily use. This proposal suggests a hybrid audio approach where wired earphones can be connected to a Bluetooth-enabled AirPods charging case (or a similar Apple-designed module), allowing users to enjoy wired earphones without a physical connection to the iPhone. #Problem Statement *Wired earphones offer consistent audio quality and zero latency *Bluetooth earphones provide freedom from cables *Users must currently choose one or the other *Plugging wired earphones into an iPhone limits movement and can feel intrusive in daily scenarios (walking, commuting, working) There is no native Apple solution that allows wired earphones to function wirelessly while maintaining Apple’s audio experience standards. #Proposed Solution Introduce a Wired-to-Wireless Audio Mode through the AirPods charging case or a dedicated Apple Bluetooth audio bridge. How it works: User plugs wired earphones into the AirPods case (or a future AirPods accessory port) The case acts as a Bluetooth audio transmitter Audio is streamed wirelessly from iPhone to the case The case outputs audio to the wired earphones #User experiences: No cable connected to the iPhone Familiar wired earphone sound Freedom of movement similar to Bluetooth earbuds User Experience (UX Flow) Plug wired earphones into the AirPods case iPhone automatically detects: “Wired Earphones via AirPods Case” Seamless pairing using existing AirPods framework Audio controls, volume, and switching handled through iOS No additional apps required #Key Benefits Combines wired sound reliability with wireless convenience Reduces physical cable disturbance during use Extends usefulness of existing wired earphones Minimal learning curve for users Fits naturally into Apple’s ecosystem and design philosophy #Privacy & Performance Considerations On-device audio processing only No cloud involvement Low-latency audio using Apple’s proprietary Bluetooth codecs Power-efficient usage leveraging AirPods case battery #Target Users Users who prefer wired earphones but want wireless freedom Commuters and walkers Developers and professionals who multitask Users sensitive to Bluetooth earbud fit or comfort #Ecosystem Fit Builds on existing AirPods pairing and audio stack Aligns with Apple’s focus on seamless UX Could be implemented via: New AirPods hardware Firmware update + accessory Dedicated Apple audio bridge
1
0
337
Jan ’26
AVAudioEngine Voice Processing Fails with Mismatched Input/Output Devices: AggregateDevice Channel Count Mismatch
I'm encountering errors while using AVAudioEngine with voice processing enabled (setVoiceProcessingEnabled(true)) in scenarios where the input and output audio devices are not the same. This issue arises specifically with mismatched devices, preventing the application from functioning as expected. Works: Paired devices (e.g., MacBook Pro mic → MacBook Pro speakers) Fails: Mismatched devices (e.g., AirPods mic → MacBook Pro speakers) When using paired input and output devices: The setup works as expected. Example: MacBook Pro microphone → MacBook Pro speakers. When using mismatched devices: AVAudioEngine setup fails during aggregate device construction. Example: AirPods microphone → MacBook Pro speakers. Error logs indicate a channel count mismatch. Here are the partial logs. Due to the content limit, I cannot post the entire logs. AUVPAggregate.cpp:1000 client-side input and output formats do not match (err=-10875) AUVPAggregate.cpp:1036 err=-10875 AVAEInternal.h:109 [AVAudioEngineGraph.mm:1344:Initialize: (err = PerformCommand(*outputNode, kAUInitialize, NULL, 0)): error -10875 AggregateDevice.mm:329 Failed expectation of constructed aggregate (312): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (312): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (336): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (336): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AudioHardware-mac-imp.cpp:3484 AudioDeviceSetProperty: no device with given ID AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (348): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (348): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) Is it possible to use voice processing with different input/output devices? If yes, are there any specific configurations required to handle mismatched devices? How can we resolve channel count mismatch errors during aggregate device construction? Are there settings or API adjustments to enforce compatibility between input/output devices? Are there any workarounds or alternative approaches to achieve voice processing functionality with mismatched devices? For instance, can we force an intermediate channel configuration or downmix input/output formats?
0
0
360
Dec ’25
coreaudio-api mailing list search broken
Hello, The search functionality of the coreaudio-api mailing list archive has been broken for a very long time. Several of the lower-level audio APIs have only been discussed on this mailing list, making it critical for those of us maintaining old audio code. Steps to reproduce: Open https://lists.apple.com/archives/list/coreaudio-api@lists.apple.com/ in your web browser. Enter a search term in the "Search this list" field in the top-right corner of the page. The search will eventually time out with "502 Bad Gateway" Can somebody please forward this information to the current maintainer? I've tried to contact developer support but they weren't sure what to do. Thanks!
1
0
218
Feb ’26
Essentials of macOS to read and write mp3 and mp4 audio files
Hi, On macOS I used to open MP3 and MP4 files with ExtAudioFile. For a few years it doesn't work anymore. So I decided to try different macOS API using the AudioFileID of AudioToolbox framework. I decided to write a test: https://gist.github.com/joelkraehemann/7f5b241b52ca38c3a765c138fb647588 It fails right here: AudioFileOpenWithCallbacks() By telling OSStatus error 1954115647, which means kAudioFileUnsupportedFileTypeError. The filename was set to an MP4 file: ~/Music/test.mp4 Howto fix this? regards, Joël
1
0
648
Jun ’25
Live Translations on VOIP on iOS26
Hi team, With regards to Call (Live) Translations on VOIP: Is it possible to invoke live translations within the app? (without going into the Call System UI) Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly) Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
1
0
180
Aug ’25
Question about Apple Vision Pro audio input sampling rate for research
I am a graduate student conducting research in speech/audio signal processing and multimodal interaction. Apple Vision Pro is widely recognized as a multimodal interactive system supporting voice, eye, and gesture inputs. However, I could not find detailed specifications or documentation about the audio input sampling rate used by the device’s built-in microphone array when capturing user audio. Specifically, I would like to understand: What is the default audio input sampling rate (e.g., 16 kHz, 44.1 kHz, 48 kHz, etc.) for the Vision Pro’s microphones? When developing with visionOS / AVAudioSession / AVAudioEngine, is there a documented or recommended sampling rate for audio capture? Are there any best practices or settings for enabling high-quality voice capture on Vision Pro (especially for voice research tasks)? For context, my work involves voice processing, analysis, and possibly on-device real-time speech recognition. Any pointers to relevant APIs, documentation or examples (especially regarding audio capture buffer size or available formats on visionOS) would be very helpful. Thank you in advance! Best regards.
0
0
202
Jan ’26
watchOS longFormAudio cannot de active
My workout watch app supports audio playback during exercise sessions. When users carry both Apple Watch, iPhone, and AirPods, with AirPods connected to the iPhone, I want to route audio from Apple Watch to AirPods for playback. I've implemented this functionality using the following code. try? session.setCategory(.playback, mode: .default, policy: .longFormAudio, options: []) try await session.activate() When users are playing music on iPhone and trigger my code in the watch app, Apple Watch correctly guides users to select AirPods, pauses the iPhone's music, and plays my audio. However, when playback finishes and I end the session using the code below: try session.setActive(false, options:[.notifyOthersOnDeactivation]) the iPhone doesn't automatically resume the previously interrupted music playback—it requires manual intervention. Is this expected behavior, or am I missing other important steps in my code?
Replies
1
Boosts
0
Views
350
Activity
Nov ’25
Spatial Audio on iOS 18 don't work as inteneded
I’m facing a problem while trying to achieve spatial audio effects in my iOS 18 app. I have tried several approaches to get good 3D audio, but the effect never felt good enough or it didn’t work at all. Also what mostly troubles me is I noticed that AirPods I have doesn’t recognize my app as one having spatial audio (in audio settings it shows "Spatial Audio Not Playing"). So i guess my app doesn't use spatial audio potential. First approach uses AVAudioEnviromentNode with AVAudioEngine. Chaining position of player as well as changing listener’s doesn’t seem to change anything in how audio plays. Here's simple how i initialize AVAudioEngine import Foundation import AVFoundation class AudioManager: ObservableObject { // important class variables var audioEngine: AVAudioEngine! var environmentNode: AVAudioEnvironmentNode! var playerNode: AVAudioPlayerNode! var audioFile: AVAudioFile? ... //Sound set up func setupAudio() { do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } audioEngine = AVAudioEngine() environmentNode = AVAudioEnvironmentNode() playerNode = AVAudioPlayerNode() audioEngine.attach(environmentNode) audioEngine.attach(playerNode) audioEngine.connect(playerNode, to: environmentNode, format: nil) audioEngine.connect(environmentNode, to: audioEngine.mainMixerNode, format: nil) environmentNode.listenerPosition = AVAudio3DPoint(x: 0, y: 0, z: 0) environmentNode.listenerAngularOrientation = AVAudio3DAngularOrientation(yaw: 0, pitch: 0, roll: 0) environmentNode.distanceAttenuationParameters.referenceDistance = 1.0 environmentNode.distanceAttenuationParameters.maximumDistance = 100.0 environmentNode.distanceAttenuationParameters.rolloffFactor = 2.0 // example.mp3 is mono sound guard let audioURL = Bundle.main.url(forResource: "example", withExtension: "mp3") else { print("Audio file not found") return } do { audioFile = try AVAudioFile(forReading: audioURL) } catch { print("Failed to load audio file: \(error)") } } ... //Playing sound func playSpatialAudio(pan: Float ) { guard let audioFile = audioFile else { return } // left side playerNode.position = AVAudio3DPoint(x: pan, y: 0, z: 0) playerNode.scheduleFile(audioFile, at: nil, completionHandler: nil) do { try audioEngine.start() playerNode.play() } catch { print("Failed to start audio engine: \(error)") } ... } Second more complex approach using PHASE did better. I’ve made an exemplary app that allows players to move audio player in 3D space. I have added reverb, and sliders changing audio position up to 10 meters each direction from listener but audio seems to only really change left to right (x axis) - again I think it might be trouble with the app not being recognized as spatial. //Crucial class Variables: class PHASEAudioController: ObservableObject{ private var soundSourcePosition: simd_float4x4 = matrix_identity_float4x4 private var audioAsset: PHASESoundAsset! private let phaseEngine: PHASEEngine private let params = PHASEMixerParameters() private var soundSource: PHASESource private var phaseListener: PHASEListener! private var soundEventAsset: PHASESoundEventNodeAsset? // Initialization of PHASE init{ do { let session = AVAudioSession.sharedInstance() try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true) } catch { print("Failed to configure AVAudioSession: \(error.localizedDescription)") } // Init PHASE Engine phaseEngine = PHASEEngine(updateMode: .automatic) phaseEngine.defaultReverbPreset = .mediumHall phaseEngine.outputSpatializationMode = .automatic //nothing helps // Set listener position to (0,0,0) in World space let origin: simd_float4x4 = matrix_identity_float4x4 phaseListener = PHASEListener(engine: phaseEngine) phaseListener.transform = origin phaseListener.automaticHeadTrackingFlags = .orientation try! self.phaseEngine.rootObject.addChild(self.phaseListener) do{ try self.phaseEngine.start(); } catch { print("Could not start PHASE engine") } audioAsset = loadAudioAsset() // Create sound Source // Sphere soundSourcePosition.translate(z:3.0) let sphere = MDLMesh.newEllipsoid(withRadii: vector_float3(0.1,0.1,0.1), radialSegments: 14, verticalSegments: 14, geometryType: MDLGeometryType.triangles, inwardNormals: false, hemisphere: false, allocator: nil) let shape = PHASEShape(engine: phaseEngine, mesh: sphere) soundSource = PHASESource(engine: phaseEngine, shapes: [shape]) soundSource.transform = soundSourcePosition print(soundSourcePosition) do { try phaseEngine.rootObject.addChild(soundSource) } catch { print ("Failed to add a child object to the scene.") } let simpleModel = PHASEGeometricSpreadingDistanceModelParameters() simpleModel.rolloffFactor = rolloffFactor soundPipeline.distanceModelParameters = simpleModel let samplerNode = PHASESamplerNodeDefinition( soundAssetIdentifier: audioAsset.identifier, mixerDefinition: soundPipeline, identifier: audioAsset.identifier + "_SamplerNode") samplerNode.playbackMode = .looping do {soundEventAsset = try phaseEngine.assetRegistry.registerSoundEventAsset( rootNode: samplerNode, identifier: audioAsset.identifier + "_SoundEventAsset") } catch { print("Failed to register a sound event asset.") soundEventAsset = nil } } //Playing sound func playSound(){ // Fire new sound event with currently set properties guard let soundEventAsset else { return } params.addSpatialMixerParameters( identifier: soundPipeline.identifier, source: soundSource, listener: phaseListener) let soundEvent = try! PHASESoundEvent(engine: phaseEngine, assetIdentifier: soundEventAsset.identifier, mixerParameters: params) soundEvent.start(completion: nil) } ... } Also worth mentioning might be that I only own personal team account
Replies
4
Boosts
0
Views
1.2k
Activity
Nov ’25
Correct way for an Audio Unit v3 to return fewer than requested number of samples given a buffer
I have an AUv3 plugin which uses an FFT - which requires n samples before it can produce any output - so, depending on the relation between the host's buffer size and the FFT window size, it may receive a several buffers of samples, producing no output, and then dumping out what it has once a sufficient number of samples have been received. This means that output is produced in fits and starts, in batches that match the FFT size (modulo oversampling) - e.g. if being fed buffers of 256 samples with an fft size of 1024, the output buffer sizes will be 0 for the first 3 buffers, and upon the fourth, the first 256 processed samples are returned and the remaining 768 cached; the next three buffers will return the remaining cached samples while processing and buffering subsequent ones, and so forth. The internal mechanics of that I have solved, caching output if the current output buffer is too small, and so forth - so it all works as advertised, and the plugin reports its latency correctly. And when run as an app in demo-mode, playback works as expected. In the plugin's render block, it captures the number of frames written, and if it is less than the number of frames passed in, adjusts the mDataByteSize of the output buffers to match the actual quantity of data being returned: unsigned int framesWritten = (unsigned int) processHelper->processWithEvents(inAudioBufferList, outAudioBufferList, timestamp, frameCount, realtimeEventListHead); if (framesWritten < frameCount) { for (UInt32 i = 0; i < outAudioBufferList->mNumberBuffers; ++i) { outAudioBufferList->mBuffers[i].mDataByteSize = framesWritten * 4; // assume 4 byte floats } } However, there are a couple of serious issues: auval -v fails it with - Render Test at 64 frames, sample rate: 22050 Hz ERROR: Output Buffer Size does not match requested When connected to Logic Pro, it appears that mDataByteSize is ignored, and the entire allocated buffer is read - audio has sections of silence snipped into it which corresponds the number of empty buffers being returned If I set Logic's buffer size to 1024 and use a 1024 sample FFT window, the plugin works correctly - but of course a plugin cannot dictate buffer size, and `1024 is too small a window size to be useful for anything but filtering very high frequencies This seems like it has to be a solvable problem, and most likely the issue is in how my code reports the number of usable samples in the returned buffer. So, what is the correct way for a plugin to report that it has no samples to return, but will, uh, real soon now? I know I could convert this plugin to be one that does offline rendering of the entire input, but this is real-time processing, just with a fixed amount of latency, so that should not be necessary.
Replies
0
Boosts
0
Views
517
Activity
Nov ’25
Apple Music for DJ App
Hi there, I recently launched a dj app to the mac app store, and was wondering how I could access songs for mixing purposes via Apple Music just like how serato, rekordbox, djay, and other DJ apps do? Thanks, Gunek
Replies
0
Boosts
0
Views
544
Activity
Nov ’25
Why is MusicKit ApplicationMusicPlayer not available on watchOS?
ApplicationMusicPlayer is not available on watchOS but all other platforms. Is there a technical reason for that like battery life? Same goes for SystemMusicPlayer and MPMusicPlayerController. I already filed feedbacks for that.
Replies
0
Boosts
0
Views
120
Activity
May ’25
SpeechTranscriber supported Devices
I have the new iOS 26 SpeechTranscriber working in my application. The issue I am facing is how to determine if the device I am running on supports SpeechTranscriber. I was able to create code that tests if the device supports transcription but it takes a bit of time to run and thus the results are not available when the app launches. What I am looking for is a list of what iOS 26 devices it doesn't run on. I think its safe to assume any new devices will support it so if we can just have a list of what devices that can run iOS 26 and not able to do transcription it would be much faster for the app. I have determined it doesn't work on a SE 2nd Gen, it works on iPhone 12, SE 3rd Gen, iPhone 14 Pro, 15 Pro. As the SpeechTranscriber doesn't work in the simulator I can't determine that way. I have checked the docs and it doesn't list the devices it doesn't work on.
Replies
1
Boosts
0
Views
573
Activity
Nov ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
Replies
0
Boosts
0
Views
586
Activity
Jul ’25
Mixing ScreenCaptureKit audio with microphone audio
Hi, I'm new to AVAudioEngine(and macOS programming in general). I'm trying to mix microphone audio with ScreenCaptureKit audio using AVAudioEngine without playing it back. I've created a AVAudioPlayerNode and scheduling buffers in my SCStream handler: playerNode.scheduleBuffer(samples) and have connected the playerNode to the mainMixerNode. audioEngine.connect(audioEngine.inputNode, to: audioEngine.mainMixerNode, format: micFormat) audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format) The problem is that mainMixerNode plays the audio to the speaker creating a feedback loop. How can I prevent the mixer output from being played back. Also: Is this the best way of mixing microphone input with some other input? I ran into AVAudioEngine's manual rendering mode, which seems like the way to go for mixing audio without playing it back. However, I couldn't figure out how to connect microphone input to the AVAudioEngine in manual rendering mode?
Replies
1
Boosts
0
Views
1.2k
Activity
Mar ’26
Audio Unit MIDI Plugin documentation
Hi folks - I'm having trouble finding specific documentation about Audio Unit MIDI plugins - as in MIDI -only. Any suggestions welcome as searches aren't returning much. (too niche? user error?)
Replies
0
Boosts
0
Views
153
Activity
Dec ’25
What is the best approach to multi-channel, per-channel volume control.
I've got a setup using AVAudioEngine with several tone generator nodes, each with a chain of processing nodes, the chains then mixed into the main output. Generator ➡️ Effect ➡️... ➡️ .mainMixerNode ➡️ .outputNode). Generator ➡️ Effect ➡️... ⤴️ ... Generator ➡️ Effect ➡️... ⤴️ The user should be able to mute any chain individually. I've found several potential approaches to muting, but not terribly happy with any of them. Adjust the amplitudes directly in my tone generators. Issue: Consumes CPU even when completely muted. 4 generators adds ~15% cpu, even when all chains are muted. Detach/attach chains that are muted/unmuted. Issue: Causes loud clicking/popping sounds whenever muted/unmuted. Fade mixer output volume while detaching/attaching a chain (just cutting the volume immediately to 0 doesn't get rid of the clicking/popping). Issue: Causes all channels to fade during the transition, so not ideal. The rest of these ideas are variations on making volume control+detatch/attach work for individual chains, since approach #3 worked well. Add an AVAudioMixer to the end of each chain (just for volume control). Issue: Only the mixer on the final chain functions -- the others block all output. Not sure what's going on there. Use matrix mixer (for multi-input volume control). Plus detach/attach to reduce CPU if necessary. Not yet attempted, due to perceived complexity and reports of fragility in order of wiring in. A bunch of effort before I even know if it's going to work. Develop my own fader node to put on the end of each channel. Unlike the tone generator (simple AVSourceNode), developing an effect node seems complex and time consuming. Might not even fix CPU use. I'm not completely averse to the learning curve of either 5 or 6, but would rather get some guidance on best approach before diving in. They both seem likely to take more effort than I'd like for the simple behavior I'm trying to achieve.
Replies
0
Boosts
0
Views
451
Activity
Jul ’25
Logic Pro - discover channel upstream latency
Hello everyone, I've written an audio unit plugin that needs to be aware of any upstream latency caused by heavy plugins before it on the channel. Is there any way to query this? I know that Logic applies PDC at the channel's output (summing point), but I need to know what the accumulated latency is at the point the audio enters my plugin. Thanks!
Replies
0
Boosts
0
Views
359
Activity
Jan ’26
ShazamKit Background Operation Broken on iOS 18 - SHManagedSession Stops Working After ~20 Seconds
Your draft looks great! Here's a refined version with the iOS 17 comparison emphasized and slightly better flow: Hi Apple Engineers and fellow developers, I'm experiencing a critical regression with ShazamKit's background operation on iOS 18. ShazamKit's SHManagedSession stops identifying songs in the background after approximately 20 seconds on iOS 18, while the exact same code works perfectly on iOS 17. The behavior is consistent: the app works perfectly in the foreground, but when backgrounded or device is locked, it initially works for about 20 seconds then stops identifying new songs. The microphone indicator remains active suggesting audio access is maintained, but ShazamKit doesn't send identified songs in the background until you open the app again. Detection immediately resumes when bringing the app to foreground. My technical setup uses SHManagedSession for continuous matching with background modes properly configured in Info.plist including audio mode, and Background App Refresh enabled. I've tested this on physical devices running iOS 18.0 through 18.5 with the same results across all versions. The exact same code running on iOS 17 devices works flawlessly in the background. To reproduce: initialize SHManagedSession and start matching, begin song identification in foreground, background the app or lock device, play different songs which are initially detected for about 20 seconds, then after the timeout period new songs are no longer identified until you bring the app to foreground. This regression has impacted my production app as users who rely on continuous background music identification are experiencing a broken feature. I submitted this as Feedback ID FB15255903 last September with no solution so far. I've created a minimal demo project that reproduces this issue: https://github.com/tfmart/ShazamKitBackground Has anyone else experienced this ShazamKit background regression on iOS 18? Are there any known workarounds or alternative approaches? Given the time this issue has persisted, could we please get acknowledgment of this regression, expected timeline for a fix, or any recommended workarounds? Testing environment is Xcode 16.0+ on iOS 18.0-18.5 across multiple physical device models. Any guidance would be greatly appreciated.
Replies
1
Boosts
0
Views
417
Activity
Jan ’26
Hybrid Wired-to-Wireless Audio Mode Using AirPods Charging Case
Many Apple users own both Bluetooth earphones (AirPods) and traditional wired earphones. While Bluetooth audio provides freedom of movement, some users still prefer wired earphones for comfort, sound profile, or personal preference. However, plugging wired earphones directly into an iPhone can feel restrictive and inconvenient during daily use. This proposal suggests a hybrid audio approach where wired earphones can be connected to a Bluetooth-enabled AirPods charging case (or a similar Apple-designed module), allowing users to enjoy wired earphones without a physical connection to the iPhone. #Problem Statement *Wired earphones offer consistent audio quality and zero latency *Bluetooth earphones provide freedom from cables *Users must currently choose one or the other *Plugging wired earphones into an iPhone limits movement and can feel intrusive in daily scenarios (walking, commuting, working) There is no native Apple solution that allows wired earphones to function wirelessly while maintaining Apple’s audio experience standards. #Proposed Solution Introduce a Wired-to-Wireless Audio Mode through the AirPods charging case or a dedicated Apple Bluetooth audio bridge. How it works: User plugs wired earphones into the AirPods case (or a future AirPods accessory port) The case acts as a Bluetooth audio transmitter Audio is streamed wirelessly from iPhone to the case The case outputs audio to the wired earphones #User experiences: No cable connected to the iPhone Familiar wired earphone sound Freedom of movement similar to Bluetooth earbuds User Experience (UX Flow) Plug wired earphones into the AirPods case iPhone automatically detects: “Wired Earphones via AirPods Case” Seamless pairing using existing AirPods framework Audio controls, volume, and switching handled through iOS No additional apps required #Key Benefits Combines wired sound reliability with wireless convenience Reduces physical cable disturbance during use Extends usefulness of existing wired earphones Minimal learning curve for users Fits naturally into Apple’s ecosystem and design philosophy #Privacy & Performance Considerations On-device audio processing only No cloud involvement Low-latency audio using Apple’s proprietary Bluetooth codecs Power-efficient usage leveraging AirPods case battery #Target Users Users who prefer wired earphones but want wireless freedom Commuters and walkers Developers and professionals who multitask Users sensitive to Bluetooth earbud fit or comfort #Ecosystem Fit Builds on existing AirPods pairing and audio stack Aligns with Apple’s focus on seamless UX Could be implemented via: New AirPods hardware Firmware update + accessory Dedicated Apple audio bridge
Replies
1
Boosts
0
Views
337
Activity
Jan ’26
AVAudioEngine Voice Processing Fails with Mismatched Input/Output Devices: AggregateDevice Channel Count Mismatch
I'm encountering errors while using AVAudioEngine with voice processing enabled (setVoiceProcessingEnabled(true)) in scenarios where the input and output audio devices are not the same. This issue arises specifically with mismatched devices, preventing the application from functioning as expected. Works: Paired devices (e.g., MacBook Pro mic → MacBook Pro speakers) Fails: Mismatched devices (e.g., AirPods mic → MacBook Pro speakers) When using paired input and output devices: The setup works as expected. Example: MacBook Pro microphone → MacBook Pro speakers. When using mismatched devices: AVAudioEngine setup fails during aggregate device construction. Example: AirPods microphone → MacBook Pro speakers. Error logs indicate a channel count mismatch. Here are the partial logs. Due to the content limit, I cannot post the entire logs. AUVPAggregate.cpp:1000 client-side input and output formats do not match (err=-10875) AUVPAggregate.cpp:1036 err=-10875 AVAEInternal.h:109 [AVAudioEngineGraph.mm:1344:Initialize: (err = PerformCommand(*outputNode, kAUInitialize, NULL, 0)): error -10875 AggregateDevice.mm:329 Failed expectation of constructed aggregate (312): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (312): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (336): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (336): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AudioHardware-mac-imp.cpp:3484 AudioDeviceSetProperty: no device with given ID AUHAL.cpp:1782 ca_verify_noerr: [AudioDeviceSetProperty(mDeviceID, NULL, 0, isInput, kAudioDevicePropertyIOProcStreamUsage, theSize, theStreamUsage), 560227702] AggregateDevice.mm:182 error fetching default pair AggregateDevice.mm:329 Failed expectation of constructed aggregate (348): mInput.streamChannelCounts == inputStreamChannelCounts AggregateDevice.mm:331 Failed expectation of constructed aggregate (348): mInput.totalChannelCount == std::accumulate(inputStreamChannelCounts.begin(), inputStreamChannelCounts.end(), 0U) Is it possible to use voice processing with different input/output devices? If yes, are there any specific configurations required to handle mismatched devices? How can we resolve channel count mismatch errors during aggregate device construction? Are there settings or API adjustments to enforce compatibility between input/output devices? Are there any workarounds or alternative approaches to achieve voice processing functionality with mismatched devices? For instance, can we force an intermediate channel configuration or downmix input/output formats?
Replies
0
Boosts
0
Views
360
Activity
Dec ’25
coreaudio-api mailing list search broken
Hello, The search functionality of the coreaudio-api mailing list archive has been broken for a very long time. Several of the lower-level audio APIs have only been discussed on this mailing list, making it critical for those of us maintaining old audio code. Steps to reproduce: Open https://lists.apple.com/archives/list/coreaudio-api@lists.apple.com/ in your web browser. Enter a search term in the "Search this list" field in the top-right corner of the page. The search will eventually time out with "502 Bad Gateway" Can somebody please forward this information to the current maintainer? I've tried to contact developer support but they weren't sure what to do. Thanks!
Replies
1
Boosts
0
Views
218
Activity
Feb ’26
Essentials of macOS to read and write mp3 and mp4 audio files
Hi, On macOS I used to open MP3 and MP4 files with ExtAudioFile. For a few years it doesn't work anymore. So I decided to try different macOS API using the AudioFileID of AudioToolbox framework. I decided to write a test: https://gist.github.com/joelkraehemann/7f5b241b52ca38c3a765c138fb647588 It fails right here: AudioFileOpenWithCallbacks() By telling OSStatus error 1954115647, which means kAudioFileUnsupportedFileTypeError. The filename was set to an MP4 file: ~/Music/test.mp4 Howto fix this? regards, Joël
Replies
1
Boosts
0
Views
648
Activity
Jun ’25
Live Translations on VOIP on iOS26
Hi team, With regards to Call (Live) Translations on VOIP: Is it possible to invoke live translations within the app? (without going into the Call System UI) Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly) Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
Replies
1
Boosts
0
Views
180
Activity
Aug ’25
Question about Apple Vision Pro audio input sampling rate for research
I am a graduate student conducting research in speech/audio signal processing and multimodal interaction. Apple Vision Pro is widely recognized as a multimodal interactive system supporting voice, eye, and gesture inputs. However, I could not find detailed specifications or documentation about the audio input sampling rate used by the device’s built-in microphone array when capturing user audio. Specifically, I would like to understand: What is the default audio input sampling rate (e.g., 16 kHz, 44.1 kHz, 48 kHz, etc.) for the Vision Pro’s microphones? When developing with visionOS / AVAudioSession / AVAudioEngine, is there a documented or recommended sampling rate for audio capture? Are there any best practices or settings for enabling high-quality voice capture on Vision Pro (especially for voice research tasks)? For context, my work involves voice processing, analysis, and possibly on-device real-time speech recognition. Any pointers to relevant APIs, documentation or examples (especially regarding audio capture buffer size or available formats on visionOS) would be very helpful. Thank you in advance! Best regards.
Replies
0
Boosts
0
Views
202
Activity
Jan ’26
iOS 18 CarPlay: “There was a problem loading this content” error after playback
In iOS 18, CarPlay shows an error: “There was a problem loading this content” after playback starts. Audio works fine, but the Now Playing screen doesn’t load. I’m using MPPlayableContentManager. This worked fine in iOS 17. Anyone else seeing this error in iOS 18?
Replies
0
Boosts
0
Views
123
Activity
May ’25
how to use this api:AVAudioConverter?
I neet to take pcm data from aac data, but this api has fossy me deeply.
Replies
1
Boosts
0
Views
354
Activity
Jan ’26