Audio

How to record voice, auto-transcribe, translate (auto-detect input language), and play back translated audio on same device in iOS Swift?

Hi everyone 👋 I’m building an iOS app in Swift where I want to do the following:
- Record the user’s voice
- Transcribe the spoken sentence (speech-to-text)
- Auto-detect the spoken language
- Translate it to another language selected by the user (e.g., English → Spanish or Hindi → English)
- Speak back (text-to-speech) the translated text on the same device
Is it possible to record via the phone's microphone and play the transcribed voice through the headphones' audio output?

Media Technologies Audio Speech Siri and Voice Localization Live Text

0

228

Oct ’25

macOS Tahoe: Can't setup AVAudioEngine with playthrough

Hi, I'm trying to set up an AVAudioEngine for USB audio recording and monitoring playthrough. As soon as I try to set up playthrough I get an error in the console:

    AVAEInternal.h:83 required condition is false: [AVAudioEngineGraph.mm:1361:Initialize: (IsFormatSampleRateAndChannelCountValid(outputHWFormat))]

Any ideas how to fix it?

    // Set the input device
    try? setupInputDevice(deviceID: inputDevice)
    let input = audioEngine.inputNode
    // Force a stereo format
    let inputHWFormat = input.inputFormat(forBus: 0)
    let stereoFormat = AVAudioFormat(commonFormat: inputHWFormat.commonFormat,
                                     sampleRate: inputHWFormat.sampleRate,
                                     channels: 2,
                                     interleaved: inputHWFormat.isInterleaved)
    guard let format = stereoFormat else {
        throw AudioError.deviceSetupFailed(-1)
    }
    print("Input format: \(inputHWFormat)")
    print("Forced stereo format: \(format)")
    audioEngine.attach(monitorMixer)
    audioEngine.connect(input, to: monitorMixer, format: format)
    // MonitorMixer -> MainMixer (Output)
    // Problem here, format: format also breaks.
    audioEngine.connect(monitorMixer, to: audioEngine.mainMixerNode, format: nil)

Media Technologies Audio macOS Audio AVAudioNode

0

174

Oct ’25

Mac OS Tahoe 26.0 (25A354) Sound Glitches When opening the simulator app

Hey there, I just upgraded to macOS Tahoe on an Apple MacBook Pro (2019, 16-inch). I am using IntelliJ IDEA and Flutter to develop a mobile app, which I test on the Simulator app running iOS 18.4. The issue: when I start the Simulator app (both during the loading phase and while it is running), the audio from an already open YouTube tab in Safari glitches and turns into noise (this happens in Chrome as well). A fix I found online is to kill the audio daemon on macOS with the command "sudo killall coreaudiod". This kills the audio process while the Simulator is running; macOS then restarts the audio daemon, and audio works fine with the Simulator still open. Is there a permanent fix for this? Is Apple working on a fix for an upcoming update?

Media Technologies Audio Simulator

3

5

1.2k

Oct ’25

Handling AVAudioEngine Configuration Change

Hi all, I have been quite stumped on this behavior for a little while now, so I thought it best to share it here and see if someone with more experience with AVAudioEngine / AVAudioSession can weigh in. Right now I have an AVAudioEngine that I use for voice chat, handing it buffers to play. This works perfectly until route changes start to occur, which causes the AVAudioEngine to reset itself, which in turn stops all players attached to the engine. Once an AVAudioPlayerNode is stopped for this reason (or any other), all samples that were scheduled to be played are purged. Where this becomes confusing for me is that the completion handler gets called every time, regardless of whether the sound was actually played. Is there a reliable way to know if a sample needs to be rescheduled after a player has been reset? I am not quite sure what my observer of AVAudioEngineConfigurationChange needs to do in my case, as this engine only handles output. All input goes through a separate engine for simplicity. Currently I store a queue of samples as they are sent to the AVAudioPlayerNode for playback, and in the completion handler I check whether the player isPlaying. If it is playing, I assume the sound was actually played; if not, I leave it in the queue and assume that an observer on the route change or the configuration change will realize there are samples in the queue and reset them. Thanks for any feedback!

Media Technologies Audio Audio AVAudioSession AVAudioEngine AVFoundation

3

0

700

Oct ’25

Is AVAudioPCMFormatFloat32 required for playing a buffer with AVAudioEngine / AVAudioPlayerNode

I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVAudioPlayerNode / AVAudioEngine an exception is thrown: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868" (related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022). If I convert the buffer to AVAudioPCMFormatFloat32, playback works. My questions are:
1. Does AVAudioEngine / AVAudioPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application?
2. If 1 is YES, is this documented anywhere?
3. If 1 is YES, is this required format subject to change at any point?
Thanks! I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
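The Int16-to-Float32 conversion mentioned in the post as a workaround can be sketched in plain C. This is only an illustration of the sample math, assuming packed signed 16-bit samples (mono or interleaved); the function name `pcm_int16_to_float32` is hypothetical, and the sketch does not settle whether AVAudioEngine requires Float32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert signed 16-bit PCM samples to 32-bit float samples in [-1.0, 1.0).
 * `count` is the total number of samples (frames * channels), so the same
 * loop works for mono and for interleaved multichannel data.
 * The name and signature are illustrative, not part of any Apple API. */
void pcm_int16_to_float32(const int16_t *src, float *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    /* 1/32768 is a power of two, so the scaling is exact in float. */
    const float scale = 1.0f / 32768.0f;
    for (size_t i = 0; i < count; ++i) {
        dst[i] = (float)src[i] * scale;
    }
}
```

Dividing by 32768 rather than 32767 maps the most negative sample to exactly -1.0 and keeps the largest positive sample just below 1.0; other scaling conventions exist.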

Media Technologies Audio AVAudioNode AVAudioEngine AVFoundation

1

0

1k

Oct ’25

When to set AVAudioSession's preferredInput?

I want the audio session to always use the built-in microphone. However, when using the setPreferredInput() method like in this example

    private func enableBuiltInMic() {
        // Get the shared audio session.
        let session = AVAudioSession.sharedInstance()
        // Find the built-in microphone input.
        guard let availableInputs = session.availableInputs,
              let builtInMicInput = availableInputs.first(where: { $0.portType == .builtInMic }) else {
            print("The device must have a built-in microphone.")
            return
        }
        // Make the built-in microphone input the preferred input.
        do {
            try session.setPreferredInput(builtInMicInput)
        } catch {
            print("Unable to set the built-in mic as the preferred input.")
        }
    }

and calling that function once in the initializer, the audio session still switches to the external microphone once one is plugged in. The session's preferredInput is nil again at that point, even if the built-in microphone is still listed in availableInputs. So, why is the preferredInput suddenly reset? When would be the appropriate time to set the preferredInput again? Observing the session’s availableInputs did not work, and setting the preferredInput again in the routeChangeNotification handler seems like a bad choice as it’s already a bit too late then.

Media Technologies Audio AVAudioSession

1

0

837

Oct ’25

AVPlayerView with .inline controlsStyle macOS 26

My audio app shows a control bar at the bottom of the window. The controls show nicely, but there is a black "slab" appearing behind the inline controls, the same size as the playerView. Setting the player view background color does nothing:

    playerView.wantsLayer = true
    playerView.layer?.backgroundColor = NSColor.clear.cgColor

How can I clear the background? If I use .floating controlsStyle, I don't get the background "slab".

Media Technologies Audio

0

158

Oct ’25

AirPods with H2 and studio-quality recording - how to replicate Camera video capture

Using an iPhone 12 Pro running iOS 26.0.1, with AirPods Pro 3. The Camera app does capture video with what seems to be "Studio Quality Recording" (SQR). I am trying to replicate SQR in my own Camera-like app. While I can pull audio in from the APP3 mic, and my video capture app is recording a 48,000 Hz high-bitrate video, the audio still sounds non-SQR. I'm seeing bluetoothA2DP, bluetoothLE, and bluetoothHFP as portType, and I'm not sure if SQR depends on one of those. Is there sample code demonstrating an SQR capture? Never mind video and camera; even just audio? Also, I don't understand what SQR is doing between the APP3 and the iPhone. What codec is that? What bitrate is that? If I capture video using Capture and inspect the audio stream I see mono 74.14 kbit/s MPEG-4 AAC, 48000 Hz. But I assume that's been recompressed and is not really giving me any insight into the APP3 H2 transmission?

Media Technologies Audio IOBluetooth Core Bluetooth

1

0

133

Oct ’25

Failure on attempt to import track as spatial audio

I'm working on a project to support spatial audio editing, using this sample project as a reference: https://developer.apple.com/documentation/Cinematic/editing-spatial-audio-with-an-audio-mix This sample works well on an unedited capture, but does not work for a capture that has already been edited. The failure occurs at "let audioInfo = try await CNAssetSpatialAudioInfo(asset: myAsset)", which throws "no eligible audio tracks in asset". I also find that for already edited captures, CNAssetSpatialAudioInfo.assetContainsSpatialAudio returns false. What I mean by "already edited" is that if I take a spatial capture with my iPhone 16, then edit that capture in the Photos app using the Cinematic effect, and then save the edited output (e.g. edited_capture.mov), I can't import that edited_capture.mov into my project as a spatial audio asset. Is this intentional behavior or a bug? If it's intentional, can you describe why?

Media Technologies Audio

0

1

158

Sep ’25

Lock screen media controls for MusicKit/ ApplicationMusicPlayer

Hi, when using ApplicationMusicPlayer from MusicKit my app automatically gets the media controls on the lock screen: Play/Pause, Skip buttons, Playback Position, etc. I would like to customize these. I have tried a bunch of things, e.g. using MPRemoteCommandCenter, but so far I haven't had any success. Does anyone know how I can customize the media controls of ApplicationMusicPlayer? Thank you.

Media Technologies Audio MusicKit

2

0

502

Sep ’25

[iOS 26 bug] AVInputPickerInteraction selection immediately reverts on iOS 26

Hello everyone, I'm implementing the new AVInputPickerInteraction API on iOS 26 to allow users to select their microphone from a custom settings menu before recording. The implementation seems correct, but I'm encountering a strange issue where the input selection immediately reverts to the previous device.

The Situation:
- The picker is presented correctly via a manual call to .present().
- I can see all available inputs (e.g., "iPhone Microphone" and "AirPods").
- The current input is "iPhone Microphone".
- I tap on "AirPods".
- The UI updates to show "AirPods" as selected for a fraction of a second, then immediately jumps back to "iPhone Microphone".
- The same thing happens in reverse.
It seems like the system is automatically reverting the audio route change requested by the picker.

My Implementation: My setup follows the standard pattern discussed in the WWDC sessions.

Setup Code: This setup is performed once before the user can trigger the picker.

    @available(iOS 26.0, *)
    var inputPickerInteraction: AVInputPickerInteraction?

    // Note: The AVAudioSession is configured to .playAndRecord
    // and set to active elsewhere in the code before this setup is called.
    if #available(iOS 26.0, *) {
        // Setup the picker
        let picker = AVInputPickerInteraction()
        self.inputPickerInteraction = picker
        self.view.addInteraction(picker) // Added to establish context
    }

Presentation Code: When a user selects "Change Input" from my custom settings menu, I call .present() on the main thread.

    // In a delegate method from a custom menu
    if #available(iOS 26.0, *) {
        DispatchQueue.main.async {
            self.inputPickerInteraction?.present(animated: true)
        }
    }

What I've already checked:
- The AVAudioSession is active and its category is .playAndRecord.
- The inputPickerInteraction object is not nil.
- The .present() method is being called on the main thread.
- The picker is added to a view using view.addInteraction() in the setup phase.
- I've reviewed my code to ensure there is no other logic that could be manually resetting the AVAudioSession's preferred input.

Has anyone else experienced this behavior? I suspect this might be a bug in the new API, but I want to make sure I'm not missing a crucial step in managing the AVAudioSession state. Any insights or potential workarounds would be greatly appreciated. Thank you.

Media Technologies Audio

2

0

223

Sep ’25

Windows Apple Music: how to enumerate the local library or export it? Is Library.musicdb documented / API available?

Environment
- Windows 11 [edition/build]: [e.g., 23H2, 22631.x]
- Apple Music for Windows version: [e.g., 1.x.x from Microsoft Store]
- Library folder: C:\Users\<user>\Music\Apple Music\Apple Music Library.musiclibrary

Summary
I need a supported way to programmatically enumerate the local Apple Music library on Windows (track file paths, playlists, etc.) for reconciliation with the on-disk Media folder. On macOS this used to be straightforward via scripting/export; on Windows I can’t find an equivalent.

What I’m seeing in the library bundle
- Library.musicdb → not SQLite. First 4 bytes: 68 66 6D 61 ("hfma").
- Library Preferences.musicdb → also starts with "hfma".
- artwork.sqlite → SQLite but appears to be artwork cache only (no track file paths).
- Extras.itdb → has SQLite format 3 header but (from a quick scan) not seeing track locations.
- Genius.itdb → not a SQLite database on this machine.

What I’ve tried
- Attempted to open Library.musicdb with SQLite providers → error: “file is not a database.”
- Binary/string scans (ASCII, UTF-16LE/BE, null-stripped) of Library.musicdb → did not reveal file paths or obvious plist/XML/JSON blobs.
- The Windows Apple Music UI doesn’t appear to expose “Export Library / Export Playlist” like legacy iTunes did, and I can’t find a public API for local library enumeration on Windows.

What I’m trying to accomplish
Read local track entries (absolute or relative paths), detect broken links, and reconcile against the Media folder. A read-only solution is fine; I do not need to modify the library.

Questions for Apple
1. Is the Library.musicdb file format documented anywhere, or is there a supported SDK/API to enumerate the local library on Windows?
2. Is there a supported export mechanism (CLI, UI, or API) on Windows Apple Music to dump the local library and/or playlists (XML/CSV/JSON)?
3. Is there a Windows-specific equivalent to the old iTunes COM automation or any MusicKit surface that can return local library items (not streaming catalog) and their file locations?
4. If none of the above exist today, is there a recommended workaround from Apple for library reconciliation on Windows (e.g., documented support for importing M3U/M3U8 to rebuild the local library from disk)?
5. Are there any plans/timeline for adding Windows feature parity with iTunes/Music on macOS for exporting or scripting the local library?

Why this matters
For large personal libraries, users occasionally end up with orphaned files on disk or broken links in the app. Without an export or API, it’s difficult to audit and fix at scale on Windows.

Reference details (in case it helps triage)
- Library.musicdb header bytes: 68-66-6D-61-A0-00-00-00-10-26-34-00-15-00-01-00 (ASCII shows hfma…).
- artwork.sqlite is readable but doesn’t contain track file paths (appears limited to artwork).
- I can supply a minimal repro tool and logs if that’s helpful.

Feature request (if no current API)
- Add an official Export Library/Playlists action on Windows Apple Music, or
- Provide a read-only Windows API (or schema doc) that surfaces track file locations and playlist membership from the local library.

Thanks in advance for any guidance or pointers to docs I might have missed.
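The header checks described in the post (the "hfma" bytes versus a SQLite header) can be reproduced with a few lines of C. This is a minimal sketch, not a parser: the names `lib_format`, `classify_header`, and `classify_file` are hypothetical, and it says nothing about the rest of the undocumented Library.musicdb format. The only outside fact it relies on is that a SQLite 3 database begins with the 16 bytes "SQLite format 3" followed by a NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical labels for the two header kinds mentioned in the post. */
typedef enum { LIB_UNKNOWN = 0, LIB_HFMA, LIB_SQLITE3 } lib_format;

/* Classify a buffer holding the first bytes of a file. */
lib_format classify_header(const unsigned char *buf, size_t len)
{
    /* 15 characters plus the terminating NUL: the 16-byte SQLite 3 magic. */
    static const char sqlite_magic[16] = "SQLite format 3";
    assert(buf != NULL || len == 0);
    if (len >= 16 && memcmp(buf, sqlite_magic, 16) == 0) return LIB_SQLITE3;
    if (len >= 4 && memcmp(buf, "hfma", 4) == 0) return LIB_HFMA;
    return LIB_UNKNOWN;
}

/* Read up to 16 bytes from `path` and classify them.
 * Returns LIB_UNKNOWN if the file cannot be opened. */
lib_format classify_file(const char *path)
{
    unsigned char header[16];
    FILE *f = fopen(path, "rb");
    if (f == NULL) return LIB_UNKNOWN;
    size_t n = fread(header, 1, sizeof header, f);
    fclose(f);
    return classify_header(header, n);
}
```

Calling `classify_file` on each file in the library bundle would give a quick triage of which files a SQLite tool can be expected to open at all.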

Media Technologies Audio Design Media Library Media

0

200

Sep ’25

Destroy MIDIUMPMutableEndpoint again?

Is there a way to destroy MIDIUMPMutableEndpoint again? In my app, the user has a setting to enable and disable MIDI 2.0. If MIDI 2.0 should not be supported (or if iOS version < 18), it creates a virtual destination and a virtual source. And if MIDI 2.0 should be enabled, it instead creates a MIDIUMPMutableEndpoint, which itself creates the virtual destination and source automatically. So here is my problem: I didn't find any way to destroy the MIDIUMPMutableEndpoint again. There is a method to disable it (setEnabled:NO), but that doesn't destroy or hide the virtual destination and source. So when the user turns MIDI 2.0 support off, I will have two virtual destinations and sources, and cannot get rid of the 2.0 ones. What is the correct way to get rid of the MIDIUMPMutableEndpoint once it is created?

Media Technologies Audio Core MIDI

0

95

Sep ’25

Improving Speech Analyzer Transcription for technical terms

I am developing an app with transcription and I am exploring ways to improve the transcription from the SpeechAnalyzer/Transcriber for technical terms. SFSpeech... recognition had the capability of being augmented by contextualStrings. Does something similar exist for SpeechAnalyzer/Transcriber? If so please point me towards the documentation and any sample code that may exist for this. If there are other options, please let me know.

Media Technologies Audio Speech

1

255

Sep ’25

Convert CoreAudio AudioObjectID to IOUSB LocationID

Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to look up the underlying USB LocationID? I previously used AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName with property kIOAudioEngineGlobalUniqueIDKey matching DeviceUID, and I loaded kUSBDevicePropertyLocationID from the result. This fails on macOS 26, because the IO Registry for the device has an entry for usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not include a kIOAudioEngineGlobalUniqueIDKey property (or any other property to map it to a CoreAudio DeviceUID). My use case here is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to look up the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).

Media Technologies Audio Core Audio

2

0

537

Sep ’25

AudioQueue Output fails playing audio almost immediately?

On macOS Sequoia, I'm having the hardest time getting this basic audio output to work correctly. I'm compiling in Xcode using C99, and when I run this, I get audio for a split second, and then nothing, indefinitely. Any ideas what could be going wrong? Here's a minimum code example to demonstrate:

    #include <AudioToolbox/AudioToolbox.h>
    #include <stdint.h>

    #define RENDER_BUFFER_COUNT 2
    #define RENDER_FRAMES_PER_BUFFER 128
    // mono linear PCM audio data at 48kHz
    #define RENDER_SAMPLE_RATE 48000
    #define RENDER_CHANNEL_COUNT 1
    #define RENDER_BUFFER_BYTE_COUNT (RENDER_FRAMES_PER_BUFFER * RENDER_CHANNEL_COUNT * sizeof(f32))

    void RenderAudioSaw(float* outBuffer, uint32_t frameCount, uint32_t channelCount)
    {
        static bool isInverted = false;
        float scalar = isInverted ? -1.f : 1.f;
        for (uint32_t frame = 0; frame < frameCount; ++frame)
        {
            for (uint32_t channel = 0; channel < channelCount; ++channel)
            {
                // series of ramps, alternating up and down.
                outBuffer[frame * channelCount + channel] = 0.1f * scalar * ((float)frame / frameCount);
            }
        }
        isInverted = !isInverted;
    }

    AudioStreamBasicDescription coreAudioDesc = { 0 };
    AudioQueueRef coreAudioQueue = NULL;
    AudioQueueBufferRef coreAudioBuffers[RENDER_BUFFER_COUNT] = { NULL };

    void coreAudioCallback(void* unused, AudioQueueRef queue, AudioQueueBufferRef buffer)
    {
        // 0's here indicate no fancy packet magic
        AudioQueueEnqueueBuffer(queue, buffer, 0, 0);
    }

    int main(void)
    {
        const UInt32 BytesPerSample = sizeof(float);
        coreAudioDesc.mSampleRate = RENDER_SAMPLE_RATE;
        coreAudioDesc.mFormatID = kAudioFormatLinearPCM;
        coreAudioDesc.mFormatFlags = kLinearPCMFormatFlagIsFloat | kLinearPCMFormatFlagIsPacked;
        coreAudioDesc.mBytesPerPacket = RENDER_CHANNEL_COUNT * BytesPerSample;
        coreAudioDesc.mFramesPerPacket = 1;
        coreAudioDesc.mBytesPerFrame = RENDER_CHANNEL_COUNT * BytesPerSample;
        coreAudioDesc.mChannelsPerFrame = RENDER_CHANNEL_COUNT;
        coreAudioDesc.mBitsPerChannel = BytesPerSample * 8;

        coreAudioQueue = NULL;
        OSStatus result;

        // most of the 0 and NULL params here are for compressed sound formats etc.
        result = AudioQueueNewOutput(&coreAudioDesc, &coreAudioCallback, NULL, 0, 0, 0, &coreAudioQueue);
        if (result != noErr)
        {
            assert(false == "AudioQueueNewOutput failed!");
            abort();
        }

        for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
        {
            uint32_t bufferSize = coreAudioDesc.mBytesPerFrame * RENDER_FRAMES_PER_BUFFER;
            result = AudioQueueAllocateBuffer(coreAudioQueue, bufferSize, &(coreAudioBuffers[i]));
            if (result != noErr)
            {
                assert(false == "AudioQueueAllocateBuffer failed!");
                abort();
            }
        }

        for (int i = 0; i < RENDER_BUFFER_COUNT; ++i)
        {
            RenderAudioSaw(coreAudioBuffers[i]->mAudioData, RENDER_FRAMES_PER_BUFFER, RENDER_CHANNEL_COUNT);
            coreAudioBuffers[i]->mAudioDataByteSize = coreAudioBuffers[i]->mAudioDataBytesCapacity;
            AudioQueueEnqueueBuffer(coreAudioQueue, coreAudioBuffers[i], 0, 0);
        }

        AudioQueueStart(coreAudioQueue, NULL);
        sleep(10); // some time to hear the audio
        AudioQueueStop(coreAudioQueue, true);
        AudioQueueDispose(coreAudioQueue, true);
        return 0;
    }
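To put the constants in the example above into perspective, the buffer size and the playback time each buffer covers can be worked out with a small helper. This is only arithmetic on the values from the post (128 frames, 1 channel, 48 kHz, 4-byte float samples); the function names are hypothetical, and the sketch does not claim to identify the cause of the dropout.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one buffer of packed PCM audio. */
uint32_t pcm_buffer_bytes(uint32_t frames, uint32_t channels, uint32_t bytes_per_sample)
{
    return frames * channels * bytes_per_sample;
}

/* Playback time covered by one buffer, in milliseconds. */
double pcm_buffer_duration_ms(uint32_t frames, double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return 1000.0 * (double)frames / sample_rate_hz;
}
```

With the post's values, `pcm_buffer_bytes(128, 1, 4)` is 512 bytes and `pcm_buffer_duration_ms(128, 48000.0)` is roughly 2.67 ms, so the two queued buffers together hold about 5.3 ms of audio.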

Media Technologies Audio AudioToolbox Core Audio AVFoundation

2

0

444

Sep ’25

AVAudioSession.outputVolume not reporting correctly in iOS 18+ devices

I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe the outputVolume, and it correctly reports the device’s volume. However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, the outputVolume returns the previous volume (before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using physical buttons or Control Center, which triggers the observer. What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations. Disabling and re-enabling the audio session is essential to how my application functions. I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, iOS 18.1, iOS 18.3, iOS 18.5 and iOS 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.

Media Technologies Audio Audio AVAudioSession

2

0

197

Sep ’25

dlsym cannot find symbol g_dwILResult when debugging an audio plugin

I am trying to debug the AAX version of my plugin (MIDI effect) on Pro Tools, but I am getting the following error (Mac console) when attempting to load it: dlsym cannot find symbol g_dwILResult in CFBundle etc.. I used Xcode 16.4 to build the plugin. Has anybody come across the same or a similar message? Best, Achillefs Axart Labs

Media Technologies Audio Debugging Audio Linker Core Audio

2

0

511

Sep ’25

Play Audio and Recognize Speech in Car

Hello, I'm trying to determine the best/recommended AVAudioSession configuration (i.e. category, mode, and options) for the following use case. Essentially, I'd like to switch between periods of playing an audio file and then recognizing speech. The audio file is typically speech, and I don't intend for playback and speech recognition to occur simultaneously. I'd like the user to still be able to interact with Siri, and I'd like it to work with CarPlay, where navigation prompts can occur. I would assume the category to use is 'playAndRecord', but I'm not sure if it's better to just set that once for the entire lifecycle, or to set 'playback' for audio file playback and then switch to 'playAndRecord' for speech recognition. I'm also not sure about the best 'mode' and 'options' to set. Any suggestions would be appreciated. Thanks.

Media Technologies Audio CarPlay Speech AVAudioSession

0

519

Sep ’25

How to get PID from AudioObjectID on macOS pre Sonoma

I am working on an application that detects when an input audio device is being used. Basically, I want to know which application is using the microphone (built-in or external). This app runs on macOS. For macOS versions starting from Sonoma I can use this code:

    int getAudioProcessPID(AudioObjectID process)
    {
        pid_t pid = -1; // stays -1 if the PID cannot be determined
        if (@available(macOS 14.0, *)) {
            constexpr AudioObjectPropertyAddress prop {
                kAudioProcessPropertyPID,
                kAudioObjectPropertyScopeGlobal,
                kAudioObjectPropertyElementMain
            };
            UInt32 dataSize = sizeof(pid);
            OSStatus error = AudioObjectGetPropertyData(process, &prop, 0, nullptr, &dataSize, &pid);
            if (error != noErr) {
                return -1;
            }
        } else {
            // Pre-Sonoma code goes here
        }
        return pid;
    }

which works. However, kAudioProcessPropertyPID was added in macOS SDK 14.0. Does anyone know how to achieve the same functionality on previous versions?

Media Technologies Audio macOS Objective-C Core Audio

1

0

289

Sep ’25
