Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.

Audio Documentation

Posts under Audio subtopic

Post

Replies

Boosts

Views

Activity

Is Call Translation API available for VOIP?
I might have misunderstood the docs, but is Call Translation going to be available for VOIP applications? Eg in an already connected VOIP call, would it be possible for Call Translations to be enabled on an iOS 26 and Apple Intelligence supported device? I have personally tried it and it doesn’t look like it supported VOIP but would love to confirm this. reference: https://developer.apple.com/documentation/callkit/cxsettranslatingcallaction/
1
0
84
Jun ’25
Convert CoreAudio AudioObjectID to IOUSB LocationID
Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to lookup the underlying USB LocationID? I previously used AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName with property kIOAudioEngineGlobalUniqueIDKey matching DeviceUID, and I loaded kUSBDevicePropertyLocationID from the result. This fails on macOS 26, because the IO Registry for the device has an entry for usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not include a kIOAudioEngineGlobalUniqueIDKey property (or any other property to map it to a CoreAudio DeviceUID). My use-case here is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to lookup the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).
2
0
695
Sep ’25
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
6
0
791
Nov ’25
ShazamKit supported for iOS apps that can run on Mac silicon?
I am having issues deploying my iOS app, that uses ShazamKit, to get working on a Mac with Apple silicon. When uploading the archive to App Store Connect I do get ITMS-90863: Macs with Apple silicon support issue - The app links with libraries that aren’t present in macOS: /usr/lib/swift/libswiftShazamKit.dylib Is ShazamKit not supported for iOS apps that can run on Macs with Apple silicon? Or is there something I should fix in my setup / deployment?
26
0
1.2k
Jun ’25
AVB Support for the AVnu MILAN Conventions
The AVB AVnu MILAN Convention has a groweing Population. Many big companies (Cisco, Meyer Sound, d&b Audio, l‘acoustics, Presonus, digico etc.) implements the AVB AVnu Milan Standards. Is there a plan on the Apple side to also implement AVnu Milan on top of the AVB Protocol? The advantage for Apple Sound would be a great Integration in the professionell Audio market and a more stable intergration on top of the AVB protocol. The atdecc work, but Not that stable.
1
0
185
Oct ’25
Is AVAudioPCMFormatFloat32 required for playing a buffer with AVAudioEngine / AVAudioPlayerNode
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVPlayerNode / AVAudioEngine an exception is thrown: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868 (related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022) If I convert the buffer to AVAudioPCMFormatFloat32 playback works. My questions are: Does AVAudioEngine / AVPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application? If 1 is YES is this documented anywhere? If 1 is YES is this required format subject to change at any point? Thanks! I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
1
0
1.1k
Oct ’25
How to disable/hide Audio Controls on lock screen from WkWebView
Hi, I am trying to remove the audio controls for my app on the lock screen. Since I use WKWebView, there are 3 audio tags in my html and I play and pause em via JS. However, if I do not play any sound since app launch, there are no audio controls on the lock screen. But if I play one of those 3 files (they are even less then 3 Sec sound effects e.g. for buttons) the audio controls appears on lock screen. Note even when the sounds on pause() or not playing they were listed on the lock screen. What I have tried so far without success MPNowPlayingInfoCenter.default().nowPlayingInfo = [:] and ``try audioSession.setCategory(.playback, mode: .default, options: []) try audioSession.setActive(false, options: .notifyOthersOnDeactivation)`` and UIApplication.shared.endReceivingRemoteControlEvents() Another problem is that the app scales with iOS system settings "display zoom". Is there a way to deny it? It is latest Xcode verion 16.3 and iOS 18. I have no background mode in my Capabilities. Nothing worked so far. Has anyone an idea? Greetings
2
0
152
May ’25
SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
3
0
810
Mar ’26
iOS 26 HLS Audio Track Display Behavior: EXT-X-MEDIA NAME vs LANGUAGE Attributes
Hello Apple Developer Community, I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag. In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so: #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8" #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8" Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English"). We would like to understand the official or intended behavior regarding this. Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label? If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI? Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute? Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated. Thank you for your time and assistance.
2
0
446
Aug ’25
ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS
Bug Report: ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS Summary When using ScreenCaptureKit to capture system audio for extended periods, the application crashes with EXC_BAD_ACCESS in Swift's error handling runtime. The crash occurs in swift_getErrorValue when trying to process an error from the SCStream delegate method didStopWithError. This appears to be a framework-level issue in ScreenCaptureKit or its underlying ReplayKit implementation. Environment macOS Sonoma 14.6.1 Swift 5.8 ScreenCaptureKit framework Detailed Description Our application captures system audio using ScreenCaptureKit's audio capture capabilities. After successfully capturing for several minutes (typically after 3-4 segments of 60-second recordings), the application crashes with an EXC_BAD_ACCESS error. The crash happens when the Swift runtime attempts to process an error in the SCStreamDelegate.stream(_:didStopWithError:) method. The crash consistently occurs in swift_getErrorValue when attempting to access the class of what appears to be a null object. This suggests that the error being passed from the system framework to our delegate method is malformed or contains invalid memory. Steps to Reproduce Create an SCStream with audio capture enabled Add audio output to the stream Start capture and write audio data to disk Allow the capture to run for several minutes (3-5 minutes typically triggers the issue) The app will crash with EXC_BAD_ACCESS in swift_getErrorValue Code Sample func stream(_ stream: SCStream, didStopWithError error: Error) { print("Stream stopped with error: \(error)") // Crash occurs before this line executes } func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) { guard type == .audio, sampleBuffer.isValid else { return } // Process audio data... } Expected Behavior The error should be properly propagated to the delegate method, allowing for graceful error handling and recovery. Actual Behavior The application crashes with EXC_BAD_ACCESS when the Swift runtime attempts to process the error in swift_getErrorValue. Crash Log Details Thread #35, queue = 'com.apple.NSXPCConnection.m-user.com.apple.replayd', stop reason = EXC_BAD_ACCESS (code=1, address=0x0) frame #0: 0x0000000194c3088c libswiftCore.dylib`swift::_swift_getClass(void const*) + 8 frame #1: 0x0000000194c30104 libswiftCore.dylib`swift_getErrorValue + 40 frame #2: 0x00000001057fba30 shadow`NewScreenCaptureService.stream(stream=0x0000600002de6700, error=Swift.Error @ 0x000000016b7b5e30) at NEW+ScreenCaptureService.swift:365:15 frame #3: 0x00000001057fc050 shadow`@objc NewScreenCaptureService.stream(_:didStopWithError:) at <compiler-generated>:0 frame #4: 0x0000000219ec5ca0 ScreenCaptureKit`-[SCStreamManager stream:didStopWithError:] + 456 frame #5: 0x00000001ca68a5cc ReplayKit`-[RPScreenRecorder stream:didStopWithError:] + 84 frame #6: 0x00000001ca696ff8 ReplayKit`-[RPDaemonProxy stream:didStopWithError:] + 224 Printing description of stream._streamQueue: error: ObjectiveC.id:4294967281:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ error: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:1:65: 'id' is unavailable in Swift: 'id' is not available in Swift; use 'Any' Swift._DebuggerSupport.stringForPrintObject(Swift.UnsafePointer<id>(bitPattern: 0x104ae08c0)!.pointee) ^~ ObjectiveC.id:2:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ warning: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:5:7: initialization of variable '$__lldb_error_result' was never used; consider replacing with assignment to '_' or removing it var $__lldb_error_result = __lldb_tmp_error ~~~~^~~~~~~~~~~~~~~~~~~~ _ Before the crash, we observed this error message in the console: [ERROR] *****SCStream*****RemoteAudioQueueOperationHandlerWithError:1015 Error received from the remote queue -16665 Additional Context The issue occurs consistently after approximately 3-4 successful audio segment recordings of 60 seconds each Commenting out custom segment rotation logic does not prevent the crash The crash involves XPC communication with Apple's ReplayKit daemon The error appears to be corrupted or malformed when crossing the XPC boundary Workarounds Attempted Added proper thread safety for all published properties using DispatchQueue.main.async Implemented more robust error handling in the delegate methods None of these approaches prevented the crash since it occurs at the Swift runtime level before our code executes. Impact This issue prevents reliable long-duration audio capture using ScreenCaptureKit. This bug significantly limits the usefulness of ScreenCaptureKit for any application requiring continuous system audio capture for more than a few minutes. Perhaps this issue might be related to a macOS bug where the system dialog indicates that the screen is being shared, even though nothing is actually being shared. Moreover, when attempting to stop sharing, nothing happens.
3
0
970
Mar ’26
Airplay selection not working
I'm trying to implement airplay into my app. I can successfully playback sound and trigger the airplay selector sheet. If the target device is a Bluetooth only device I can connect with no problem and stream the audio to the Bluetooth device, but if the audio device is a airplay specific device like a HomePod or an Apple TV when I select it, I get a spinning icon, indicating that it is trying to connect, and eventually it times out and stops without connecting. I don't believe it is an AirPlay audio issue because if I go to a different app, for example a podcast app and select my HomePods for output, and then switch back to my app. My audio will correctly stream to the HomePod. Not only that, I have it so that my icon will change color to indicate that it is connected via airplay and it is correctly indicating that it is connected via AirPlay. But I cannot then disconnect it using the Airplay selector. The issue appears to be in the AirPlay selection side, which I have spent several days attempting to troubleshoot mostly using ChatGPT to suggest code different than what I have to maybe work around the issue. Mostly it is focused on the audio player section, but it doesn't seem like that is really the route that is the problem.
2
0
272
Jun ’25
SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
3
0
231
Aug ’25
AVAudioUnit host - PCM buffer output silent
Hi, I just started to develop audio unit hosting support in my application. Offline rendering seems to work except that I hear no output, but why? I suspect with the player goes something wrong. I connect to CoreAudio in a different location in the code. Here are some error messages I faced so far: 2025-08-14 19:42:04.132930+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:42:04.151171+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:43:08.344530+0200 com.gsequencer.GSequencer[34358:18614927] AUAudioUnit.mm:1417 Cannot set maximumFramesToRender while render resources allocated. 2025-08-14 19:43:08.346583+0200 com.gsequencer.GSequencer[34358:18614927] [avae] AVAEInternal.h:104 [AVAudioSequencer.mm:121:-[AVAudioSequencer(AVAudioSequencer_Player) startAndReturnError:]: (impl->Start()): error -10852 ** (<unknown>:34358): WARNING **: 19:43:08.346: error during audio sequencer start - -10852 I have implemented an AVAudioEngine based AudioUnit host. Here I instantiate player and effect: /* audio engine */ audio_engine = [[AVAudioEngine alloc] init]; fx_audio_unit_audio->audio_engine = (gpointer) audio_engine; av_format = (AVAudioFormat *) fx_audio_unit_audio->av_format; /* av audio player node */ av_audio_player_node = [[AVAudioPlayerNode alloc] init]; /* av audio unit */ av_audio_unit_effect = [[AVAudioUnitEffect alloc] initWithAudioComponentDescription:[((AVAudioUnitComponent *) AGS_AUDIO_UNIT_PLUGIN(base_plugin)->component) audioComponentDescription]]; av_audio_unit = (AVAudioUnit *) av_audio_unit_effect; fx_audio_unit_audio->av_audio_unit = av_audio_unit; /* audio sequencer */ av_audio_sequencer = [[AVAudioSequencer alloc] initWithAudioEngine:audio_engine]; fx_audio_unit_audio->av_audio_sequencer = (gpointer) av_audio_sequencer; /* output node */ [[AVAudioOutputNode alloc] init]; /* audio player and audio unit */ [audio_engine attachNode:av_audio_player_node]; [audio_engine attachNode:av_audio_unit]; [audio_engine connect:av_audio_player_node to:av_audio_unit format:av_format]; [audio_engine connect:av_audio_unit to:[audio_engine outputNode] format:av_format]; ns_error = NULL; [audio_engine enableManualRenderingMode:AVAudioEngineManualRenderingModeOffline format:av_format maximumFrameCount:buffer_size error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("enable manual rendering mode error - %d", [ns_error code]); } ns_error = NULL; [[av_audio_unit AUAudioUnit] allocateRenderResourcesAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("Audio Unit allocate render resources returned error - ErrorCode %d", [ns_error code]); } Then I render in a dedicated thread. ns_error = NULL; [audio_engine startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio engine start - %d", [ns_error code]); } [av_audio_sequencer prepareToPlay]; ns_error = NULL; [av_audio_sequencer startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio sequencer start - %d", [ns_error code]); } [av_audio_player_node play]; while(is_running){ /* pre sync */ /* IO buffers */ av_output_buffer = (AVAudioPCMBuffer *) scope_data->av_output_buffer; av_input_buffer = (AVAudioPCMBuffer *) scope_data->av_input_buffer; /* fill input buffer */ /* schedule av input buffer */ frame_position = 0; // (gint64) ((note_offset * absolute_delay) + delay_counter) * buffer_size; av_audio_player_node = (AVAudioPlayerNode *) fx_audio_unit_audio->av_audio_player_node; AVAudioTime *av_audio_time = [[AVAudioTime alloc] initWithHostTime:frame_position sampleTime:frame_position atRate:((double) samplerate)]; [av_audio_player_node scheduleBuffer:av_input_buffer atTime:av_audio_time options:0 completionHandler:nil]; /* render */ ns_error = NULL; status = [audio_engine renderOffline:AGS_FX_AUDIO_UNIT_AUDIO_FIXED_BUFFER_SIZE toBuffer:av_output_buffer error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("render offline error - %d", [ns_error code]); } } regards, Joël
3
0
541
Aug ’25
AVAudioSessionCategoryOptionAllowBluetooth incorrectly marked as deprecated in iOS 8 in iOS 26 beta 5
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in iOS 26 beta 5 when this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is in iOS 26. Am I right? It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"? Thank you.
2
0
357
Aug ’25
Apple Music iOS 26 features in Android
Since many users like me use Apple Music on Android, the app is almost as feature-rich as iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI change. I know it’s challenging to implement liquid glass on Android hardware or design, but features like auto-mix, pronunciation, and translation could be added. kindly consider this request !!!!
1
0
289
Jul ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
4
0
534
Aug ’25
Detecting if a phone call is being recorded by another app on iOS
Hello, I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background. I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way. My questions are: Does iOS expose any API to detect if a call is being recorded? If not, is there any indirect, Apple's policy compliant method (e.g., microphone usage events) that can be relied upon? Or is this something that iOS explicitly prevents for privacyreasons? Expecting solutions that align with Apple’s policies and would be accepted under the App Store Review Guidelines. Thanks in advance for any guidance.
1
0
307
Aug ’25
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...
9
0
1.2k
Apr ’26
Is Call Translation API available for VOIP?
I might have misunderstood the docs, but is Call Translation going to be available for VOIP applications? Eg in an already connected VOIP call, would it be possible for Call Translations to be enabled on an iOS 26 and Apple Intelligence supported device? I have personally tried it and it doesn’t look like it supported VOIP but would love to confirm this. reference: https://developer.apple.com/documentation/callkit/cxsettranslatingcallaction/
Replies
1
Boosts
0
Views
84
Activity
Jun ’25
Convert CoreAudio AudioObjectID to IOUSB LocationID
Is there a recommended way on macOS 26 Tahoe to take a CoreAudio AudioObjectID and use it to lookup the underlying USB LocationID? I previously used AudioObjectID to query the corresponding DeviceUID with kAudioDevicePropertyDeviceUID. Then I queried for the IOService matching kIOAudioEngineClassName with property kIOAudioEngineGlobalUniqueIDKey matching DeviceUID, and I loaded kUSBDevicePropertyLocationID from the result. This fails on macOS 26, because the IO Registry for the device has an entry for usbaudiod rather than AppleUSBAudioEngine, and usbaudiod does not include a kIOAudioEngineGlobalUniqueIDKey property (or any other property to map it to a CoreAudio DeviceUID). My use-case here is a piece of audio recording software that allows configuring a set of supported audio devices via USB HID prior to recording. I present the user with a list of CoreAudio devices to use, but without a way to lookup the underlying USB LocationID, I cannot guarantee that the configured device matches the selected device (e.g. if the user plugged in two identical microphones).
Replies
2
Boosts
0
Views
695
Activity
Sep ’25
(iOS 18) SFSpeechRecognitionResult providing new text after a gap in speaking
Here is the demo from Apple's site This issues is specific to iOS 18. When running this demo, we are getting new text when we have a gap in speaking, the recognitionTask(with:resultHandler:) provides new text which is only spoken after the gap and not the concatenation of old text and the new spoken text.
Replies
6
Boosts
0
Views
1.3k
Activity
May ’25
[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So experimenting with the new SpeechTranscriber, if I do: let transcriber = SpeechTranscriber( locale: locale, transcriptionOptions: [], reportingOptions: [.volatileResults], attributeOptions: [.audioTimeRange] ) only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses. I'm not presenting the volatile text at all in the UI, I was just trying to keep statistics about the non-speech and the speech noise level, this way I can determine when the noise level falls under the noisefloor for a while. The goal here was to finalize the recording automatically, when the noise level indicate that the user has finished speaking.
Replies
6
Boosts
0
Views
791
Activity
Nov ’25
ShazamKit supported for iOS apps that can run on Mac silicon?
I am having issues deploying my iOS app, that uses ShazamKit, to get working on a Mac with Apple silicon. When uploading the archive to App Store Connect I do get ITMS-90863: Macs with Apple silicon support issue - The app links with libraries that aren’t present in macOS: /usr/lib/swift/libswiftShazamKit.dylib Is ShazamKit not supported for iOS apps that can run on Macs with Apple silicon? Or is there something I should fix in my setup / deployment?
Replies
26
Boosts
0
Views
1.2k
Activity
Jun ’25
AVB Support for the AVnu MILAN Conventions
The AVB AVnu MILAN Convention has a groweing Population. Many big companies (Cisco, Meyer Sound, d&b Audio, l‘acoustics, Presonus, digico etc.) implements the AVB AVnu Milan Standards. Is there a plan on the Apple side to also implement AVnu Milan on top of the AVB Protocol? The advantage for Apple Sound would be a great Integration in the professionell Audio market and a more stable intergration on top of the AVB protocol. The atdecc work, but Not that stable.
Replies
1
Boosts
0
Views
185
Activity
Oct ’25
Is AVAudioPCMFormatFloat32 required for playing a buffer with AVAudioEngine / AVAudioPlayerNode
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVPlayerNode / AVAudioEngine an exception is thrown: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868 (related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022) If I convert the buffer to AVAudioPCMFormatFloat32 playback works. My questions are: Does AVAudioEngine / AVPlayerNode require AVAudioPCMBuffer to be in the Float32 format? Is there a way I can configure it to accept another format instead for my application? If 1 is YES is this documented anywhere? If 1 is YES is this required format subject to change at any point? Thanks! I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014 but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
Replies
1
Boosts
0
Views
1.1k
Activity
Oct ’25
How to disable/hide Audio Controls on lock screen from WkWebView
Hi, I am trying to remove the audio controls for my app on the lock screen. Since I use WKWebView, there are 3 audio tags in my html and I play and pause em via JS. However, if I do not play any sound since app launch, there are no audio controls on the lock screen. But if I play one of those 3 files (they are even less then 3 Sec sound effects e.g. for buttons) the audio controls appears on lock screen. Note even when the sounds on pause() or not playing they were listed on the lock screen. What I have tried so far without success MPNowPlayingInfoCenter.default().nowPlayingInfo = [:] and ``try audioSession.setCategory(.playback, mode: .default, options: []) try audioSession.setActive(false, options: .notifyOthersOnDeactivation)`` and UIApplication.shared.endReceivingRemoteControlEvents() Another problem is that the app scales with iOS system settings "display zoom". Is there a way to deny it? It is latest Xcode verion 16.3 and iOS 18. I have no background mode in my Capabilities. Nothing worked so far. Has anyone an idea? Greetings
Replies
2
Boosts
0
Views
152
Activity
May ’25
SpeechTranscriber/SpeechAnalyzer being relatively slow compared to FoundationModel and TTS
So, I've been wondering how fast a an offline STT -> ML Prompt -> TTS roundtrip would be. Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS. E.g. InteractionStatistics: - listeningStarted: 21:24:23 4480 2423 - timeTillFirstAboveNoiseFloor: 01.794 - timeTillLastNoiseAboveFloor: 02.383 - timeTillFirstSpeechDetected: 02.399 - timeTillTranscriptFinalized: 04.510 - timeTillFirstMLModelResponse: 04.938 - timeTillMLModelResponse: 05.379 - timeTillTTSStarted: 04.962 - timeTillTTSFinished: 11.016 - speechLength: 06.054 - timeToResponse: 02.578 - transcript: This is a test. - mlModelResponse: Sure! I'm ready to help with your test. What do you need help with? Here, between my audio input ending and the Text-2-Speech starting top play (using AVSpeechUtterance) the total response time was 2.5s. Of that time, it took the SpeechAnalyzer 2.1s to get the transcript finalized, FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly). I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now? I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
Replies
3
Boosts
0
Views
810
Activity
Mar ’26
iOS 26 HLS Audio Track Display Behavior: EXT-X-MEDIA NAME vs LANGUAGE Attributes
Hello Apple Developer Community, I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag. In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so: #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8" #EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8" Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English"). We would like to understand the official or intended behavior regarding this. Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label? If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI? Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute? Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated. Thank you for your time and assistance.
Replies
2
Boosts
0
Views
446
Activity
Aug ’25
ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS
Bug Report: ScreenCaptureKit System Audio Capture Crashes with EXC_BAD_ACCESS Summary When using ScreenCaptureKit to capture system audio for extended periods, the application crashes with EXC_BAD_ACCESS in Swift's error handling runtime. The crash occurs in swift_getErrorValue when trying to process an error from the SCStream delegate method didStopWithError. This appears to be a framework-level issue in ScreenCaptureKit or its underlying ReplayKit implementation. Environment macOS Sonoma 14.6.1 Swift 5.8 ScreenCaptureKit framework Detailed Description Our application captures system audio using ScreenCaptureKit's audio capture capabilities. After successfully capturing for several minutes (typically after 3-4 segments of 60-second recordings), the application crashes with an EXC_BAD_ACCESS error. The crash happens when the Swift runtime attempts to process an error in the SCStreamDelegate.stream(_:didStopWithError:) method. The crash consistently occurs in swift_getErrorValue when attempting to access the class of what appears to be a null object. This suggests that the error being passed from the system framework to our delegate method is malformed or contains invalid memory. Steps to Reproduce Create an SCStream with audio capture enabled Add audio output to the stream Start capture and write audio data to disk Allow the capture to run for several minutes (3-5 minutes typically triggers the issue) The app will crash with EXC_BAD_ACCESS in swift_getErrorValue Code Sample func stream(_ stream: SCStream, didStopWithError error: Error) { print("Stream stopped with error: \(error)") // Crash occurs before this line executes } func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) { guard type == .audio, sampleBuffer.isValid else { return } // Process audio data... } Expected Behavior The error should be properly propagated to the delegate method, allowing for graceful error handling and recovery. Actual Behavior The application crashes with EXC_BAD_ACCESS when the Swift runtime attempts to process the error in swift_getErrorValue. Crash Log Details Thread #35, queue = 'com.apple.NSXPCConnection.m-user.com.apple.replayd', stop reason = EXC_BAD_ACCESS (code=1, address=0x0) frame #0: 0x0000000194c3088c libswiftCore.dylib`swift::_swift_getClass(void const*) + 8 frame #1: 0x0000000194c30104 libswiftCore.dylib`swift_getErrorValue + 40 frame #2: 0x00000001057fba30 shadow`NewScreenCaptureService.stream(stream=0x0000600002de6700, error=Swift.Error @ 0x000000016b7b5e30) at NEW+ScreenCaptureService.swift:365:15 frame #3: 0x00000001057fc050 shadow`@objc NewScreenCaptureService.stream(_:didStopWithError:) at <compiler-generated>:0 frame #4: 0x0000000219ec5ca0 ScreenCaptureKit`-[SCStreamManager stream:didStopWithError:] + 456 frame #5: 0x00000001ca68a5cc ReplayKit`-[RPScreenRecorder stream:didStopWithError:] + 84 frame #6: 0x00000001ca696ff8 ReplayKit`-[RPDaemonProxy stream:didStopWithError:] + 224 Printing description of stream._streamQueue: error: ObjectiveC.id:4294967281:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ error: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:1:65: 'id' is unavailable in Swift: 'id' is not available in Swift; use 'Any' Swift._DebuggerSupport.stringForPrintObject(Swift.UnsafePointer<id>(bitPattern: 0x104ae08c0)!.pointee) ^~ ObjectiveC.id:2:18: note: 'id' has been explicitly marked unavailable here public typealias id = AnyObject ^ warning: /var/folders/v4/3xg1hmp93gjd8_xlzmryf_wm0000gn/T/expr23-dfa421..cpp:5:7: initialization of variable '$__lldb_error_result' was never used; consider replacing with assignment to '_' or removing it var $__lldb_error_result = __lldb_tmp_error ~~~~^~~~~~~~~~~~~~~~~~~~ _ Before the crash, we observed this error message in the console: [ERROR] *****SCStream*****RemoteAudioQueueOperationHandlerWithError:1015 Error received from the remote queue -16665 Additional Context The issue occurs consistently after approximately 3-4 successful audio segment recordings of 60 seconds each Commenting out custom segment rotation logic does not prevent the crash The crash involves XPC communication with Apple's ReplayKit daemon The error appears to be corrupted or malformed when crossing the XPC boundary Workarounds Attempted Added proper thread safety for all published properties using DispatchQueue.main.async Implemented more robust error handling in the delegate methods None of these approaches prevented the crash since it occurs at the Swift runtime level before our code executes. Impact This issue prevents reliable long-duration audio capture using ScreenCaptureKit. This bug significantly limits the usefulness of ScreenCaptureKit for any application requiring continuous system audio capture for more than a few minutes. Perhaps this issue might be related to a macOS bug where the system dialog indicates that the screen is being shared, even though nothing is actually being shared. Moreover, when attempting to stop sharing, nothing happens.
Replies
3
Boosts
0
Views
970
Activity
Mar ’26
Airplay selection not working
I'm trying to implement airplay into my app. I can successfully playback sound and trigger the airplay selector sheet. If the target device is a Bluetooth only device I can connect with no problem and stream the audio to the Bluetooth device, but if the audio device is a airplay specific device like a HomePod or an Apple TV when I select it, I get a spinning icon, indicating that it is trying to connect, and eventually it times out and stops without connecting. I don't believe it is an AirPlay audio issue because if I go to a different app, for example a podcast app and select my HomePods for output, and then switch back to my app. My audio will correctly stream to the HomePod. Not only that, I have it so that my icon will change color to indicate that it is connected via airplay and it is correctly indicating that it is connected via AirPlay. But I cannot then disconnect it using the Airplay selector. The issue appears to be in the AirPlay selection side, which I have spent several days attempting to troubleshoot mostly using ChatGPT to suggest code different than what I have to maybe work around the issue. Mostly it is focused on the audio player section, but it doesn't seem like that is really the route that is the problem.
Replies
2
Boosts
0
Views
272
Activity
Jun ’25
SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
Replies
3
Boosts
0
Views
231
Activity
Aug ’25
AVAudioUnit host - PCM buffer output silent
Hi, I just started to develop audio unit hosting support in my application. Offline rendering seems to work except that I hear no output, but why? I suspect with the player goes something wrong. I connect to CoreAudio in a different location in the code. Here are some error messages I faced so far: 2025-08-14 19:42:04.132930+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:42:04.151171+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:43:08.344530+0200 com.gsequencer.GSequencer[34358:18614927] AUAudioUnit.mm:1417 Cannot set maximumFramesToRender while render resources allocated. 2025-08-14 19:43:08.346583+0200 com.gsequencer.GSequencer[34358:18614927] [avae] AVAEInternal.h:104 [AVAudioSequencer.mm:121:-[AVAudioSequencer(AVAudioSequencer_Player) startAndReturnError:]: (impl->Start()): error -10852 ** (<unknown>:34358): WARNING **: 19:43:08.346: error during audio sequencer start - -10852 I have implemented an AVAudioEngine based AudioUnit host. Here I instantiate player and effect: /* audio engine */ audio_engine = [[AVAudioEngine alloc] init]; fx_audio_unit_audio->audio_engine = (gpointer) audio_engine; av_format = (AVAudioFormat *) fx_audio_unit_audio->av_format; /* av audio player node */ av_audio_player_node = [[AVAudioPlayerNode alloc] init]; /* av audio unit */ av_audio_unit_effect = [[AVAudioUnitEffect alloc] initWithAudioComponentDescription:[((AVAudioUnitComponent *) AGS_AUDIO_UNIT_PLUGIN(base_plugin)->component) audioComponentDescription]]; av_audio_unit = (AVAudioUnit *) av_audio_unit_effect; fx_audio_unit_audio->av_audio_unit = av_audio_unit; /* audio sequencer */ av_audio_sequencer = [[AVAudioSequencer alloc] initWithAudioEngine:audio_engine]; fx_audio_unit_audio->av_audio_sequencer = (gpointer) av_audio_sequencer; /* output node */ [[AVAudioOutputNode alloc] init]; /* audio player and audio unit */ [audio_engine attachNode:av_audio_player_node]; [audio_engine attachNode:av_audio_unit]; [audio_engine connect:av_audio_player_node to:av_audio_unit format:av_format]; [audio_engine connect:av_audio_unit to:[audio_engine outputNode] format:av_format]; ns_error = NULL; [audio_engine enableManualRenderingMode:AVAudioEngineManualRenderingModeOffline format:av_format maximumFrameCount:buffer_size error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("enable manual rendering mode error - %d", [ns_error code]); } ns_error = NULL; [[av_audio_unit AUAudioUnit] allocateRenderResourcesAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("Audio Unit allocate render resources returned error - ErrorCode %d", [ns_error code]); } Then I render in a dedicated thread. ns_error = NULL; [audio_engine startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio engine start - %d", [ns_error code]); } [av_audio_sequencer prepareToPlay]; ns_error = NULL; [av_audio_sequencer startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio sequencer start - %d", [ns_error code]); } [av_audio_player_node play]; while(is_running){ /* pre sync */ /* IO buffers */ av_output_buffer = (AVAudioPCMBuffer *) scope_data->av_output_buffer; av_input_buffer = (AVAudioPCMBuffer *) scope_data->av_input_buffer; /* fill input buffer */ /* schedule av input buffer */ frame_position = 0; // (gint64) ((note_offset * absolute_delay) + delay_counter) * buffer_size; av_audio_player_node = (AVAudioPlayerNode *) fx_audio_unit_audio->av_audio_player_node; AVAudioTime *av_audio_time = [[AVAudioTime alloc] initWithHostTime:frame_position sampleTime:frame_position atRate:((double) samplerate)]; [av_audio_player_node scheduleBuffer:av_input_buffer atTime:av_audio_time options:0 completionHandler:nil]; /* render */ ns_error = NULL; status = [audio_engine renderOffline:AGS_FX_AUDIO_UNIT_AUDIO_FIXED_BUFFER_SIZE toBuffer:av_output_buffer error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("render offline error - %d", [ns_error code]); } } regards, Joël
Replies
3
Boosts
0
Views
541
Activity
Aug ’25
AVAudioSessionCategoryOptionAllowBluetooth incorrectly marked as deprecated in iOS 8 in iOS 26 beta 5
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in iOS 26 beta 5 when this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is in iOS 26. Am I right? It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"? Thank you.
Replies
2
Boosts
0
Views
357
Activity
Aug ’25
Apple Music iOS 26 features in Android
Since many users like me use Apple Music on Android, the app is almost as feature-rich as iOS. It would be fantastic if the developers could add the new iOS 26 features to the Android app, along with a minor UI change. I know it’s challenging to implement liquid glass on Android hardware or design, but features like auto-mix, pronunciation, and translation could be added. kindly consider this request !!!!
Replies
1
Boosts
0
Views
289
Activity
Jul ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
Replies
4
Boosts
0
Views
534
Activity
Aug ’25
SIGABORT with ExtAudioFileWrite and .m4a file
Hi, I am getting into a trap. Please check stack-trace, howto fix this? regards, Joël stack-trace with ExtAudioFileWrite
Replies
2
Boosts
0
Views
994
Activity
Jun ’25
Detecting if a phone call is being recorded by another app on iOS
Hello, I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background. I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way. My questions are: Does iOS expose any API to detect if a call is being recorded? If not, is there any indirect, Apple's policy compliant method (e.g., microphone usage events) that can be relied upon? Or is this something that iOS explicitly prevents for privacyreasons? Expecting solutions that align with Apple’s policies and would be accepted under the App Store Review Guidelines. Thanks in advance for any guidance.
Replies
1
Boosts
0
Views
307
Activity
Aug ’25
SpeechAnalyzer error "asset not found after attempted download" for certain languages
I am trying to use the new SpeechAnalyzer framework in my Mac app, and am running into an issue for some languages. When I call AssetInstallationRequest.downloadAndInstall() for some languages, it throws an error: Error Domain=SFSpeechErrorDomain Code=1 "transcription.ar asset not found after attempted download." The ".ar" appears to be the language code, which in this case was Arabic. When I call AssetInventory.status(forModules:) before attempting the download, it is giving me a status of "downloading" (perhaps from an earlier attempt?). If this language was completely unsupported, I would expect it to return a status of "unsupported", so I'm not sure what's going on here. For other languages (Polish, for example) SpeechTranscriber.supportedLocale(equivalentTo:) is returning nil, so that seems like a clearly unsupported language. But I can't tell if the languages I'm trying, like Arabic, are supported and something is going wrong, or if this error represents something I can work around. Here's the relevant section of code. The error is thrown from downloadAndInstall(), so I never even get as far as setting up the SpeechAnalyzer itself. private func setUpAnalyzer() async throws { guard let sourceLanguage else { throw Error.languageNotSpecified } guard let locale = await SpeechTranscriber.supportedLocale(equivalentTo: Locale(identifier: sourceLanguage.rawValue)) else { throw Error.unsupportedLanguage } let transcriber = SpeechTranscriber(locale: locale, preset: .progressiveTranscription) self.transcriber = transcriber let reservedLocales = await AssetInventory.reservedLocales if !reservedLocales.contains(locale) && reservedLocales.count == AssetInventory.maximumReservedLocales { if let oldest = reservedLocales.last { await AssetInventory.release(reservedLocale: oldest) } } do { let status = await AssetInventory.status(forModules: [transcriber]) print("status: \(status)") if let installationRequest = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) { try await installationRequest.downloadAndInstall() } } ...
Replies
9
Boosts
0
Views
1.2k
Activity
Apr ’26