Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic

Post

Replies

Boosts

Views

Activity

Execution breakpoint when trying to play a music library file with AVAudioEngine
Hi all, I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect. After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above. After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure. Here is the setupAudioEngine function: private func setupAudioEngine() { do { try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default) try AVAudioSession.sharedInstance().setActive(true) } catch { print("Audio session error: \(error)") } engine.attach(playerNode) engine.attach(analyzer) engine.connect(playerNode, to: analyzer, format: nil) engine.connect(analyzer, to: engine.mainMixerNode, format: nil) analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in self?.processAudioBuffer(buffer) } } Here is the play function: func play(_ mediaItem: MPMediaItem) { guard let assetURL = mediaItem.assetURL else { print("No asset URL for media item") return } stop() do { audioFile = try AVAudioFile(forReading: assetURL) guard let audioFile else { print("Failed to create audio file") return } duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate if !engine.isRunning { try engine.start() } playerNode.scheduleFile(audioFile, at: nil) playerNode.play() DispatchQueue.main.async { [weak self] in self?.isPlaying = true self?.startDisplayLink() } } catch { print("Error playing audio: \(error)") DispatchQueue.main.async { [weak self] in self?.isPlaying = false self?.stopDisplayLink() } } } Here is a link to my test project if you want to try it out for yourself: https://github.com/aabagdi/VisualMan-example Thanks!
8
0
718
Jul ’25
AVAudioSessionCategoryOptionAllowBluetooth incorrectly marked as deprecated in iOS 8 in iOS 26 beta 5
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in iOS 26 beta 5 when this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is in iOS 26. Am I right? It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"? Thank you.
2
0
361
Aug ’25
AVAssetResourceLoaderDelegate for radio stream
Hi everyone, I’m trying to use AVAssetResourceLoaderDelegate to handle a live radio stream (e.g. Icecast/HTTP stream). My goal is to have access to the last 30 seconds of audio data during playback, so I can analyze it for specific audio patterns in near-real-time. I’ve implemented a custom resource loader that works fine for podcasts and static files, where the file size and content length are known. However, for infinite live streams, my current implementation stops receiving new loading requests after the first one is served. As a result, the playback either stalls or fails to continue. Has anyone successfully used AVAssetResourceLoaderDelegate with a continuous radio stream? Or maybe you can suggest betterapproach for buffering and analyzing live audio? Any tips, examples, or advice would be appreciated. Thanks!
0
0
190
Jun ’25
WWDC25 Camera & Photos group lab summary (Part 3 of 3)
(Note: this is part 3 of a 3 part posting. See Part 1 or Part 2) At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Camera & Photos. WWDC25 Camera & Photos group lab ran for one hour at 6 PM PST on Tuesday June 10th, 2025 Question 24 What’s the best approach for optimizing barcode scanning using AVFoundation or Vision in low-light or angled scenarios Turn on flash in low-light scenarios Lower framerate to improve exposure and reduce noise Wait until the capture is in focus/notify your user that they need to get closer Question 25 Recent iPhone models introduced macro mode which automatically switch between lenses to take into account of the focal distance difference. Is there official API to implement this, or should I implement them myself using LiDAR values. Using builtInTripleCamera and builtInDualWideCamera will automatically switch to macro when available Question 26 Is there a way to quickly create a thumbnail after the user selects an image with PhotosPicker? File provider API Additional questions from the WWDC25 in-person labs that occurred later in the WWDC week Question 1 When should I build my custom photo picker instead of using the system one? Always start with the system picker -> try embeddable customization APIs -> fallback to custom picker for very special needs Question 2 I'm building a new camera app for pros and I want to give my users the most un-processed image possible, and the most control over the capture as possible. How can I do that with AVCapture? If stills, Brief Bayer RAW capture overview, or Pro RAW if you want Apple's processing and dynamic range If video, talk about prores LOG. Custom exposure settings are available throguh the apis maybe global/local tonemapping discussion?
0
0
426
Jul ’25
Can I Fade Out Track Volume Before End Using ApplicationMusicPlayer?
I’m building a music app using Apple Music streaming via ApplicationMusicPlayer. My goal is to decrease the volume of the current song during the last 10 seconds, and when the next track begins, restore the volume to its normal level. I know that ApplicationMusicPlayer doesn’t expose a volume API, and I want to avoid triggering the system volume HUD. ✅ Using Apple Music streaming (not local files) ❓ Is it possible to implement per-track fade-out/fade-in logic with ApplicationMusicPlayer? Appreciate any clarification or official guidance!
0
0
142
Jun ’25
SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
3
0
237
Aug ’25
_MediaPlayer_AppIntents compilation error for iOS 26
getting an interesting error attempting to compile my app in Xcode 26 beta. error: Unable to find module dependency: '_MediaPlayer_AppIntents' (in target 'icatcher' from project 'icatcher') note: A dependency of main module 'MainModuleCrossImportOverlays' (in target 'icatcher' from project 'icatcher') Unable to find module dependency: '_MediaPlayer_AppIntents' Not sure what to try and pull to fix this issue
1
1
175
Jun ’25
Couldn't able to hear audio via speaker on ios real device
This is my native module code implementation I'm getting base64 encoded string from server and passing this to my native module of pcm player to play audio App.tsx PcmPlayer.writeChunk(e.data); PcmPlayer.swift import AVFoundation @objc(PcmPlayer) class PcmPlayer: RCTEventEmitter { private var engine: AVAudioEngine? private var playerNode: AVAudioPlayerNode? private var format: AVAudioFormat? private var bufferQueue = [Data]() private var isPlaying = false private var hasEnded = false private var scheduledBufferCount = 0 private let minBufferBytes = 50000 private let pcmQueue = DispatchQueue(label: "pcm.queue") override init() { super.init() } override func supportedEvents() -> [String]! { return ["onStatus", "onMessage"] } @objc(initPlayer:channels:bitsPerSample:) func initPlayer(_ sampleRate: NSNumber, channels: NSNumber, bitsPerSample: NSNumber) { pcmQueue.async { self.stopInternal() let session = AVAudioSession.sharedInstance() do { try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true, options: .notifyOthersOnDeactivation) try session.setMode(.default) print("🔈 Audio session active. Output route:", session.currentRoute.outputs) } catch { print("❌ Audio session setup failed:", error) return } self.engine = AVAudioEngine() self.playerNode = AVAudioPlayerNode() guard let engine = self.engine, let playerNode = self.playerNode else { print("❌ Engine or playerNode is nil") return } engine.attach(playerNode) self.format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: sampleRate.doubleValue, channels: AVAudioChannelCount(channels.uintValue), interleaved: false) guard let format = self.format else { print("❌ Failed to create AVAudioFormat") return } engine.connect(playerNode, to: engine.mainMixerNode, format: format) do { try engine.start() playerNode.play() engine.mainMixerNode.outputVolume = 1.0 print("✅ AVAudioEngine started with format:", format) } catch { print("❌ Engine start failed:", error) } self.hasEnded = false } } @objc(writeChunk:) func writeChunk(_ base64Pcm: String) { pcmQueue.async { guard base64Pcm.count >= 10 else { print("⚠️ Skipping short base64 string") return } var padded = base64Pcm let mod4 = base64Pcm.count % 4 if mod4 > 0 { padded += String(repeating: "=", count: 4 - mod4) } guard let data = Data(base64Encoded: padded, options: .ignoreUnknownCharacters) else { print("❌ Failed to decode base64") return } self.bufferQueue.append(data) print("📥 Received PCM chunk (\(data.count) bytes)") print("📥 writeChunk called. isPlaying=\(self.isPlaying), bufferQueue.count=\(self.bufferQueue.count)") if !self.isPlaying { self.isPlaying = true self.waitForBufferAndStartPlayback() } else if self.scheduledBufferCount == 0 { self.isPlaying = true self.waitForBufferAndStartPlayback() } } } private func waitForBufferAndStartPlayback() { DispatchQueue.global().async { while self.queueSize() < self.minBufferBytes && !self.hasEnded { Thread.sleep(forTimeInterval: 0.01) } self.writeLoop() } } private func writeLoop() { DispatchQueue.global().async { writeLoop: while self.isPlaying { if self.bufferQueue.isEmpty { for _ in 0..<100 { Thread.sleep(forTimeInterval: 0.01) if !self.bufferQueue.isEmpty { break } } if self.bufferQueue.isEmpty { print("🔇 No more data to play after waiting") self.isPlaying = false break writeLoop } } var data: Data? self.pcmQueue.sync { if !self.bufferQueue.isEmpty { data = self.bufferQueue.removeFirst() } } guard let chunk = data else { print("⚠️ No data to process") continue } if let buffer = self.pcmBufferFromData(chunk) { self.scheduledBufferCount += 1 self.playerNode?.scheduleBuffer(buffer, completionHandler: { self.pcmQueue.async { self.scheduledBufferCount -= 1 if self.bufferQueue.isEmpty && self.scheduledBufferCount == 0 { print("ℹ️ Playback idle - waiting for more data") self.isPlaying = false } } }) } } } } private func pcmBufferFromData(_ data: Data) -> AVAudioPCMBuffer? { guard let format = self.format else { return nil } let frameCount = UInt32(data.count / 2) guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else { print("❌ Failed to create AVAudioPCMBuffer") return nil } buffer.frameLength = frameCount guard let floatChannelData = buffer.floatChannelData?[0] else { print("❌ floatChannelData is nil") return nil } data.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) in let int16Buffer = rawBuffer.bindMemory(to: Int16.self) let count = min(int16Buffer.count, Int(frameCount)) for i in 0..<count { floatChannelData[i] = Float32(int16Buffer[i]) / Float32(Int16.max) } } return buffer } @objc(stopPlayer) func stopPlayer() { pcmQueue.async { self.stopInternal() } } private func stopInternal() { print("🛑 stopInternal called") self.playerNode?.stop() self.engine?.stop() self.engine?.reset() self.playerNode = nil self.engine = nil self.format = nil self.bufferQueue.removeAll() self.isPlaying = false self.hasEnded = true self.scheduledBufferCount = 0 } @objc(canWrite:rejecter:) func canWrite(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { resolve(self.bufferQueue.count < 20) } } @objc(flushPlayer:rejecter:) func flushPlayer(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { self.bufferQueue.removeAll() resolve(nil) } } @objc static override func requiresMainQueueSetup() -> Bool { return false } private func queueSize() -> Int { return pcmQueue.sync { return self.bufferQueue.reduce(0) { $0 + $1.count } } } } I couldn't able to hear any audio via my real iOS device also it is working fine on emulator.
0
0
251
Jul ’25
Apple Music API High Error Rate
I am getting high error rates from the Apple Music API. This has been happening for months now, and it is quite frustrating. It is a mix of 404, 504, and random 500 errors. I hit these endpoints all of the time, so it is not like I am hitting a resource that doesn't exist. Why is this happening? Is this a known issue that is getting worked on?
0
0
150
Jun ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
0
0
589
Jul ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
1
0
197
Jul ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
4
0
539
Aug ’25
AirPlay v1 is broken in iOS 18.4?
After upgrading to iOS 18.4, I'm no longer able to establish an AirPlay v1 connection to an audio system. The symptom is that the AirPlay route picker just spins when trying to connect to an audio system. It eventually gives up. I tested this on an iPhone 14, connecting to a HomePod, AirPort express, AppleTV and a Wiim Pro. If I try connecting with AirPlay v2, ex: using Apple Music, the connection succeeds and audio can be played. I'm the developer of an app that plays audio over AirPlay while also recording. My app has to use AirPlay v1 because AvAudioSession doesn't allow the policy .longFormAudio when the category is .playAndRecord. This issue is a real pain as it means my app is suddenly broken for many thousands of users. Is anyone else seeing this issue? Any suggestions for a workaround?
2
3
676
Jun ’25
AVAudioUnit host - PCM buffer output silent
Hi, I just started to develop audio unit hosting support in my application. Offline rendering seems to work except that I hear no output, but why? I suspect with the player goes something wrong. I connect to CoreAudio in a different location in the code. Here are some error messages I faced so far: 2025-08-14 19:42:04.132930+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:42:04.151171+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:43:08.344530+0200 com.gsequencer.GSequencer[34358:18614927] AUAudioUnit.mm:1417 Cannot set maximumFramesToRender while render resources allocated. 2025-08-14 19:43:08.346583+0200 com.gsequencer.GSequencer[34358:18614927] [avae] AVAEInternal.h:104 [AVAudioSequencer.mm:121:-[AVAudioSequencer(AVAudioSequencer_Player) startAndReturnError:]: (impl->Start()): error -10852 ** (<unknown>:34358): WARNING **: 19:43:08.346: error during audio sequencer start - -10852 I have implemented an AVAudioEngine based AudioUnit host. Here I instantiate player and effect: /* audio engine */ audio_engine = [[AVAudioEngine alloc] init]; fx_audio_unit_audio->audio_engine = (gpointer) audio_engine; av_format = (AVAudioFormat *) fx_audio_unit_audio->av_format; /* av audio player node */ av_audio_player_node = [[AVAudioPlayerNode alloc] init]; /* av audio unit */ av_audio_unit_effect = [[AVAudioUnitEffect alloc] initWithAudioComponentDescription:[((AVAudioUnitComponent *) AGS_AUDIO_UNIT_PLUGIN(base_plugin)->component) audioComponentDescription]]; av_audio_unit = (AVAudioUnit *) av_audio_unit_effect; fx_audio_unit_audio->av_audio_unit = av_audio_unit; /* audio sequencer */ av_audio_sequencer = [[AVAudioSequencer alloc] initWithAudioEngine:audio_engine]; fx_audio_unit_audio->av_audio_sequencer = (gpointer) av_audio_sequencer; /* output node */ [[AVAudioOutputNode alloc] init]; /* audio player and audio unit */ [audio_engine attachNode:av_audio_player_node]; [audio_engine attachNode:av_audio_unit]; [audio_engine connect:av_audio_player_node to:av_audio_unit format:av_format]; [audio_engine connect:av_audio_unit to:[audio_engine outputNode] format:av_format]; ns_error = NULL; [audio_engine enableManualRenderingMode:AVAudioEngineManualRenderingModeOffline format:av_format maximumFrameCount:buffer_size error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("enable manual rendering mode error - %d", [ns_error code]); } ns_error = NULL; [[av_audio_unit AUAudioUnit] allocateRenderResourcesAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("Audio Unit allocate render resources returned error - ErrorCode %d", [ns_error code]); } Then I render in a dedicated thread. ns_error = NULL; [audio_engine startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio engine start - %d", [ns_error code]); } [av_audio_sequencer prepareToPlay]; ns_error = NULL; [av_audio_sequencer startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio sequencer start - %d", [ns_error code]); } [av_audio_player_node play]; while(is_running){ /* pre sync */ /* IO buffers */ av_output_buffer = (AVAudioPCMBuffer *) scope_data->av_output_buffer; av_input_buffer = (AVAudioPCMBuffer *) scope_data->av_input_buffer; /* fill input buffer */ /* schedule av input buffer */ frame_position = 0; // (gint64) ((note_offset * absolute_delay) + delay_counter) * buffer_size; av_audio_player_node = (AVAudioPlayerNode *) fx_audio_unit_audio->av_audio_player_node; AVAudioTime *av_audio_time = [[AVAudioTime alloc] initWithHostTime:frame_position sampleTime:frame_position atRate:((double) samplerate)]; [av_audio_player_node scheduleBuffer:av_input_buffer atTime:av_audio_time options:0 completionHandler:nil]; /* render */ ns_error = NULL; status = [audio_engine renderOffline:AGS_FX_AUDIO_UNIT_AUDIO_FIXED_BUFFER_SIZE toBuffer:av_output_buffer error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("render offline error - %d", [ns_error code]); } } regards, Joël
3
0
555
Aug ’25
Execution breakpoint when trying to play a music library file with AVAudioEngine
Hi all, I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect. After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above. After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure. Here is the setupAudioEngine function: private func setupAudioEngine() { do { try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default) try AVAudioSession.sharedInstance().setActive(true) } catch { print("Audio session error: \(error)") } engine.attach(playerNode) engine.attach(analyzer) engine.connect(playerNode, to: analyzer, format: nil) engine.connect(analyzer, to: engine.mainMixerNode, format: nil) analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in self?.processAudioBuffer(buffer) } } Here is the play function: func play(_ mediaItem: MPMediaItem) { guard let assetURL = mediaItem.assetURL else { print("No asset URL for media item") return } stop() do { audioFile = try AVAudioFile(forReading: assetURL) guard let audioFile else { print("Failed to create audio file") return } duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate if !engine.isRunning { try engine.start() } playerNode.scheduleFile(audioFile, at: nil) playerNode.play() DispatchQueue.main.async { [weak self] in self?.isPlaying = true self?.startDisplayLink() } } catch { print("Error playing audio: \(error)") DispatchQueue.main.async { [weak self] in self?.isPlaying = false self?.stopDisplayLink() } } } Here is a link to my test project if you want to try it out for yourself: https://github.com/aabagdi/VisualMan-example Thanks!
Replies
8
Boosts
0
Views
718
Activity
Jul ’25
AVAudioSessionCategoryOptionAllowBluetooth incorrectly marked as deprecated in iOS 8 in iOS 26 beta 5
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in iOS 26 beta 5 when this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation is in iOS 26. Am I right? It seems that the substitute for this option is "AVAudioSessionCategoryOptionAllowBluetoothHFP". The documentation does not make clear if the behaviour is exactly the same or if any difference should be expected... Has anyone used this option in iOS 26? Should I expect any difference with the current behaviour of "AVAudioSessionCategoryOptionAllowBluetooth"? Thank you.
Replies
2
Boosts
0
Views
361
Activity
Aug ’25
AVAssetResourceLoaderDelegate for radio stream
Hi everyone, I’m trying to use AVAssetResourceLoaderDelegate to handle a live radio stream (e.g. Icecast/HTTP stream). My goal is to have access to the last 30 seconds of audio data during playback, so I can analyze it for specific audio patterns in near-real-time. I’ve implemented a custom resource loader that works fine for podcasts and static files, where the file size and content length are known. However, for infinite live streams, my current implementation stops receiving new loading requests after the first one is served. As a result, the playback either stalls or fails to continue. Has anyone successfully used AVAssetResourceLoaderDelegate with a continuous radio stream? Or maybe you can suggest betterapproach for buffering and analyzing live audio? Any tips, examples, or advice would be appreciated. Thanks!
Replies
0
Boosts
0
Views
190
Activity
Jun ’25
When saving a 48MP image, CGImageDestinationFinalize consumes a large amount of memory. Are there any optimization methods available?
it will use about 300MB memory, it cause a memory peak
Replies
1
Boosts
0
Views
520
Activity
Jul ’25
WWDC25 Camera & Photos group lab summary (Part 3 of 3)
(Note: this is part 3 of a 3 part posting. See Part 1 or Part 2) At WWDC25 we launched a new type of Lab event for the developer community - Group Labs. A Group Lab is a panel Q&A designed for a large audience of developers. Group Labs are a unique opportunity for the community to submit questions directly to a panel of Apple engineers and designers. Here are the highlights from the WWDC25 Group Lab for Camera & Photos. WWDC25 Camera & Photos group lab ran for one hour at 6 PM PST on Tuesday June 10th, 2025 Question 24 What’s the best approach for optimizing barcode scanning using AVFoundation or Vision in low-light or angled scenarios Turn on flash in low-light scenarios Lower framerate to improve exposure and reduce noise Wait until the capture is in focus/notify your user that they need to get closer Question 25 Recent iPhone models introduced macro mode which automatically switch between lenses to take into account of the focal distance difference. Is there official API to implement this, or should I implement them myself using LiDAR values. Using builtInTripleCamera and builtInDualWideCamera will automatically switch to macro when available Question 26 Is there a way to quickly create a thumbnail after the user selects an image with PhotosPicker? File provider API Additional questions from the WWDC25 in-person labs that occurred later in the WWDC week Question 1 When should I build my custom photo picker instead of using the system one? Always start with the system picker -> try embeddable customization APIs -> fallback to custom picker for very special needs Question 2 I'm building a new camera app for pros and I want to give my users the most un-processed image possible, and the most control over the capture as possible. How can I do that with AVCapture? If stills, Brief Bayer RAW capture overview, or Pro RAW if you want Apple's processing and dynamic range If video, talk about prores LOG. Custom exposure settings are available throguh the apis maybe global/local tonemapping discussion?
Replies
0
Boosts
0
Views
426
Activity
Jul ’25
Can I Fade Out Track Volume Before End Using ApplicationMusicPlayer?
I’m building a music app using Apple Music streaming via ApplicationMusicPlayer. My goal is to decrease the volume of the current song during the last 10 seconds, and when the next track begins, restore the volume to its normal level. I know that ApplicationMusicPlayer doesn’t expose a volume API, and I want to avoid triggering the system volume HUD. ✅ Using Apple Music streaming (not local files) ❓ Is it possible to implement per-track fade-out/fade-in logic with ApplicationMusicPlayer? Appreciate any clarification or official guidance!
Replies
0
Boosts
0
Views
142
Activity
Jun ’25
SpeechTranscriber not providing audioTimeRange for most results
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes. Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.
Replies
3
Boosts
0
Views
237
Activity
Aug ’25
_MediaPlayer_AppIntents compilation error for iOS 26
getting an interesting error attempting to compile my app in Xcode 26 beta. error: Unable to find module dependency: '_MediaPlayer_AppIntents' (in target 'icatcher' from project 'icatcher') note: A dependency of main module 'MainModuleCrossImportOverlays' (in target 'icatcher' from project 'icatcher') Unable to find module dependency: '_MediaPlayer_AppIntents' Not sure what to try and pull to fix this issue
Replies
1
Boosts
1
Views
175
Activity
Jun ’25
AVCaptureMetadataOutput .face Not Triggering on iOS 26 – Regression or Intended Change?
Issue: In iOS 26 (tested on Developer Beta), AVCaptureMetadataOutputObjectsDelegate no longer receives callbacks when using .face detection. metadataOutput.metadataObjectTypes = [.face]
Replies
1
Boosts
7
Views
400
Activity
Jun ’25
AutoMix Api Available in MusicKit
Is there any way for me to use an AutoMix api in my IOS apps, I would play tracks using the Apple Music api and use AutoMix to attempt to merge tracks. Is this feature/api available to developers.
Replies
0
Boosts
0
Views
155
Activity
Jun ’25
React website working properly in Android but not in Iphone
We have a React website build to scan qr codes. The website is properly working for Android devices but for Iphone we see a camera glitch causing delay in scan which is unexpected. Website URL : https://react-qr-code-scanner-app.vercel.app/
Replies
0
Boosts
0
Views
60
Activity
Jul ’25
Couldn't able to hear audio via speaker on ios real device
This is my native module code implementation I'm getting base64 encoded string from server and passing this to my native module of pcm player to play audio App.tsx PcmPlayer.writeChunk(e.data); PcmPlayer.swift import AVFoundation @objc(PcmPlayer) class PcmPlayer: RCTEventEmitter { private var engine: AVAudioEngine? private var playerNode: AVAudioPlayerNode? private var format: AVAudioFormat? private var bufferQueue = [Data]() private var isPlaying = false private var hasEnded = false private var scheduledBufferCount = 0 private let minBufferBytes = 50000 private let pcmQueue = DispatchQueue(label: "pcm.queue") override init() { super.init() } override func supportedEvents() -> [String]! { return ["onStatus", "onMessage"] } @objc(initPlayer:channels:bitsPerSample:) func initPlayer(_ sampleRate: NSNumber, channels: NSNumber, bitsPerSample: NSNumber) { pcmQueue.async { self.stopInternal() let session = AVAudioSession.sharedInstance() do { try session.setCategory(.playback, mode: .default, options: []) try session.setActive(true, options: .notifyOthersOnDeactivation) try session.setMode(.default) print("🔈 Audio session active. Output route:", session.currentRoute.outputs) } catch { print("❌ Audio session setup failed:", error) return } self.engine = AVAudioEngine() self.playerNode = AVAudioPlayerNode() guard let engine = self.engine, let playerNode = self.playerNode else { print("❌ Engine or playerNode is nil") return } engine.attach(playerNode) self.format = AVAudioFormat(commonFormat: .pcmFormatFloat32, sampleRate: sampleRate.doubleValue, channels: AVAudioChannelCount(channels.uintValue), interleaved: false) guard let format = self.format else { print("❌ Failed to create AVAudioFormat") return } engine.connect(playerNode, to: engine.mainMixerNode, format: format) do { try engine.start() playerNode.play() engine.mainMixerNode.outputVolume = 1.0 print("✅ AVAudioEngine started with format:", format) } catch { print("❌ Engine start failed:", error) } self.hasEnded = false } } @objc(writeChunk:) func writeChunk(_ base64Pcm: String) { pcmQueue.async { guard base64Pcm.count >= 10 else { print("⚠️ Skipping short base64 string") return } var padded = base64Pcm let mod4 = base64Pcm.count % 4 if mod4 > 0 { padded += String(repeating: "=", count: 4 - mod4) } guard let data = Data(base64Encoded: padded, options: .ignoreUnknownCharacters) else { print("❌ Failed to decode base64") return } self.bufferQueue.append(data) print("📥 Received PCM chunk (\(data.count) bytes)") print("📥 writeChunk called. isPlaying=\(self.isPlaying), bufferQueue.count=\(self.bufferQueue.count)") if !self.isPlaying { self.isPlaying = true self.waitForBufferAndStartPlayback() } else if self.scheduledBufferCount == 0 { self.isPlaying = true self.waitForBufferAndStartPlayback() } } } private func waitForBufferAndStartPlayback() { DispatchQueue.global().async { while self.queueSize() < self.minBufferBytes && !self.hasEnded { Thread.sleep(forTimeInterval: 0.01) } self.writeLoop() } } private func writeLoop() { DispatchQueue.global().async { writeLoop: while self.isPlaying { if self.bufferQueue.isEmpty { for _ in 0..<100 { Thread.sleep(forTimeInterval: 0.01) if !self.bufferQueue.isEmpty { break } } if self.bufferQueue.isEmpty { print("🔇 No more data to play after waiting") self.isPlaying = false break writeLoop } } var data: Data? self.pcmQueue.sync { if !self.bufferQueue.isEmpty { data = self.bufferQueue.removeFirst() } } guard let chunk = data else { print("⚠️ No data to process") continue } if let buffer = self.pcmBufferFromData(chunk) { self.scheduledBufferCount += 1 self.playerNode?.scheduleBuffer(buffer, completionHandler: { self.pcmQueue.async { self.scheduledBufferCount -= 1 if self.bufferQueue.isEmpty && self.scheduledBufferCount == 0 { print("ℹ️ Playback idle - waiting for more data") self.isPlaying = false } } }) } } } } private func pcmBufferFromData(_ data: Data) -> AVAudioPCMBuffer? { guard let format = self.format else { return nil } let frameCount = UInt32(data.count / 2) guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else { print("❌ Failed to create AVAudioPCMBuffer") return nil } buffer.frameLength = frameCount guard let floatChannelData = buffer.floatChannelData?[0] else { print("❌ floatChannelData is nil") return nil } data.withUnsafeBytes { (rawBuffer: UnsafeRawBufferPointer) in let int16Buffer = rawBuffer.bindMemory(to: Int16.self) let count = min(int16Buffer.count, Int(frameCount)) for i in 0..<count { floatChannelData[i] = Float32(int16Buffer[i]) / Float32(Int16.max) } } return buffer } @objc(stopPlayer) func stopPlayer() { pcmQueue.async { self.stopInternal() } } private func stopInternal() { print("🛑 stopInternal called") self.playerNode?.stop() self.engine?.stop() self.engine?.reset() self.playerNode = nil self.engine = nil self.format = nil self.bufferQueue.removeAll() self.isPlaying = false self.hasEnded = true self.scheduledBufferCount = 0 } @objc(canWrite:rejecter:) func canWrite(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { resolve(self.bufferQueue.count < 20) } } @objc(flushPlayer:rejecter:) func flushPlayer(_ resolve: @escaping RCTPromiseResolveBlock, rejecter reject: RCTPromiseRejectBlock) { pcmQueue.async { self.bufferQueue.removeAll() resolve(nil) } } @objc static override func requiresMainQueueSetup() -> Bool { return false } private func queueSize() -> Int { return pcmQueue.sync { return self.bufferQueue.reduce(0) { $0 + $1.count } } } } I couldn't able to hear any audio via my real iOS device also it is working fine on emulator.
Replies
0
Boosts
0
Views
251
Activity
Jul ’25
Apple Music API High Error Rate
I am getting high error rates from the Apple Music API. This has been happening for months now, and it is quite frustrating. It is a mix of 404, 504, and random 500 errors. I hit these endpoints all of the time, so it is not like I am hitting a resource that doesn't exist. Why is this happening? Is this a known issue that is getting worked on?
Replies
0
Boosts
0
Views
150
Activity
Jun ’25
occasional glitches and empty buffers when using AudioFileStream + AVAudioConverter
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns. Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally. var bufferedData = Data() func parseBytes(data: Data) { bufferedData.append(data) // XXX: this buffering reduces glitching // to rather infrequent. But why? if bufferedData.count > 32768 { bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in guard let baseAddress = bytes.baseAddress else { return } let result = AudioFileStreamParseBytes(audioStream!, UInt32(bufferedData.count), baseAddress, []) if result != noErr { print("❌ error parsing stream: \(result)") } } bufferedData = Data() } } No errors are returned by AudioFileStream or AVAudioConverter. func handlePackets(data: Data, packetDescriptions: [AudioStreamPacketDescription]) { guard let audioConverter else { return } var maxPacketSize: UInt32 = 0 for packetDescription in packetDescriptions { maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize) if packetDescription.mDataByteSize == 0 { print("EMPTY PACKET") } if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count { print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)") } } let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize)) bufferIn.byteLength = UInt32(data.count) for i in 0 ..< Int(packetDescriptions.count) { bufferIn.packetDescriptions![i] = packetDescriptions[i] } bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count) _ = data.withUnsafeBytes { ptr in memcpy(bufferIn.data, ptr.baseAddress, data.count) } if verbose { print("handlePackets: \(data.count) bytes") } // Setup input provider closure var inputProvided = false let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in if !inputProvided { inputProvided = true statusPtr.pointee = .haveData return bufferIn } else { statusPtr.pointee = .noDataNow return nil } } // Loop until converter runs dry or is done while true { let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)! bufferOut.frameLength = 0 var error: NSError? let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock) switch status { case .haveData: if verbose { print("✅ convert returned haveData: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } case .inputRanDry: if verbose { print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames") } if bufferOut.frameLength > 0 { if bufferOut.isSilent { print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)") } outBuffers.append(bufferOut) totalFrames += Int(bufferOut.frameLength) } return // wait for next handlePackets case .endOfStream: if verbose { print("✅ convert returned endOfStream") } return case .error: if verbose { print("❌ convert returned error") } if let error = error { print("error converting: \(error.localizedDescription)") } return @unknown default: fatalError() } } }
Replies
0
Boosts
0
Views
589
Activity
Jul ’25
Regarding the issue of obtaining input channels for aggregated devices
I found that the aggregated device correctly obtains input channels in the standard microphone mode. However, in voice isolation mode, it only retrieves channels from the first sub-device in the aggregated device's list. If I want to properly obtain channel information in voice isolation mode, how should I do it?
Replies
0
Boosts
0
Views
546
Activity
Jun ’25
In Speech framework is SFTranscriptionSegment timing supposed to be off and speechRecognitionMetadata nil until isFinal?
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files. Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing. I've stripped my class down to the following: private var isAuthed = false // I call this in a .task {} in my SwiftUI View public func requestSpeechRecognizerPermission() { SFSpeechRecognizer.requestAuthorization { authStatus in Task { self.isAuthed = authStatus == .authorized } } } public func transcribe(from url: URL) { guard isAuthed else { return } let locale = Locale(identifier: "en-US") let recognizer = SFSpeechRecognizer(locale: locale) let recognitionRequest = SFSpeechURLRecognitionRequest(url: url) // the behaviour occurs whether I set this to true or not, I recently set // it to true to see if it made a difference recognizer?.supportsOnDeviceRecognition = true recognitionRequest.shouldReportPartialResults = true recognitionRequest.addsPunctuation = true recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in guard result != nil else { return } if result!.isFinal { //speechRecognitionMetadata is not nil } else { //speechRecognitionMetadata is nil } } } } Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true. The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values. The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
Replies
1
Boosts
0
Views
197
Activity
Jul ’25
How to toggle usb device
Can i use iokit usb lib to disable build-in camera?
Replies
4
Boosts
0
Views
310
Activity
Jun ’25
Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context: I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup. I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate. I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking. Issue When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block. Details: The audio session is already active and configured with .playAndRecord. The input tap is already installed when the engine is started. When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio. Assumptions / Current Setup Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking. Questions Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine? Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio? Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer? Would using separate engines for input and output improve responsiveness? I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation. Relevant Code Snippets Engine Setup func setup() { let input = audioEngine.inputNode do { try input.setVoiceProcessingEnabled(true) } catch { print("Could not enable voice processing \(error)") return } input.isVoiceProcessingAGCEnabled = false let output = audioEngine.outputNode let mainMixer = audioEngine.mainMixerNode audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat) audioEngine.connect(beepNode, to: mainMixer, format: outputFormat) audioEngine.connect(mainMixer, to: output, format: outputFormat) // Initialize converters converter = AVAudioConverter(from: inputFormat, to: outputFormat)! f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)! audioEngine.prepare() } Input Tap Installation func installTap() { guard AudioHandler.shared.checkMicrophonePermission() else { print("Microphone not granted for recording") return } guard !isInputTapped else { print("[AudioEngine] Input is already tapped!") return } let input = audioEngine.inputNode let microphoneFormat = input.inputFormat(forBus: 0) let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)! let desiredFormat = outputFormat let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate) input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in guard let self = self else { return } // Output buffer: 1920 frames at 16kHz guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return } outputBuffer.frameLength = outputBuffer.frameCapacity let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in outStatus.pointee = .haveData return buffer } var error: NSError? let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock) if converterResult != .haveData { DebugLogger.shared.print("Downsample error \(converterResult)") } else { self.handleDownsampledBuffer(outputBuffer) } } isInputTapped = true }
Replies
4
Boosts
0
Views
539
Activity
Aug ’25
AirPlay v1 is broken in iOS 18.4?
After upgrading to iOS 18.4, I'm no longer able to establish an AirPlay v1 connection to an audio system. The symptom is that the AirPlay route picker just spins when trying to connect to an audio system. It eventually gives up. I tested this on an iPhone 14, connecting to a HomePod, AirPort express, AppleTV and a Wiim Pro. If I try connecting with AirPlay v2, ex: using Apple Music, the connection succeeds and audio can be played. I'm the developer of an app that plays audio over AirPlay while also recording. My app has to use AirPlay v1 because AvAudioSession doesn't allow the policy .longFormAudio when the category is .playAndRecord. This issue is a real pain as it means my app is suddenly broken for many thousands of users. Is anyone else seeing this issue? Any suggestions for a workaround?
Replies
2
Boosts
3
Views
676
Activity
Jun ’25
AVAudioUnit host - PCM buffer output silent
Hi, I just started to develop audio unit hosting support in my application. Offline rendering seems to work except that I hear no output, but why? I suspect with the player goes something wrong. I connect to CoreAudio in a different location in the code. Here are some error messages I faced so far: 2025-08-14 19:42:04.132930+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:42:04.151171+0200 com.gsequencer.GSequencer[34358:18611871] [avae] AVAudioEngineGraph.mm:4668 Can't retrieve source node to play sequence because there is no output node! 2025-08-14 19:43:08.344530+0200 com.gsequencer.GSequencer[34358:18614927] AUAudioUnit.mm:1417 Cannot set maximumFramesToRender while render resources allocated. 2025-08-14 19:43:08.346583+0200 com.gsequencer.GSequencer[34358:18614927] [avae] AVAEInternal.h:104 [AVAudioSequencer.mm:121:-[AVAudioSequencer(AVAudioSequencer_Player) startAndReturnError:]: (impl->Start()): error -10852 ** (<unknown>:34358): WARNING **: 19:43:08.346: error during audio sequencer start - -10852 I have implemented an AVAudioEngine based AudioUnit host. Here I instantiate player and effect: /* audio engine */ audio_engine = [[AVAudioEngine alloc] init]; fx_audio_unit_audio->audio_engine = (gpointer) audio_engine; av_format = (AVAudioFormat *) fx_audio_unit_audio->av_format; /* av audio player node */ av_audio_player_node = [[AVAudioPlayerNode alloc] init]; /* av audio unit */ av_audio_unit_effect = [[AVAudioUnitEffect alloc] initWithAudioComponentDescription:[((AVAudioUnitComponent *) AGS_AUDIO_UNIT_PLUGIN(base_plugin)->component) audioComponentDescription]]; av_audio_unit = (AVAudioUnit *) av_audio_unit_effect; fx_audio_unit_audio->av_audio_unit = av_audio_unit; /* audio sequencer */ av_audio_sequencer = [[AVAudioSequencer alloc] initWithAudioEngine:audio_engine]; fx_audio_unit_audio->av_audio_sequencer = (gpointer) av_audio_sequencer; /* output node */ [[AVAudioOutputNode alloc] init]; /* audio player and audio unit */ [audio_engine attachNode:av_audio_player_node]; [audio_engine attachNode:av_audio_unit]; [audio_engine connect:av_audio_player_node to:av_audio_unit format:av_format]; [audio_engine connect:av_audio_unit to:[audio_engine outputNode] format:av_format]; ns_error = NULL; [audio_engine enableManualRenderingMode:AVAudioEngineManualRenderingModeOffline format:av_format maximumFrameCount:buffer_size error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("enable manual rendering mode error - %d", [ns_error code]); } ns_error = NULL; [[av_audio_unit AUAudioUnit] allocateRenderResourcesAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("Audio Unit allocate render resources returned error - ErrorCode %d", [ns_error code]); } Then I render in a dedicated thread. ns_error = NULL; [audio_engine startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio engine start - %d", [ns_error code]); } [av_audio_sequencer prepareToPlay]; ns_error = NULL; [av_audio_sequencer startAndReturnError:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("error during audio sequencer start - %d", [ns_error code]); } [av_audio_player_node play]; while(is_running){ /* pre sync */ /* IO buffers */ av_output_buffer = (AVAudioPCMBuffer *) scope_data->av_output_buffer; av_input_buffer = (AVAudioPCMBuffer *) scope_data->av_input_buffer; /* fill input buffer */ /* schedule av input buffer */ frame_position = 0; // (gint64) ((note_offset * absolute_delay) + delay_counter) * buffer_size; av_audio_player_node = (AVAudioPlayerNode *) fx_audio_unit_audio->av_audio_player_node; AVAudioTime *av_audio_time = [[AVAudioTime alloc] initWithHostTime:frame_position sampleTime:frame_position atRate:((double) samplerate)]; [av_audio_player_node scheduleBuffer:av_input_buffer atTime:av_audio_time options:0 completionHandler:nil]; /* render */ ns_error = NULL; status = [audio_engine renderOffline:AGS_FX_AUDIO_UNIT_AUDIO_FIXED_BUFFER_SIZE toBuffer:av_output_buffer error:&ns_error]; if(ns_error != NULL && [ns_error code] != noErr){ g_warning("render offline error - %d", [ns_error code]); } } regards, Joël
Replies
3
Boosts
0
Views
555
Activity
Aug ’25