I am trying to use the SpeechDetector module in the Speech framework along with SpeechTranscriber, and it is giving me this error:
Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule')
Below is how I am using it:
let speechDetector = Speech.SpeechDetector()

let transcriber = SpeechTranscriber(locale: Locale.current,
                                    transcriptionOptions: [],
                                    reportingOptions: [.volatileResults],
                                    attributeOptions: [.audioTimeRange])

speechAnalyzer = try SpeechAnalyzer(modules: [transcriber, speechDetector])
                    
                  
                      I am trying to get MIDI output from the AU Host demo app using the recent MIDI processor example. The processor works correctly in Logic Pro, but I cannot send MIDI from the AUv3 extension in standalone mode using the default host app to another program (e.g., Ableton).
The MIDI manager, which is part of the standalone host app, works fine, and I can send MIDI using it directly—Ableton receives it without issues. I have already set the midiOutputNames in the extension, and the midiOutBlock is mapped. However, the MIDI data from the AUv3 extension does not reach Ableton in standalone mode. I suspect the issue is that midiOutBlock might never be called in the plugin, or perhaps an input to the plugin is missing, which prevents it from sending MIDI. I am currently using the default routing.
I have modified the MIDI manager such that it works well as described above. Here is a part of my code for SimplePlayEngine.swift and my MIDIManager.swift for reference:
@MainActor
@Observable
public class SimplePlayEngine {
    
    private let midiOutBlock: AUMIDIOutputEventBlock = { sampleTime, cable, length, data in return noErr }
    var scheduleMIDIEventListBlock: AUMIDIEventListBlock? = nil
    
    public init() {
        engine.attach(player)
        
        engine.prepare()
        setupMIDI()
    }
    
    private func setupMIDI() {
        if !MIDIManager.shared.setupPort(midiProtocol: MIDIProtocolID._2_0, receiveBlock: { [weak self] eventList, _ in
            if let scheduleMIDIEventListBlock = self?.scheduleMIDIEventListBlock {
                _ = scheduleMIDIEventListBlock(AUEventSampleTimeImmediate, 0, eventList)
            }
        }) {
            fatalError("Failed to setup Core MIDI")
        }
    }
    
    func initComponent(type: String, subType: String, manufacturer: String) async -> ViewController? {
        reset()
        
        guard let component = AVAudioUnit.findComponent(type: type, subType: subType, manufacturer: manufacturer) else {
            fatalError("Failed to find component with type: \(type), subtype: \(subType), manufacturer: \(manufacturer))" )
        }
        
        do {
            let audioUnit = try await AVAudioUnit.instantiate(
                with: component.audioComponentDescription, options: AudioComponentInstantiationOptions.loadOutOfProcess)
            
            self.avAudioUnit = audioUnit
            self.connect(avAudioUnit: audioUnit)
            return await audioUnit.loadAudioUnitViewController()
        } catch {
            return nil
        }
    }
    
    private func startPlayingInternal() {
        guard let avAudioUnit = self.avAudioUnit else { return }
        setSessionActive(true)
        
        if avAudioUnit.wantsAudioInput { scheduleEffectLoop() }
        
        let hardwareFormat = engine.outputNode.outputFormat(forBus: 0)
        engine.connect(engine.mainMixerNode, to: engine.outputNode, format: hardwareFormat)
        
        do { try engine.start() } catch {
            isPlaying = false
            fatalError("Could not start engine. error: \(error).")
        }
        
        if avAudioUnit.wantsAudioInput { player.play() }
        isPlaying = true
    }
    
    
    private func resetAudioLoop() {
        guard let avAudioUnit = self.avAudioUnit else { return }
        if avAudioUnit.wantsAudioInput {
            guard let format = file?.processingFormat else { fatalError("No AVAudioFile defined.") }
            engine.connect(player, to: engine.mainMixerNode, format: format)
        }
    }
    
    public func connect(avAudioUnit: AVAudioUnit?, completion: @escaping (() -> Void) = {}) {
        guard let avAudioUnit = self.avAudioUnit else { return }
        
        engine.disconnectNodeInput(engine.mainMixerNode)
        resetAudioLoop()
        engine.detach(avAudioUnit)
        
        func rewiringComplete() {
            scheduleMIDIEventListBlock = auAudioUnit.scheduleMIDIEventListBlock
            if isPlaying { player.play() }
            completion()
        }
        
        let hardwareFormat = engine.outputNode.outputFormat(forBus: 0)
        engine.connect(engine.mainMixerNode, to: engine.outputNode, format: hardwareFormat)
        if isPlaying { player.pause() }
        
        let auAudioUnit = avAudioUnit.auAudioUnit
        if !auAudioUnit.midiOutputNames.isEmpty { auAudioUnit.midiOutputEventBlock = midiOutBlock }
        engine.attach(avAudioUnit)
        
        if avAudioUnit.wantsAudioInput {
            engine.disconnectNodeInput(engine.mainMixerNode)
            if let format = file?.processingFormat {
                engine.connect(player, to: avAudioUnit, format: format)
                engine.connect(avAudioUnit, to: engine.mainMixerNode, format: format)
            }
        } else {
            let stereoFormat = AVAudioFormat(standardFormatWithSampleRate: hardwareFormat.sampleRate, channels: 2)
            engine.connect(avAudioUnit, to: engine.mainMixerNode, format: stereoFormat)
        }
        rewiringComplete()
    }
}
And here is my MIDIManager:
@MainActor
class MIDIManager: Identifiable, ObservableObject {
    
    func setupPort(midiProtocol: MIDIProtocolID,
                   receiveBlock: @escaping @Sendable MIDIReceiveBlock) -> Bool {
        guard setupClient() else { return false }
        
        if MIDIInputPortCreateWithProtocol(client, portName, midiProtocol, &port, receiveBlock) != noErr {
            return false
        }
        
        for source in self.sources {
            if MIDIPortConnectSource(port, source, nil) != noErr {
                print("Failed to connect to source \(source)")
                return false
            }
        }
        
        setupVirtualMIDIOutput()
        return true
    }
    
    private func setupVirtualMIDIOutput() {
        let virtualStatus = MIDISourceCreate(client, virtualSourceName, &virtualSource)
        if virtualStatus != noErr {
            print("❌ Failed to create virtual MIDI source: \(virtualStatus)")
        } else {
            print("✅ Created virtual MIDI source: \(virtualSourceName)")
        }
    }
    
    
    func sendMIDIData(_ data: [UInt8]) {
        print("hey")
        var packetList = MIDIPacketList()
        withUnsafeMutablePointer(to: &packetList) { ptr in
            let pkt = MIDIPacketListInit(ptr)
            _ = MIDIPacketListAdd(ptr, 1024, pkt, 0, data.count, data)
            
            if virtualSource != 0 {
                let status = MIDIReceived(virtualSource, ptr)
                if status != noErr {
                    print("❌ Failed to send MIDI data: \(status)")
                } else {
                    print("✅ Sent MIDI data: \(data)")
                }
            }
        }
    }
    
}
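If midiOutputEventBlock is in fact being called, I would expect to have to forward those bytes to the MIDIManager's virtual source instead of returning noErr right away. Here is a minimal sketch of what I have in mind (assuming the AU delivers plain MIDI 1.0 bytes to this block; hopping to the main actor from the render thread is not real-time safe, so this is only for prototyping):

private let midiOutBlock: AUMIDIOutputEventBlock = { sampleTime, cable, length, data in
    // Copy the raw MIDI bytes out of the unsafe pointer before leaving the render thread.
    let bytes = [UInt8](UnsafeBufferPointer(start: data, count: length))
    Task { @MainActor in
        // MIDIManager is @MainActor, so hop to the main actor before touching it.
        MIDIManager.shared.sendMIDIData(bytes)
    }
    return noErr
}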
                    
                  
                
                    
Background: For iOS, I have built a custom video recording app that records video at the highest possible FPS (supposed to be around 240 FPS, but more like ~200 in practice). When I save this video to the user's camera roll, I notice this dynamic playback speed bar. This bar will speed up or slow down the video.
My question: How can I save my video such that this playback speed bar is constantly slow or plays at real-time speed?
For reference, I have included the playback speed bar I am talking about in the screenshot; you can find it in the Photos app when you record slow-motion video.
                    
                  
                
              
                
              
              
                
Topic: Media Technologies
SubTopic: Video
                    
A bit of a novice to app development here, but I have a paid developer account. I have registered the identifier for MusicKit on the developer website (using the bundle identifier I selected in Xcode), but the option to add MusicKit as a capability is not available in Xcode.
I've manually updated the certificates, closed the app and reopened it, started a new project, and tried with a different demo project.
Apologies if I am missing something obvious, but could someone help me get this capability added?
                    
                  
                
                    
Has anyone experienced this bug? When playing a video, the Bluetooth headphones lose audio. My workaround is to select one of the other audio outputs, then go back and select the affected headphones again.
                    
                  
                
              
                
              
              
                
Topic: Media Technologies
SubTopic: Video
                    
                      Some users reported that their images are not loading correctly in our app. After a lot of debugging we identified the following:
This only happens when the app is built for Mac Catalyst, not on iOS, iPadOS, or “real” macOS (AppKit).
The images in question have unusual color spaces. We observed the issue for uRGB and eciRGB v2.
Those images are rendered correctly in Photos and Preview on all platforms.
When displaying the image inside of a UIImageView or in a SwiftUI Image, they render correctly.
The issue only occurs when loading the image via Core Image.
When comparing the different Core Image render graphs between AppKit (working) and Catalyst (faulty) builds, they look identical—except for the result.
Mac (AppKit): [render graph screenshot]
Catalyst: [render graph screenshot]
Something seems to be off when Core Image tries to load an image with a foreign color space in Catalyst.
We identified a workaround: By using a CGImageDestination to transcode the image using the kCGImageDestinationOptimizeColorForSharing option, Image I/O will convert the image to sRGB (or similar) and Core Image is able to load the image correctly. However, one potentially loses fidelity this way.
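For reference, the workaround in code looks roughly like this (a minimal sketch, not our production code; the function name, file-based input/output, and PNG destination type are placeholders):

import Foundation
import ImageIO
import UniformTypeIdentifiers

// Transcode via Image I/O so the color space is converted to sRGB (or similar)
// before Core Image ever sees the file.
func transcodeForSharing(from sourceURL: URL, to destinationURL: URL) -> Bool {
    guard let source = CGImageSourceCreateWithURL(sourceURL as CFURL, nil),
          let destination = CGImageDestinationCreateWithURL(destinationURL as CFURL,
                                                            UTType.png.identifier as CFString,
                                                            1, nil) else {
        return false
    }
    // kCGImageDestinationOptimizeColorForSharing asks Image I/O to convert the color for sharing.
    let options: [CFString: Any] = [kCGImageDestinationOptimizeColorForSharing: true]
    CGImageDestinationAddImageFromSource(destination, source, 0, options as CFDictionary)
    return CGImageDestinationFinalize(destination)
}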
Or might there be a better workaround?
                    
                  
                
              
                
              
              
                
Topic: Media Technologies
SubTopic: Photos & Camera
Tags: Image I/O, Photos and Imaging, Core Image, Core Graphics
                    
                      Hello,
My company has an in-store app with FPS SDK 4.x (1024) keys. We've handed those keys over to a trusted third-party and we do not have them. We've been in-store for several years.
The person who created the keys in our organization mistakenly stored them encrypted to our third party's PGP keys, so we cannot decrypt them, and the third party also has no mechanism to provide us with the keys, even though the keys are in their runtime environment. They only have secure mechanisms for us to upload keys onto their servers.
We are trying to migrate to a different third-party DRM provider, and would like to obtain new keys. Unfortunately, the developer portal won't let me create new keys, saying that we have exceeded the number of keys allowed, which I assume is one.
Additionally, the new DRM provider can only support SDK 4.x keys, and it appears that we can only request SDK 5.x keys on the Apple Developer portal, as the SDK 4.0 option is grayed out. Regardless, it seems that we are not able to request any keys.
We've submitted a request to the support e-mail address and received an automated e-mail that the response should take a few days, but may take longer on occasion. It's now been a month. The e-mail says that the reply address is not monitored. Is there any way we can accelerate this?
Thank you,
Carlos
                    
                  
                
                    
                      Hello,
I’m new here. I'm developing an iOS app and I’d like to know whether it is possible to detect if a phone call is being recorded by another app running in the background.
I’ve already reviewed the documentation for CallKit and AVAudioSession, but I couldn’t find anything related. My expectation was that iOS might provide some callback or API to indicate if a call is being recorded (third-party apps), but so far I haven’t found a way.
My questions are:
Does iOS expose any API to detect if a call is being recorded?
If not, is there any indirect, Apple-policy-compliant method (e.g., microphone usage events) that can be relied upon?
Or is this something that iOS explicitly prevents for privacy reasons?
I'm looking for solutions that align with Apple's policies and would be accepted under the App Store Review Guidelines.
Thanks in advance for any guidance.
                    
                  
                
                    
AVAudioSessionCategoryOptionAllowBluetooth is marked as deprecated in iOS 8 in the iOS 26 beta 5 SDK, even though this option was not deprecated in iOS 18.6. I think this is a mistake and the deprecation actually happens in iOS 26. Am I right?
It seems that the substitute for this option is AVAudioSessionCategoryOptionAllowBluetoothHFP. The documentation does not make clear whether the behaviour is exactly the same or whether any difference should be expected. Has anyone used this option in iOS 26? Should I expect any difference from the current behaviour of AVAudioSessionCategoryOptionAllowBluetooth?
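In the meantime, this is roughly how I plan to adopt it (just a sketch; I'm assuming the Swift spelling of the new option follows the Objective-C constant, i.e. .allowBluetoothHFP):

import AVFoundation

// Sketch: prefer the new HFP option on iOS 26, keep the old one on earlier systems.
// Assumption: .allowBluetoothHFP is the Swift name of AVAudioSessionCategoryOptionAllowBluetoothHFP.
func configureSession() throws {
    let session = AVAudioSession.sharedInstance()
    if #available(iOS 26.0, *) {
        try session.setCategory(.playAndRecord, options: [.allowBluetoothHFP])
    } else {
        try session.setCategory(.playAndRecord, options: [.allowBluetooth])
    }
}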
Thank you.
                    
                  
                
                    
                      When I try to send a DRM-protected video via Airplay to an Apple TV, the license request is made twice instead of once as it normally does on iOS.
We only allow one request per session for security reasons, so the second request fails and the video won't play.
We've tested DRM-protected videos without token usage limits and it works, but this creates a security hole in our system.
Why is the license requested twice in func contentKeySession(_ session: AVContentKeySession, didProvide keyRequest: AVContentKeyRequest)?
Is there a way to prevent this?
                    
                  
                
                    
I am working on updating a live blog in Apple News. As far as I know, there is an update endpoint to update content in Apple News. Is there any feature in Apple News that can trigger an event when the original content is updated, so that I can pull the updated content?
                    
                  
                
                    
                      I'm experiencing a significant limitation with MusicKit's Dolby Atmos implementation on macOS and would appreciate clarification on whether this is intended behavior or if there are solutions available.
When streaming Dolby Atmos content through MusicKit's ApplicationMusicPlayer, the output is limited to 2-channel stereo, even when:
Audio MIDI Setup is configured for 7.1.4 (12-channel) output
The same tracks play in full multichannel through the native Apple Music app
Dolby Atmos is set to "Automatic" in Apple Music preferences
Please let me know if there is any way to enable this. If not, is this documented anywhere? Thanks!
                    
                  
                
                    
                      Two issues:
No matter what I set in
try audioSession.setPreferredSampleRate(x)
the sample rate on both iOS and macOS is always 48000 when the output goes through the speaker, and 24000 when my AirPods connect to an iPhone/iPad.
Now, I'm checking the current output loudness to animate a 3D character, using
mixerNode.installTap(onBus: 0, bufferSize: y, format: nil) { [weak self] buffer, time in
    Task { @MainActor in
        // calculate rms and animate character accordingly
    }
}
This is OK when the sample rate is currently 48000, as 10 buffers per second lead to decent visual results.
But when AirPods connect, the sample rate is 24000, which means only 5 buffers per second, so the character animation looks lame.
My AVAudioEngine setup is the following:
audioEngine.connect(playerNode, to: pitchShiftEffect, format: format)
audioEngine.connect(pitchShiftEffect, to: mixerNode, format: format)
audioEngine.connect(mixerNode, to: audioEngine.outputNode, format: nil)
Now, I'd be fine if the outputNode runs at whatever rate it needs, as long as my tap gets at least 10 buffers per second.
PS: Specifying my preferred format in the tap doesn't change anything either:
let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!
mixerNode.installTap(onBus: 0, bufferSize: y, format: format)
                    
                  
                
                    
                      Hello,
I'm developing an app that displays a photo library using UICollectionView and PHCachingImageManager. I'd like to achieve a user experience similar to the native iOS Photos app, where low-quality images are shown quickly while scrolling, and higher-quality images are loaded for visible cells once scrolling stops.
I'm currently using the following approach:
While Scrolling: I'm using the UICollectionViewDataSourcePrefetching protocol. In the prefetchItemsAt method, I call startCachingImages with low-quality options to cache images in advance.
After Scrolling Stops: In the scrollViewDidEndDecelerating method, I intend to load high-quality images for the currently visible cells.
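For context, here is a minimal sketch of this two-tier approach (class and property names are simplified placeholders, not my actual code):

import UIKit
import Photos

// fastFormat caching while scrolling, highQualityFormat requests for the visible
// cells once scrolling stops. Fetch result, collection view and cell updates are
// assumed to be set up elsewhere.
final class PhotoGridViewController: UIViewController, UICollectionViewDelegate, UICollectionViewDataSourcePrefetching {
    let imageManager = PHCachingImageManager()
    var assets: PHFetchResult<PHAsset>!      // populated elsewhere
    var collectionView: UICollectionView!    // configured elsewhere, delegate/prefetch source set to self
    let thumbnailSize = CGSize(width: 200, height: 200)

    private func requestOptions(_ mode: PHImageRequestOptionsDeliveryMode) -> PHImageRequestOptions {
        let options = PHImageRequestOptions()
        options.deliveryMode = mode
        options.resizeMode = .fast
        return options
    }

    // While scrolling: cache cheap thumbnails ahead of time.
    func collectionView(_ collectionView: UICollectionView, prefetchItemsAt indexPaths: [IndexPath]) {
        let prefetchAssets = indexPaths.map { assets.object(at: $0.item) }
        imageManager.startCachingImages(for: prefetchAssets,
                                        targetSize: thumbnailSize,
                                        contentMode: .aspectFill,
                                        options: requestOptions(.fastFormat))
    }

    // After scrolling stops: upgrade only the visible cells.
    func scrollViewDidEndDecelerating(_ scrollView: UIScrollView) {
        for indexPath in collectionView.indexPathsForVisibleItems {
            let asset = assets.object(at: indexPath.item)
            imageManager.requestImage(for: asset,
                                      targetSize: thumbnailSize,
                                      contentMode: .aspectFill,
                                      options: requestOptions(.highQualityFormat)) { image, _ in
                // Update the visible cell's image view with `image` here.
            }
        }
    }
}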
I have a few questions regarding this approach:
What is the best practice for managing both low-quality and high-quality images efficiently with PHCachingImageManager? Is it correct to call startCachingImages with fastFormat options and then call it again with highQualityFormat in scrollViewDidEndDecelerating?
How can I minimize the delay when a low-quality image is replaced by a high-quality one? Are there any additional strategies to help pre-load high-quality images more effectively?
                    
                  
                
              
                
              
              
                
Topic: Media Technologies
SubTopic: Photos & Camera
                    
                      In my app I use AVAssetReaderTrackOutput to extract PCM audio from a user-provided video or audio file and display it as a waveform.
Recently a user reported that the waveform is not in sync with his video, and after receiving the video I noticed that the waveform is in fact twice as long as the video duration, i.e. it shows the audio in slow motion, so to speak.
Until now I was using
CMFormatDescription.audioStreamBasicDescription.mSampleRate
which for this particular user video returns 22'050. But in this case it seems that this value is wrong... because the audio file has two audio channels with different sample rates, as returned by
CMFormatDescription.audioFormatList.map({ $0.mASBD.mSampleRate })
The first channel has a sample rate of 44'100, the second one 22'050. If I use the first sample rate, the waveform is perfectly in sync with the video.
The problem is that the ratio between the audio data length and (sample rate × audio duration) is 8, double the ratio for the first audio file (4, which matches 16-bit stereo PCM at 4 bytes per frame). In the code below this ratio is given by
Double(length) / (sampleRate * asset.duration.seconds)
When commenting out the line with the sampleRate variable definition in the code below and uncommenting the following line, the ratios for both audio files are 4, which is the expected result. I would expect audioStreamBasicDescription to return the correct sample rate, i.e. the one used by AVAssetReaderTrackOutput, which (I think) somehow merges the stereo tracks. The documentation is sparse, and in particular it's not documented whether the lower or higher sample rate is used; in this case, it seems like the higher one is used, but audioStreamBasicDescription for some reason returns the lower one.
Does anybody know why this is the case or how I should extract the sample rate of the produced PCM audio data? Should I always take the higher one?
I created FB19620455.
let openPanel = NSOpenPanel()
openPanel.allowedContentTypes = [.audiovisualContent]
openPanel.runModal()
let url = openPanel.urls[0]
let asset = AVURLAsset(url: url)
let assetTrack = asset.tracks(withMediaType: .audio)[0]
let assetReader = try! AVAssetReader(asset: asset)
let readerOutput = AVAssetReaderTrackOutput(track: assetTrack, outputSettings: [AVFormatIDKey: Int(kAudioFormatLinearPCM), AVLinearPCMBitDepthKey: 16, AVLinearPCMIsBigEndianKey: false, AVLinearPCMIsFloatKey: false, AVLinearPCMIsNonInterleaved: false])
readerOutput.alwaysCopiesSampleData = false
assetReader.add(readerOutput)
let formatDescriptions = assetTrack.formatDescriptions as! [CMFormatDescription]
let sampleRate = formatDescriptions[0].audioStreamBasicDescription!.mSampleRate
//let sampleRate = formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }).max()!
print(formatDescriptions[0].audioStreamBasicDescription!.mSampleRate)
print(formatDescriptions[0].audioFormatList.map({ $0.mASBD.mSampleRate }))
if !assetReader.startReading() {
    preconditionFailure()
}
var length = 0
while assetReader.status == .reading {
    guard let sampleBuffer = readerOutput.copyNextSampleBuffer(), let blockBuffer = sampleBuffer.dataBuffer else {
        break
    }
    length += blockBuffer.dataLength
}
print(Double(length) / (sampleRate * asset.duration.seconds))
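In the meantime I'm leaning towards a small helper like this (just a sketch based on the observation above, not a confirmed answer):

import CoreMedia

// Prefer the highest sample rate listed in audioFormatList and fall back to
// audioStreamBasicDescription when the list is empty.
func pcmSampleRate(for formatDescription: CMFormatDescription) -> Double? {
    let listedRates = formatDescription.audioFormatList.map { $0.mASBD.mSampleRate }
    return listedRates.max() ?? formatDescription.audioStreamBasicDescription?.mSampleRate
}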
                    
                  
                
                    
                      So experimenting with the new SpeechTranscriber, if I do:
let transcriber = SpeechTranscriber(
    locale: locale,
    transcriptionOptions: [],
    reportingOptions: [.volatileResults],
    attributeOptions: [.audioTimeRange]
)
only the final result has audio time ranges, not the volatile results.
Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.
I'm not presenting the volatile text in the UI at all; I was just trying to keep statistics about the speech and non-speech noise levels, so that I can determine when the noise level falls below the noise floor for a while.
The goal here was to finalize the recording automatically when the noise level indicates that the user has finished speaking.
                    
                  
                
                    
                      So,
I've been wondering how fast an offline STT -> ML Prompt -> TTS roundtrip would be.
Interestingly, for many tests, the SpeechTranscriber (STT) takes the bulk of the time, compared to generating a FoundationModel response and creating the Audio using TTS.
E.g.
        InteractionStatistics:
        - listeningStarted:             21:24:23 4480 2423
        - timeTillFirstAboveNoiseFloor: 01.794
        - timeTillLastNoiseAboveFloor:  02.383
        - timeTillFirstSpeechDetected:  02.399
        - timeTillTranscriptFinalized:  04.510
        - timeTillFirstMLModelResponse: 04.938
        - timeTillMLModelResponse:      05.379
        - timeTillTTSStarted:           04.962
        - timeTillTTSFinished:          11.016
        - speechLength:                 06.054
        - timeToResponse:               02.578
        - transcript:                   This is a test.
        - mlModelResponse:              Sure! I'm ready to help with your test. What do you need help with?
Here, between my audio input ending and the text-to-speech starting to play (using AVSpeechUtterance), the total response time was 2.5s.
Of that time, it took the SpeechAnalyzer 2.1s to finalize the transcript, while the FoundationModel only took 0.4s to respond (and TTS started playing nearly instantly).
I'm already using reportingOptions: [.volatileResults, .fastResults] so it's probably as fast as possible right now?
I'm just surprised the STT takes so much longer compared to the other parts (all being CoreML based, aren't they?)
                    
                  
                
                    
After Picture-in-Picture is opened on the camera interface, the custom UI cannot be displayed. Is there any way to make the custom UI visible? If a custom UI cannot be displayed, how do teleprompter-type apps in the App Store manage to show custom teleprompter text within the camera view?
                    
                  
                
                    
                      Context:
I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup.
I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate.
I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate. When I receive a push in full duplex mode, I set the active participant to the user who is speaking.
Issue
When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block.
Details:
The audio session is already active and configured with .playAndRecord.
The input tap is already installed when the engine is started.
When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio.
Assumptions / Current Setup
Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio.
However, there seems to be a delay before valid input is delivered to the tap, only occurring when switching from a receive state to simultaneously talking.
Questions
Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine?
Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio?
Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer?
Would using separate engines for input and output improve responsiveness?
I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately without the delay seen in my current implementation.
Relevant Code Snippets
Engine Setup
func setup() {
    let input = audioEngine.inputNode
    do {
        try input.setVoiceProcessingEnabled(true)
    } catch {
        print("Could not enable voice processing \(error)")
        return
    }
    input.isVoiceProcessingAGCEnabled = false
    let output = audioEngine.outputNode
    let mainMixer = audioEngine.mainMixerNode
    audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(beepNode, to: mainMixer, format: outputFormat)
    audioEngine.connect(mainMixer, to: output, format: outputFormat)
    // Initialize converters
    converter = AVAudioConverter(from: inputFormat, to: outputFormat)!
    f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)!
    audioEngine.prepare()
}
Input Tap Installation
func installTap() {
    guard AudioHandler.shared.checkMicrophonePermission() else {
        print("Microphone not granted for recording")
        return
    }
    guard !isInputTapped else {
        print("[AudioEngine] Input is already tapped!")
        return
    }
    let input = audioEngine.inputNode
    let microphoneFormat = input.inputFormat(forBus: 0)
    let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)!
    let desiredFormat = outputFormat
    let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate)
    input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in
        guard let self = self else { return }
        // Output buffer: 1920 frames at 16kHz
        guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return }
        outputBuffer.frameLength = outputBuffer.frameCapacity
        let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
            outStatus.pointee = .haveData
            return buffer
        }
        var error: NSError?
        let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
        if converterResult != .haveData {
            DebugLogger.shared.print("Downsample error \(converterResult)")
        } else {
            self.handleDownsampledBuffer(outputBuffer)
        }
    }
    isInputTapped = true
}
                    
                  
                
                    
                      I am developing an app that plays HLS audio.
When using AVPlayerItem with AVURLAsset, can AVAssetResourceLoaderDelegate correctly handle HLS segments?
My goal is to use AVAssetResourceLoaderDelegate to add authentication HTTP headers when accessing HLS .m3u8 and .ts files.
I can successfully download the files, but playback fails with errors.
Specifically, I am observing the following cases:
A. AVAssetResourceLoaderDelegate is canceled, and CoreMediaErrorDomain -12881 occurs
In NSURLConnectionDataDelegate’s didReceiveResponse method, set contentInformationRequest
In didReceiveData, call dataRequest respondWithData
resourceLoader didCancelLoadingRequest is called
CoreMediaErrorDomain -12881 occurs
B. CoreMediaErrorDomain -12881 occurs
In NSURLConnectionDataDelegate’s didReceiveResponse method, set contentInformationRequest
In connection didReceiveData, buffer all received data until the end
In connectionDidFinishLoading, pass the buffered data to respondWithData
Call loadingRequest finishLoading
CoreMediaErrorDomain -12881 occurs
In both cases, dataRequest.requestsAllDataToEndOfResource is YES.
For this use case, I am not using AVURLAssetHTTPHeaderFieldsKey because I need to apply the most up-to-date authentication data at the moment each file is accessed.
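For reference, the overall shape of my loader delegate is roughly this (a simplified Swift/URLSession sketch of the flow described above; my actual code uses NSURLConnection, and the custom scheme handling, header value, and error handling are placeholders):

import AVFoundation
import UniformTypeIdentifiers

// Intercept requests for a custom scheme, re-issue them over HTTPS with the
// current auth header, then hand the data back to AVFoundation.
final class AuthResourceLoaderDelegate: NSObject, AVAssetResourceLoaderDelegate {
    func resourceLoader(_ resourceLoader: AVAssetResourceLoader,
                        shouldWaitForLoadingOfRequestedResource loadingRequest: AVAssetResourceLoadingRequest) -> Bool {
        guard let customURL = loadingRequest.request.url,
              var components = URLComponents(url: customURL, resolvingAgainstBaseURL: false) else {
            return false
        }
        components.scheme = "https"                      // map the custom scheme back to https
        guard let realURL = components.url else { return false }

        var request = URLRequest(url: realURL)
        request.setValue("Bearer <current token>", forHTTPHeaderField: "Authorization")  // placeholder

        URLSession.shared.dataTask(with: request) { data, response, error in
            if let error {
                loadingRequest.finishLoading(with: error)
                return
            }
            guard let data, let httpResponse = response as? HTTPURLResponse else {
                loadingRequest.finishLoading(with: URLError(.badServerResponse))
                return
            }
            if let info = loadingRequest.contentInformationRequest {
                // contentType expects a UTI, so convert the MIME type if possible.
                if let mime = httpResponse.mimeType, let uti = UTType(mimeType: mime) {
                    info.contentType = uti.identifier
                }
                info.contentLength = Int64(data.count)
                info.isByteRangeAccessSupported = true
            }
            loadingRequest.dataRequest?.respond(with: data)
            loadingRequest.finishLoading()
        }.resume()
        return true
    }
}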
I would appreciate any advice or suggestions you might have. Thank you in advance!