Explore the integration of media technologies within your app. Discuss working with audio, video, camera, and other media functionalities.

All subtopics
Posts under Media Technologies topic
Each post below is followed by a stats line: replies · boosts · views · latest activity.

How to use the SpeechDetector Module
I am trying to use the SpeechDetector module in the Speech framework along with SpeechTranscriber, and it is giving me the error:

    Cannot convert value of type 'SpeechDetector' to expected element type 'Array.ArrayLiteralElement' (aka 'any SpeechModule')

Below is how I am using it:

    let speechDetector = Speech.SpeechDetector()
    let transcriber = SpeechTranscriber(locale: Locale.current,
                                        transcriptionOptions: [],
                                        reportingOptions: [.volatileResults],
                                        attributeOptions: [.audioTimeRange])
    speechAnalyzer = try SpeechAnalyzer(modules: [transcriber, speechDetector])
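A possible direction, sketched purely from the error text (untested; the diagnostic names 'any SpeechModule' as the expected element type, so the assumption is that an explicitly typed existential array lets the two module kinds mix in one literal):

    // Untested sketch: give the array an explicit [any SpeechModule]
    // element type instead of relying on literal inference.
    let modules: [any SpeechModule] = [transcriber, speechDetector]
    speechAnalyzer = try SpeechAnalyzer(modules: modules)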
4 replies · 2 boosts · 292 views · 3w

SFSpeechRecognizer is broken on iOS 18
Hello, I noticed that SFSpeechRecognizer is broken on iOS 18. During a recognition task, it keeps dropping the recognized text on every pause. For example, if you say "how are you fine", it drops the "how are you" part and only gives you "fine" as the result.

Say "how are you <pause> fine":

    // iOS 17 ✅ (perfect final result)
    How
    How are
    How are you
    How are you.
    How are you. Fine.

    // iOS 18 ❌ (the text before the pause is dropped, and it fails to recognize the punctuation)
    How
    How are
    How are you
    How are you
    Fine

Reproducing the issue:
Download the official sample project.
Run it on an iOS 18 device or simulator.
Say "how are you fine".
Only "fine" will be displayed.
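If the regression can't wait for an OS fix, one client-side direction (an untested heuristic sketch, not an official workaround) is to stitch segments back together yourself whenever a new partial hypothesis stops extending the previous one:

    import Speech

    // Sketch: stitch segments together when the recognizer restarts its
    // hypothesis after a pause. Assumes partials within one segment only
    // grow; feed this from your recognitionTask's result handler.
    final class StitchingTranscriber {
        private var committed = ""
        private var lastPartial = ""

        func handle(_ result: SFSpeechRecognitionResult) {
            let current = result.bestTranscription.formattedString
            // A new hypothesis that doesn't extend the previous one means
            // the text before the pause is about to be dropped: commit it.
            if !lastPartial.isEmpty, !current.hasPrefix(lastPartial) {
                committed += lastPartial + " "
            }
            lastPartial = current
            print("Stitched transcript: \(committed + current)")
        }
    }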
4 replies · 4 boosts · 1.3k views · Oct ’24

Delay in Microphone Input When Talking While Receiving Audio in PTT Framework (Full Duplex Mode)
Context:
I am currently developing an app using the Push-to-Talk (PTT) framework. I have reviewed both the PTT framework documentation and the CallKit demo project to better understand how to properly manage audio session activation and AVAudioEngine setup.
- I am not activating the audio session manually. The audio session configuration is handled in the incomingPushResult or didBeginTransmitting callbacks from the PTChannelManagerDelegate.
- I am using a single AVAudioEngine instance for both input and playback. The engine is started in the didActivate callback from the PTChannelManagerDelegate.
- When I receive a push in full duplex mode, I set the active participant to the user who is speaking.

Issue:
When I attempt to talk while the other participant is already speaking, my input tap on the input node takes a few seconds to return valid PCM audio data. Initially, it returns an empty PCM audio block.

Details:
- The audio session is already active and configured with .playAndRecord.
- The input tap is already installed when the engine is started.
- When I talk from a neutral state (no one is speaking), the system plays the standard "microphone activation" tone, which covers this initial delay. However, this does not happen when I am already receiving audio.

Assumptions / Current Setup:
Because the audio session is active in play and record, I assumed that microphone input would be available immediately, even while receiving audio. However, there seems to be a delay before valid input is delivered to the tap, occurring only when switching from a receive state to simultaneously talking.

Questions:
- Is this expected behavior when using the PTT framework in full duplex mode with a shared AVAudioEngine?
- Should I be restarting or reconfiguring the engine or audio session when beginning to talk while receiving audio?
- Is there a recommended pattern for managing microphone readiness in this scenario to avoid the initial empty PCM buffer?
- Would using separate engines for input and output improve responsiveness? (See the sketch after the code below.)

I would like to confirm the correct approach to handling simultaneous talk and receive in full duplex mode using the PTT framework and AVAudioEngine. Specifically, I need guidance on ensuring the microphone is ready to capture audio immediately, without the delay seen in my current implementation.

Relevant Code Snippets

Engine setup:

    func setup() {
        let input = audioEngine.inputNode
        do {
            try input.setVoiceProcessingEnabled(true)
        } catch {
            print("Could not enable voice processing \(error)")
            return
        }
        input.isVoiceProcessingAGCEnabled = false
        let output = audioEngine.outputNode
        let mainMixer = audioEngine.mainMixerNode
        audioEngine.connect(pttPlayerNode, to: mainMixer, format: outputFormat)
        audioEngine.connect(beepNode, to: mainMixer, format: outputFormat)
        audioEngine.connect(mainMixer, to: output, format: outputFormat)
        // Initialize converters
        converter = AVAudioConverter(from: inputFormat, to: outputFormat)!
        f32ToInt16Converter = AVAudioConverter(from: outputFormat, to: inputFormat)!
        audioEngine.prepare()
    }

Input tap installation:

    func installTap() {
        guard AudioHandler.shared.checkMicrophonePermission() else {
            print("Microphone not granted for recording")
            return
        }
        guard !isInputTapped else {
            print("[AudioEngine] Input is already tapped!")
            return
        }
        let input = audioEngine.inputNode
        let microphoneFormat = input.inputFormat(forBus: 0)
        let microphoneDownsampler = AVAudioConverter(from: microphoneFormat, to: outputFormat)!
        let desiredFormat = outputFormat
        let inputFramesNeeded = AVAudioFrameCount((Double(OpusCodec.DECODED_PACKET_NUM_SAMPLES) * microphoneFormat.sampleRate) / desiredFormat.sampleRate)
        input.installTap(onBus: 0, bufferSize: inputFramesNeeded, format: input.inputFormat(forBus: 0)) { [weak self] buffer, when in
            guard let self = self else { return }
            // Output buffer: 1920 frames at 16kHz
            guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: desiredFormat, frameCapacity: AVAudioFrameCount(OpusCodec.DECODED_PACKET_NUM_SAMPLES)) else { return }
            outputBuffer.frameLength = outputBuffer.frameCapacity
            let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
                outStatus.pointee = .haveData
                return buffer
            }
            var error: NSError?
            let converterResult = microphoneDownsampler.convert(to: outputBuffer, error: &error, withInputFrom: inputBlock)
            if converterResult != .haveData {
                DebugLogger.shared.print("Downsample error \(converterResult)")
            } else {
                self.handleDownsampledBuffer(outputBuffer)
            }
        }
        isInputTapped = true
    }
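A minimal sketch of the separate-engines idea from the last question (untested; note that voice processing/echo cancellation normally spans input and output, so splitting engines may trade the warm-up delay for echo — something to measure on device, not a recommendation):

    import AVFAudio

    // Sketch: dedicate one engine to capture and one to playback so that
    // playback-side changes never touch the input path.
    final class SplitEngines {
        let captureEngine = AVAudioEngine()
        let playbackEngine = AVAudioEngine()
        let playerNode = AVAudioPlayerNode()

        func start(onBuffer: @escaping (AVAudioPCMBuffer) -> Void) throws {
            // Playback graph
            playbackEngine.attach(playerNode)
            playbackEngine.connect(playerNode, to: playbackEngine.mainMixerNode, format: nil)
            try playbackEngine.start()

            // Capture graph: tap installed before start so the first
            // buffers after didActivate are delivered to the handler.
            let input = captureEngine.inputNode
            let format = input.inputFormat(forBus: 0)
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                onBuffer(buffer) // e.g. feed the downsampler from the post
            }
            try captureEngine.start()
        }
    }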
4 replies · 0 boosts · 285 views · Aug ’25

Capturing system audio no longer works with macOS Sequoia
Our capture application records system audio via a HAL plugin; however, with the latest macOS 15 Sequoia, all audio buffer values are zero. I am attaching sample code that replicates the problem. Compile it as a Command Line Tool application with Xcode.

STEPS TO REPRODUCE
Install the BlackHole 2ch audio driver: https://existential.audio/blackhole/download/?code=1579271348
Start some system audio, e.g. YouTube.
Compile and run the sample application.
On macOS up to Sonoma, you will hear audio via loopback and see audio values in the debug/console window. On macOS Sequoia, you will not hear audio and the audio values are 0.

    #import <AVFoundation/AVFoundation.h>
    #import <CoreAudio/CoreAudio.h>

    #define BLACKHOLE_UID @"BlackHole2ch_UID"
    #define DEFAULT_OUTPUT_UID @"BuiltInSpeakerDevice"

    @interface AudioCaptureDelegate : NSObject <AVCaptureAudioDataOutputSampleBufferDelegate>
    @end

    void setDefaultAudioDevice(NSString *deviceUID);

    @implementation AudioCaptureDelegate

    // Receive samples from the CoreAudio/HAL driver and print amplitude values for testing.
    // This is where samples would normally be copied and passed downstream for further
    // processing, which is not needed in this simple sample application.
    - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
        // Access the audio data in the sample buffer
        CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBuffer);
        if (!blockBuffer) {
            NSLog(@"No audio data in the sample buffer.");
            return;
        }
        size_t length;
        char *data;
        CMBlockBufferGetDataPointer(blockBuffer, 0, NULL, &length, &data);
        // Process the audio samples to calculate the average amplitude
        int16_t *samples = (int16_t *)data;
        size_t sampleCount = length / sizeof(int16_t);
        int64_t sum = 0;
        for (size_t i = 0; i < sampleCount; i++) {
            sum += abs(samples[i]);
        }
        // Calculate and log the average amplitude
        float averageAmplitude = (float)sum / sampleCount;
        NSLog(@"Average Amplitude: %f", averageAmplitude);
    }

    @end

    // Set the default audio device to BlackHole while testing, or to the speakers when done.
    // Called by main.
    void setDefaultAudioDevice(NSString *deviceUID) {
        AudioObjectPropertyAddress address;
        AudioDeviceID deviceID = kAudioObjectUnknown;
        UInt32 size;
        CFStringRef uidString = (__bridge CFStringRef)deviceUID;

        // Gets the device corresponding to the given UID.
        AudioValueTranslation translation;
        translation.mInputData = &uidString;
        translation.mInputDataSize = sizeof(uidString);
        translation.mOutputData = &deviceID;
        translation.mOutputDataSize = sizeof(deviceID);
        size = sizeof(translation);
        address.mSelector = kAudioHardwarePropertyDeviceForUID;
        address.mScope = kAudioObjectPropertyScopeGlobal; // ????
        address.mElement = kAudioObjectPropertyElementMain;

        OSStatus status = AudioObjectGetPropertyData(kAudioObjectSystemObject, &address, 0, NULL, &size, &translation);
        if (status != noErr) {
            NSLog(@"Error: Could not retrieve audio device ID for UID %@. Status code: %d", deviceUID, (int)status);
            return;
        }

        AudioObjectPropertyAddress propertyAddress;
        propertyAddress.mSelector = kAudioHardwarePropertyDefaultOutputDevice;
        propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;

        status = AudioObjectSetPropertyData(kAudioObjectSystemObject, &propertyAddress, 0, NULL, sizeof(AudioDeviceID), &deviceID);
        if (status == noErr) {
            NSLog(@"Default audio device set to %@", deviceUID);
        } else {
            NSLog(@"Failed to set default audio device: %d", status);
        }
    }

    // Sets the BlackHole device as default and configures it as an AVCaptureDeviceInput,
    // sets the speakers as loopback so we can hear what is being captured,
    // sets up a queue to receive capture samples,
    // runs the session for 30 seconds, then restores the speakers as default output.
    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            // Create the capture session
            AVCaptureSession *session = [[AVCaptureSession alloc] init];

            // Select the audio device
            AVCaptureDevice *audioDevice = nil;
            NSString *audioDriverUID = nil;
            audioDriverUID = BLACKHOLE_UID;
            setDefaultAudioDevice(audioDriverUID);
            audioDevice = [AVCaptureDevice deviceWithUniqueID:audioDriverUID];
            if (!audioDevice) {
                NSLog(@"Audio device %s not found!", [audioDriverUID UTF8String]);
                return -1;
            } else {
                NSLog(@"Using Audio device: %s", [audioDriverUID UTF8String]);
            }

            // Configure the audio input with the selected device (BlackHole)
            NSError *error = nil;
            AVCaptureDeviceInput *audioInput = [AVCaptureDeviceInput deviceInputWithDevice:audioDevice error:&error];
            if (error || !audioInput) {
                NSLog(@"Failed to create audio input: %@", error);
                return -1;
            }
            [session addInput:audioInput];

            // Configure the audio data output
            AVCaptureAudioDataOutput *audioOutput = [[AVCaptureAudioDataOutput alloc] init];
            AudioCaptureDelegate *delegate = [[AudioCaptureDelegate alloc] init];
            dispatch_queue_t queue = dispatch_queue_create("AudioCaptureQueue", NULL);
            [audioOutput setSampleBufferDelegate:delegate queue:queue];
            [session addOutput:audioOutput];

            // Set audio settings
            NSDictionary *audioSettings = @{
                AVFormatIDKey: @(kAudioFormatLinearPCM),
                AVSampleRateKey: @48000,
                AVNumberOfChannelsKey: @2,
                AVLinearPCMBitDepthKey: @16,
                AVLinearPCMIsFloatKey: @NO,
                AVLinearPCMIsNonInterleaved: @NO
            };
            [audioOutput setAudioSettings:audioSettings];

            AVCaptureAudioPreviewOutput *loopback_output = [[AVCaptureAudioPreviewOutput alloc] init];
            loopback_output.volume = 1.0;
            loopback_output.outputDeviceUniqueID = DEFAULT_OUTPUT_UID;
            [session addOutput:loopback_output];
            const char *deviceID = loopback_output.outputDeviceUniqueID ? [loopback_output.outputDeviceUniqueID UTF8String] : "nil";
            NSLog(@"session addOutput for preview/loopback: %s", deviceID);

            // Start the session
            [session startRunning];
            NSLog(@"Capturing audio data for 30 seconds...");
            [[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:30.0]];

            // Stop the session
            [session stopRunning];
            NSLog(@"Capture session stopped.");
            setDefaultAudioDevice(DEFAULT_OUTPUT_UID);
        }
        return 0;
    }
4 replies · 0 boosts · 943 views · Oct ’24

Failure of AudioUnitSetProperty when using MacCatalyst (works on macOS)
I was trying to set a custom audio output device for generated audio on Mac Catalyst. While using

    let status = AudioUnitSetProperty(outputUnit,
                                      kAudioOutputUnitProperty_CurrentDevice,
                                      kAudioUnitScope_Global,
                                      0,
                                      &outputDeviceID,
                                      UInt32(MemoryLayout<AudioDeviceID>.size))

kAudioOutputUnitProperty_CurrentDevice is invalid, and status = -10879 (kAudioUnitErr_InvalidProperty), indicating an error.

STEPS TO REPRODUCE
Set the Run Destination to macOS and run the program: "AudioUnitSetProperty: 0" should be printed, indicating it works fine.
Set the Run Destination to Mac Catalyst and run the program: "Error setting output device: -10879" should be printed, indicating an error.
4 replies · 1 boost · 622 views · Mar ’25

MPRemoteCommandCenter not updating play/pause button to proper state on iOS
So I'm using AVAudioEngine. When playing audio I become the "now playing" app using the MPNowPlayingInfoCenter/MPRemoteCommandCenter APIs. When configuring MPRemoteCommandCenter I add a play/pause command target via -addTargetWithHandler on the togglePlayPauseCommand property.

Now I also have a play/pause button in my app's UI. When I pause playback from my app's UI (which means I'm the active app, in the foreground), I pause the AVAudioPlayerNode I'm using with AVAudioEngine. I do not stop, reset, etc. the AVAudioEngine; I only pause the player node. My thought process is that the user just pressed pause and is very likely to hit play again to resume playback in the near future, because my app is in the foreground. If my app moves to the background and I receive a memory warning, I presume it would make sense to tear down the engine or pause it. Perhaps I'm wrong about this?

When I initially hit the play button in my app's UI I also activate my AVAudioSession. I do this in a high-priority NSOperation, since the documentation warns that "we recommend that applications not activate their session from a thread where a long blocking operation will be problematic."

So now I'm playing, and I hit pause in my app's UI. Then I quickly bring up the Now Playing center, and I see that I'm the Now Playing app, but the play/pause button shows the pause icon instead of the play icon, even though I'm in the paused state. I do set MPNowPlayingInfoCenter's playbackState to MPNowPlayingPlaybackStatePaused when I pause. Not surprisingly, this doesn't work: the documentation states this property is for macOS only.

So the only way to get MPRemoteCommandCenter to show the play image for the play/pause button is to deactivate my AVAudioSession when I pause playback? Since I change the active state of my audio session in an NSOperation (per the documentation's recommendation above), the play/pause toggle in the remote command center won't update immediately, because I'm doing it on another thread. It feels inappropriate for a play/pause button to wait on an NSOperation deactivating the audio session before updating its UI when I already know my play/paused state; it should update right away, like the button in my app does. Wouldn't it be nicer to just use MPNowPlayingInfoCenter's playbackState property on iOS too? If I'm no longer the now-playing app / active audio session, it doesn't matter, since I'm not in the Now Playing UI — just ignore it?

Also, is it recommended that I explicitly deactivate my audio session every time the user pauses audio in my app (while I'm in the foreground)? When I do deactivate the audio session, I get an error, AVAudioSessionErrorCodeIsBusy (though the button in the Now Playing center does update to the proper image). I do this:

    - (void)pause {
        [self.playerNode pause];
        [self runOperationToDeactivateAudioSession];
        // This does nothing on iOS:
        MPNowPlayingInfoCenter *nowPlayingCenter = [MPNowPlayingInfoCenter defaultCenter];
        nowPlayingCenter.playbackState = MPNowPlayingPlaybackStatePaused;
    }

In -runOperationToDeactivateAudioSession I get the AVAudioSessionErrorCodeIsBusy. According to the documentation: "Starting in iOS 8, if the session has running I/Os at the time that deactivation is requested, the session will be deactivated, but the method will return NO and populate the NSError with the code property set to AVAudioSessionErrorCodeIsBusy to indicate the misuse of the API." So pausing the player node isn't enough to meet the deactivation criteria; I guess I have to pause or stop the audio engine. I could probably wait until I receive a scene-went-to-background notification or something before deactivating my audio session (which is async, so the button may not update to the correct image in time).

This seems like a lot of code to write just to get a play/pause toggle to update, especially in an iPad multi-window scene environment. What's the recommended approach? Should I always pause the AudioEngine instead of the player node? Should I always explicitly deactivate my audio session when the user pauses playback from my app's UI, even in the foreground? I personally like the idea of just being able to set [MPNowPlayingInfoCenter defaultCenter].playbackState = MPNowPlayingPlaybackStatePaused; but maybe that's because it would make things easier for me. This does feel overcomplicated, though. If anyone can share some tips on how I should handle this, I'd appreciate it.
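One commonly suggested direction (a hedged sketch, not confirmed as the recommended approach): on iOS the system play/pause state generally tracks MPNowPlayingInfoPropertyPlaybackRate, so mirroring the paused state by setting the rate to 0 in nowPlayingInfo may update the toggle without deactivating the session:

    import MediaPlayer

    // Sketch: mirror pause state via playback rate in nowPlayingInfo.
    // elapsedSeconds is a hypothetical value the app would track itself.
    func updateNowPlaying(isPaused: Bool, elapsedSeconds: Double) {
        let center = MPNowPlayingInfoCenter.default()
        var info = center.nowPlayingInfo ?? [:]
        info[MPNowPlayingInfoPropertyPlaybackRate] = isPaused ? 0.0 : 1.0
        info[MPNowPlayingInfoPropertyElapsedPlaybackTime] = elapsedSeconds
        center.nowPlayingInfo = info
    }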
4 replies · 0 boosts · 621 views · Feb ’25

Why is AVAudioEngine input giving all zero samples?
I am trying to get access to raw audio samples from the mic. I've written a simple example application that writes the values to a text file. Below is my sample application. All the input samples from the buffers connected to the input tap are zero. What am I doing wrong?

I did add the Privacy - Microphone Usage Description key to my application target's properties, and I am allowing microphone access when the application launches. I do find it strange that I have to provide permission every time, even though in Settings > Privacy my application is listed as one of the applications allowed to access the microphone.

    class AudioRecorder {
        private let audioEngine = AVAudioEngine()
        private var fileHandle: FileHandle?

        func startRecording() {
            let inputNode = audioEngine.inputNode
            let audioFormat: AVAudioFormat
            #if os(iOS)
            let hardwareSampleRate = AVAudioSession.sharedInstance().sampleRate
            audioFormat = AVAudioFormat(standardFormatWithSampleRate: hardwareSampleRate, channels: 1)!
            #elseif os(macOS)
            audioFormat = inputNode.inputFormat(forBus: 0) // Use input node's current format
            #endif
            setupTextFile()
            inputNode.installTap(onBus: 0, bufferSize: 1024, format: audioFormat) { [weak self] buffer, _ in
                self!.processAudioBuffer(buffer: buffer)
            }
            do {
                try audioEngine.start()
                print("Recording started with format: \(audioFormat)")
            } catch {
                print("Failed to start audio engine: \(error.localizedDescription)")
            }
        }

        func stopRecording() {
            audioEngine.stop()
            audioEngine.inputNode.removeTap(onBus: 0)
            print("Recording stopped.")
        }

        private func setupTextFile() {
            let tempDir = FileManager.default.temporaryDirectory
            let textFileURL = tempDir.appendingPathComponent("audioData.txt")
            FileManager.default.createFile(atPath: textFileURL.path, contents: nil, attributes: nil)
            fileHandle = try? FileHandle(forWritingTo: textFileURL)
        }

        private func processAudioBuffer(buffer: AVAudioPCMBuffer) {
            guard let channelData = buffer.floatChannelData else { return }
            let channelSamples = channelData[0]
            let frameLength = Int(buffer.frameLength)
            var textData = ""
            var allZero = true
            for i in 0..<frameLength {
                let sample = channelSamples[i]
                if sample != 0 {
                    allZero = false
                }
                textData += "\(sample)\n"
            }
            if allZero {
                print("Got \(frameLength) worth of audio data on \(buffer.stride) channels. All data is zero.")
            } else {
                print("Got \(frameLength) worth of audio data on \(buffer.stride) channels.")
            }
            // Write to file
            if let data = textData.data(using: .utf8) {
                fileHandle!.write(data)
            }
        }
    }
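Not a confirmed diagnosis, but since the post mentions being prompted on every launch, one thing worth ruling out (a sketch; on macOS, repeated prompting can also point to an inconsistent code signature or missing sandbox audio-input entitlement) is starting the engine before the permission grant has actually landed:

    import AVFAudio

    // Sketch (iOS): start tapping only after record permission is granted.
    func startWhenAuthorized(_ recorder: AudioRecorder) {
        AVAudioSession.sharedInstance().requestRecordPermission { granted in
            guard granted else { print("Microphone permission denied"); return }
            DispatchQueue.main.async {
                recorder.startRecording()
            }
        }
    }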
4 replies · 0 boosts · 849 views · Jan ’25

AUv3 recent "Failed to find component with type..." frequent issues
I've been generating new Audio Unit Extension apps with Xcode 16 (and newer), and although they generally work initially, it is easy (although I'm not sure how to do it reliably) to cause the app to no longer be able to instantiate the audio unit. Generally the call to AVAudioUnit.findComponent fails, and SimplePlayEngine hits the fatalError("Failed to find component with type...").

In the most recent project, merely adding files to the extension (without making any use of them) caused it to go off the rails. If I "Archive" the app+plugin, there is no audio unit extension in the bundle. If I switch to the audio unit extension and build it, it's fine. If I look at the build folder in Library/Developer/Xcode/project_folder, the extension_name.appex is there.

Any ideas? If I can coax an unmodified audio unit extension project into exhibiting this behavior, I'll attach it here. Right now what I have contains code I don't want to share.
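As a debugging aid (a sketch, not an official diagnostic; the component type is a placeholder — use your extension's actual type/subtype/manufacturer codes), you can ask the system which audio unit components it currently sees before findComponent fails:

    import AVFAudio
    import AudioToolbox

    // Sketch: list registered AU components matching a description.
    // 0 acts as a wildcard for subtype and manufacturer.
    func dumpRegisteredComponents() {
        let desc = AudioComponentDescription(componentType: kAudioUnitType_Effect, // placeholder
                                             componentSubType: 0,
                                             componentManufacturer: 0,
                                             componentFlags: 0,
                                             componentFlagsMask: 0)
        for component in AVAudioUnitComponentManager.shared().components(matching: desc) {
            print(component.name, component.typeName, component.manufacturerName)
        }
    }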
4 replies · 1 boost · 663 views · Jan ’25

AVPlayer.replaceCurrentItem(with:) "Incorrect actor executor assumption" runtime crash when building for iOS 18
Hi there, I have some code that's been working fine for the last few versions of iOS and macOS and all the others, and now causes a runtime crash in iOS 18/macOS 15 etc.

I have an actor called Player which is basically a big wrapper around an AVPlayer. It all gets compiled down to a framework, and my clients use it by dropping it into their video player app code. It handles everything needed for them to talk to our media infrastructure, and it handles telemetry. It has its own property called avplayer, which is an AVPlayer that gets created at init(). It has a function called load(_ avPlayerItem: AVPlayerItem) which clients use to load a new video into the player.

The offending code (which used to work!) looks like this:

    Task { @MainActor in
        avplayer.replaceCurrentItem(with: avPlayerItem)
    }

No warnings in Xcode. When you run it, it crashes on iOS 18 and macOS 15 with this error in the debugger:

    Incorrect actor executor assumption

I thought, "Okay, well maybe replaceCurrentItem has changed and doesn't need to be on the main actor anymore," but even if you say this outside of a Main Actor-scoped task:

    avplayer.replaceCurrentItem(with: avPlayerItem)

...it still crashes the exact same way.

Does anyone have any ideas? I'm under some heavy pressure here to get this working, and I don't even know where to start with this. Big thanks in advance.
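One hedged guess at the mechanism (not confirmed): launching Task { @MainActor in ... } from inside the actor and then reading the actor-isolated avplayer property inside it crosses two isolation domains at once. A sketch that copies the reference while still on the actor, so the main-actor task never touches actor state:

    import AVFoundation

    actor Player {
        private let avplayer = AVPlayer()

        func load(_ avPlayerItem: AVPlayerItem) {
            // Capture the player reference on the actor first, then make
            // the AVPlayer call on the main actor. Sketch only — strict
            // concurrency may still flag the non-Sendable captures.
            let player = avplayer
            Task { @MainActor in
                player.replaceCurrentItem(with: avPlayerItem)
            }
        }
    }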
4 replies · 0 boosts · 851 views · Sep ’24

AudioConverterFillComplexBuffer not working for (E)AC3 in tvOS 18
Since upgrading to tvOS 18, the above function isn't working for me when converting a stream in these formats. It does work for decoding AAC, however. https://developer.apple.com/documentation/audiotoolbox/1503098-audioconverterfillcomplexbuffer?language=objc

I pass a valid ioOutputDataPacketSize in, but it always comes out as zero. Has anyone else observed this too? I wonder if this is related to the issue being discussed widely about 5.1 sound being broken for many people after upgrading to tvOS 18: https://discussions.apple.com/thread/255769102?login=true&sortBy=rank

EDIT: Further information: the callback gets called once, asking for 1 packet (which is OK). I give it one packet and return noErr. However, after this, the callback is never invoked again. Must be a bug?

EDIT 2: The same code continues to work correctly on macOS when decoding the same audio stream.
4 replies · 2 boosts · 660 views · Nov ’24

[26] audioTimeRange would still be interesting for .volatileResults in SpeechTranscriber
So, experimenting with the new SpeechTranscriber, if I do:

    let transcriber = SpeechTranscriber(
        locale: locale,
        transcriptionOptions: [],
        reportingOptions: [.volatileResults],
        attributeOptions: [.audioTimeRange]
    )

only the final result has audio time ranges, not the volatile results. Is this a performance consideration? If there is no performance problem, it would be nice to have the option to also get speech time ranges for volatile responses.

I'm not presenting the volatile text in the UI at all; I was just trying to keep statistics about the non-speech and speech noise levels, so I can determine when the noise level falls under the noise floor for a while. The goal was to finalize the recording automatically when the noise level indicates that the user has finished speaking.
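Since the stated goal is only a silence heuristic, one workaround that doesn't depend on volatile time ranges at all (a sketch; the threshold and window are made-up tuning values) is to compute an RMS level directly from the mic tap's buffers:

    import AVFAudio

    // Sketch: crude RMS-based silence detection on tap buffers.
    // threshold and requiredSilence are arbitrary starting points.
    final class SilenceDetector {
        private var silentSeconds: Double = 0
        let threshold: Float = 0.005      // tune against your noise floor
        let requiredSilence: Double = 1.5 // seconds of quiet before finishing

        /// Returns true once the input has stayed below threshold long enough.
        func isFinished(buffer: AVAudioPCMBuffer) -> Bool {
            guard let data = buffer.floatChannelData?[0] else { return false }
            let n = Int(buffer.frameLength)
            guard n > 0 else { return false }
            var sum: Float = 0
            for i in 0..<n { sum += data[i] * data[i] }
            let rms = (sum / Float(n)).squareRoot()
            let seconds = Double(n) / buffer.format.sampleRate
            silentSeconds = rms < threshold ? silentSeconds + seconds : 0
            return silentSeconds >= requiredSilence
        }
    }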
4 replies · 0 boosts · 288 views · Aug ’25

SoundRecognition causes Input/Output callbacks to have varying Buffer sizes and introduces Glitching
Hello, we have noticed an issue with Sound Recognition that causes glitching with our AudioUnit setup in Smule: input and output frame sizes are inconsistent, and the input frame size does not match [AVAudioSession sharedInstance].IOBufferDuration. My best guess is that Sound Recognition influences the input frame size but not the output frame size. To reproduce, use the example app here: https://github.com/MarkoGill/SoundRecognitionBug

Hardware/OS:
iPhone 14 Pro on iOS 18 -> experiences the problem
iPhone 11 on iOS 18 -> experiences the problem
iPhone 15 on iOS 18 -> does not experience the problem

Reproduction steps:
Enable Sound Recognition (Settings > Accessibility > Sound Recognition > On).
Enable a sound for detection (Sounds > Dog > On).
Open the example app with a headset (it routes input to output).
Notice that glitching occurs.
Check the logs: record and playback buffer sizes vary.

Example log:

    AU input sample rate: 48000.000000
    AU output sample rate: 48000.000000
    hardware sample rate: 48000.000000
    hardware buffer size: 1104.000000
    updated record frame counts: 1024
    updated playback frame counts: 1104

Note: you can disable Sound Recognition, restart the app, and playback behaves correctly.
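Not a fix for the underlying behavior, but the usual way to survive mismatched record/playback frame counts (a sketch; the capacity is an arbitrary choice, and a lock-based FIFO is only illustrative — real-time audio threads would want a lock-free ring buffer) is to decouple the two callbacks with a sample FIFO:

    import Foundation

    // Sketch: float FIFO so the playback callback can pull exactly the
    // frames it needs regardless of the record callback's varying sizes.
    final class SampleFIFO {
        private var storage: [Float]
        private var head = 0, count = 0
        private let lock = NSLock()

        init(capacity: Int) { storage = [Float](repeating: 0, count: capacity) }

        func push(_ samples: UnsafePointer<Float>, _ n: Int) {
            lock.lock(); defer { lock.unlock() }
            let writable = min(n, storage.count - count) // drop overflow
            for i in 0..<writable {
                storage[(head + count + i) % storage.count] = samples[i]
            }
            count += writable
        }

        /// Fills `out` with n frames, zero-padding on underrun.
        func pop(into out: UnsafeMutablePointer<Float>, _ n: Int) {
            lock.lock(); defer { lock.unlock() }
            for i in 0..<n {
                out[i] = i < count ? storage[(head + i) % storage.count] : 0
            }
            let taken = min(n, count)
            head = (head + taken) % storage.count
            count -= taken
        }
    }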
4 replies · 1 boost · 992 views · Oct ’24

Constituent active device switching very slow on iPhone 16 Pro Models on focus changes
Hi all, we are in the business of scanning documents and barcodes with the camera system of mobile devices. Since there is a wide variety of use cases, from scanning the tiniest barcodes and small business cards to scanning barcodes or large documents from far distances, we preferably rely on the triple-camera devices, if available, with automatic constituent device switching.

This approach used to work perfectly fine. Depending on the zoom level (we prefer an initial zoom value of 2.0) and the focusing distance, the iPhone Pro models switched through the different camera systems at light speed: from ultra-wide to wide, to tele, and back. No issues at all.

Unfortunately, the new iPhone 16 Pro models behave very differently when it comes to constituent device switching based on focus distance. The switching is slow, and sometimes it does not happen at all when the focusing distance changes, especially when aiming at a distant object for a longer time and then aiming at a very close object that is maybe 2" away. The iPhone 15 Pro always switches immediately to the ultra-wide camera here, while the iPhone 16 Pro takes at least 2-3 seconds, in rare cases up to 10 seconds, and sometimes forever to switch to the ultra-wide camera.

Of course we assumed that our code was responsible for these issues, so we experimented with restricting the devices and so on. Then we stripped more and more configuration code, but nothing we tried improved the situation. So we ended up writing a minimal example app that demonstrates the problem. You can find the code below. Execute it on various iPhones and aim at a far distance (> 10 feet) and then quickly at a very close distance (< 5 inches).

Here is a list of devices and our test results:
iPhone 15 Pro, iOS 17.6: very fast and reliable switching
iPhone 15 Pro, iOS 18.1: very fast and reliable switching
iPhone 13 Pro Max, iOS 15.3: very fast and reliable switching
iPhone 16 (dual-wide camera), iOS 18.1: very fast and reliable switching
iPhone 16 Pro, iOS 18.1: slow switching, unreliable
iPhone 16 Pro Max, iOS 18.1: slow switching, unreliable

Questions:
Has anyone else seen this issue? And possibly found a workaround?
Is this behaviour intended on iPhone 16 Pro models?
Can we somehow improve the switching speed?

Further, the iPhone 16 Pro models also show a jumping preview in the preview layer when they switch the constituent active device. Not dramatic, but compared to the other phones it looks like a glitch.

Thank you very much! Kind regards, Sebastian

    import UIKit
    import AVFoundation

    class ViewController: UIViewController {
        var captureSession: AVCaptureSession!
        var captureDevice: AVCaptureDevice!
        var captureInput: AVCaptureInput!
        var previewLayer: AVCaptureVideoPreviewLayer!
        var activePrimaryConstituentToken: NSKeyValueObservation?
        var zoomToken: NSKeyValueObservation?

        override func viewDidLoad() {
            super.viewDidLoad()
        }

        override func viewDidAppear(_ animated: Bool) {
            super.viewDidAppear(animated)
            checkPermissions()
            setupAndStartCaptureSession()
        }

        func checkPermissions() {
            let cameraAuthStatus = AVCaptureDevice.authorizationStatus(for: AVMediaType.video)
            switch cameraAuthStatus {
            case .authorized:
                return
            case .denied:
                abort()
            case .notDetermined:
                AVCaptureDevice.requestAccess(for: AVMediaType.video, completionHandler: { (authorized) in
                    if (!authorized) {
                        abort()
                    }
                })
            case .restricted:
                abort()
            @unknown default:
                fatalError()
            }
        }

        func setupAndStartCaptureSession() {
            DispatchQueue.global(qos: .userInitiated).async {
                self.captureSession = AVCaptureSession()
                self.captureSession.beginConfiguration()
                if self.captureSession.canSetSessionPreset(.photo) {
                    self.captureSession.sessionPreset = .photo
                }
                self.captureSession.automaticallyConfiguresCaptureDeviceForWideColor = true
                self.setupInputs()
                DispatchQueue.main.async {
                    self.setupPreviewLayer()
                }
                self.captureSession.commitConfiguration()
                self.captureSession.startRunning()
                self.activePrimaryConstituentToken = self.captureDevice.observe(\.activePrimaryConstituent, options: [.new], changeHandler: { (device, change) in
                    let type = device.activePrimaryConstituent!.deviceType.rawValue
                    print("Device type: \(type)")
                })
                self.zoomToken = self.captureDevice.observe(\.videoZoomFactor, options: [.new], changeHandler: { (device, change) in
                    let zoom = device.videoZoomFactor
                    print("Zoom: \(zoom)")
                })
                let switchZoomFactor = 2.0
                DispatchQueue.main.async {
                    self.setZoom(CGFloat(switchZoomFactor), animated: false)
                }
            }
        }

        func setupInputs() {
            if let device = AVCaptureDevice.default(.builtInTripleCamera, for: .video, position: .back) {
                captureDevice = device
            } else {
                fatalError("no back camera")
            }
            guard let input = try? AVCaptureDeviceInput(device: captureDevice) else {
                fatalError("could not create input device from back camera")
            }
            if !captureSession.canAddInput(input) {
                fatalError("could not add back camera input to capture session")
            }
            captureInput = input
            captureSession.addInput(input)
        }

        func setupPreviewLayer() {
            previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
            view.layer.addSublayer(previewLayer)
            previewLayer.frame = self.view.layer.frame
        }

        func setZoom(_ value: CGFloat, animated: Bool) {
            guard let device = captureDevice else { return }
            let maxZoom: CGFloat = captureDevice.maxAvailableVideoZoomFactor
            let minZoom: CGFloat = captureDevice.minAvailableVideoZoomFactor
            let zoomValue = max(min(value, maxZoom), minZoom)
            let deltaZoom = Float(abs(zoomValue - device.videoZoomFactor))
            do {
                try device.lockForConfiguration()
                if animated {
                    device.ramp(toVideoZoomFactor: zoomValue, withRate: max(deltaZoom * 50.0, 50.0))
                } else {
                    device.videoZoomFactor = zoomValue
                }
                device.unlockForConfiguration()
            } catch {
                return
            }
        }
    }
4 replies · 2 boosts · 488 views · Dec ’24

MPNowPlayingInfoCenter nowPlayingInfo throttled
Hello, I have been running into issues with setting nowPlayingInfo information, specifically updating information for CarPlay and the CPNowPlayingTemplate. When I start playback for an item, I see the lock screen information update as expected, along with the CarPlay now-playing information. However, the playing items are books with collections of tracks. When I select a new track (chapter) within the book, I set the MPMediaItemPropertyTitle to the new chapter name. This change is reflected correctly on the lock screen, but almost never appears correctly on the CarPlay CPNowPlayingTemplate: the previous chapter title remains set and never updates.

I see "Application exceeded audio metadata throttle limit." in the debug console fairly frequently. From that I figured I need to minimize updates to the nowPlayingInfo dictionary. What I did:

I store the metadata in a local dictionary and only set values in the main nowPlayingInfo dictionary when they differ from the current value.
I kick off the nowPlayingInfo update via a Task that initially sleeps for around 2 seconds (not a final value, just for my current testing). If a previous Task is active, it gets cancelled, so that only one update can happen within that time window. (A sketch of this follows below.)

Neither of these has been sufficient. I can switch between different titles entirely and the information updates (including cover art). But when I switch chapters within a title, the MPMediaItemPropertyTitle continues to get dropped. I know the value is getting set, because it updates correctly on the lock screen. In total, I update 12 keys, though with the above changes usually only 2-4 of them actually get updated with high frequency. I am running out of ideas for satisfying the throttling thresholds while accurately displaying metadata. I could use some advice. Thanks.
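A sketch of the coalescing approach described in the post (names are hypothetical, the delay is a tuning value, and this doesn't explain why CarPlay alone drops the title):

    import MediaPlayer

    // Sketch: coalesce nowPlayingInfo updates so rapid chapter changes
    // produce at most one dictionary write per interval.
    final class NowPlayingUpdater {
        private var pending: [String: Any] = [:]
        private var task: Task<Void, Never>?

        func set(_ key: String, _ value: Any) {
            pending[key] = value
            task?.cancel()
            task = Task { [pending] in
                try? await Task.sleep(nanoseconds: 2_000_000_000) // tuning value
                guard !Task.isCancelled else { return }
                var info = MPNowPlayingInfoCenter.default().nowPlayingInfo ?? [:]
                for (k, v) in pending { info[k] = v }
                MPNowPlayingInfoCenter.default().nowPlayingInfo = info
            }
        }
    }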
4 replies · 1 boost · 120 views · May ’25

AVAudioSession's "availableInputs" not update in time
Add an observer for routeChangeNotification:

    func testAudioRoute() {
        // My app is a VoIP app, so I need to set "playAndRecord" and "allowBluetooth"
        try? AVAudioSession.sharedInstance().setCategory(.playAndRecord,
                                                         options: [.duckOthers, .allowBluetooth, .allowBluetoothA2DP])
        NotificationCenter.default.addObserver(self,
                                               selector: #selector(currentRouteChanged(noti:)),
                                               name: AVAudioSession.routeChangeNotification,
                                               object: nil)
    }

    // Print the "availableInputs" once a notification arrives
    @objc func currentRouteChanged(noti: Notification) {
        let availableInputs = AVAudioSession.sharedInstance().availableInputs?.compactMap({ $0.portType }) ?? []
        let currentRouteInputs = AVAudioSession.sharedInstance().currentRoute.inputs.compactMap({ $0.portType })
        let currentRouteOutputs = AVAudioSession.sharedInstance().currentRoute.outputs.compactMap({ $0.portType })
        print("willtest: \navailableInputs=\(availableInputs), \ncurrentRouteInputs=\(currentRouteInputs), \ncurrentRouteOutputs=\(currentRouteOutputs)")
    }

When BT (AirPods Pro 2) is CONNECTED, the handler prints the following when the notification comes, which is correct:

    availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
    currentRouteInputs=[],
    currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: BluetoothA2DPOutput)]

When BT (AirPods Pro 2) is DISCONNECTED, it prints the following when the notification comes, which is wrong:

    availableInputs=[__C.AVAudioSessionPort(_rawValue: MicrophoneBuiltIn), __C.AVAudioSessionPort(_rawValue: BluetoothHFP)],
    currentRouteInputs=[],
    currentRouteOutputs=[__C.AVAudioSessionPort(_rawValue: Speaker)]

So my question here is: why does "availableInputs" still contain the BluetoothHFP item even though I have already disconnected the BT device (put the AirPods in the case)? BTW, if I tap the "Manual" button after disconnecting the BT, it also prints the "wrong" value for "availableInputs", and it becomes normal after about 3-4 seconds.
4 replies · 0 boosts · 487 views · Dec ’24

How to reduce CMSampleBuffer volume
Hello, basically, I am reading and writing an asset. To simplify, I am just reading the asset and rewriting it into an output video without any modifications. However, I want to add a fade-out effect to the last three seconds of the output video, and I don't know how to do this. So far, before adding the CMSampleBuffer to the output video, I tried reducing its volume using an extension on CMSampleBuffer. In the extension, I passed 0.4 for testing, aiming to reduce the video's overall volume by 60%. My question is: how can I directly adjust the volume of a CMSampleBuffer?

Here is the extension:

    extension CMSampleBuffer {
        func adjustVolume(by factor: Float) -> CMSampleBuffer? {
            guard let blockBuffer = CMSampleBufferGetDataBuffer(self) else { return nil }
            var length = 0
            var dataPointer: UnsafeMutablePointer<Int8>?
            guard CMBlockBufferGetDataPointer(blockBuffer,
                                              atOffset: 0,
                                              lengthAtOffsetOut: nil,
                                              totalLengthOut: &length,
                                              dataPointerOut: &dataPointer) == kCMBlockBufferNoErr else { return nil }
            guard let dataPointer = dataPointer else { return nil }
            let sampleCount = length / MemoryLayout<Int16>.size
            dataPointer.withMemoryRebound(to: Int16.self, capacity: sampleCount) { pointer in
                for i in 0..<sampleCount {
                    let sample = Float(pointer[i])
                    pointer[i] = Int16(sample * factor)
                }
            }
            return self
        }
    }
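For the fade-out itself, an alternative worth considering (a sketch, under the assumption that the read/write pipeline can use AVAssetReaderAudioMixOutput; the duration math is illustrative) is to let AVFoundation apply a volume ramp instead of rescaling samples by hand:

    import AVFoundation

    // Sketch: fade out the last 3 seconds via an audio mix.
    func makeFadeOutMix(for asset: AVAsset, track: AVAssetTrack) -> AVAudioMix {
        let duration = asset.duration
        let fadeLength = CMTime(seconds: 3, preferredTimescale: 600)
        let fadeStart = CMTimeSubtract(duration, fadeLength)

        let params = AVMutableAudioMixInputParameters(track: track)
        params.setVolumeRamp(fromStartVolume: 1.0,
                             toEndVolume: 0.0,
                             timeRange: CMTimeRange(start: fadeStart, duration: fadeLength))

        let mix = AVMutableAudioMix()
        mix.inputParameters = [params]
        return mix
    }

The resulting mix would then be assigned to the AVAssetReaderAudioMixOutput's audioMix property before reading starts, so the decoded buffers already carry the ramp.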
4 replies · 0 boosts · 371 views · May ’25

CallKit breaks web based MediaStreams
We're integrating a web-based group calling application within a native iOS application, and we're finding that every time a CallKit session gets fully established, the web-based media streams break, rendering as gray with no audio.

Up to iOS 18 we worked around it by not fulfilling the call start action, but that's no longer an option, as the audio stopped getting automatically redirected to the speakers. We would now need the CXProvider's didActivateAudioSession callback, but that would break the video.

The sample project loads a simple webpage in a WKWebView which contains a video tag streaming the media from the device's camera. At the same time, it sets up a new CallKit session by requesting and fulfilling a CXStartCallAction transaction. You will notice that the media doesn't render and, if you follow the warnings we left, you will find that not fulfilling the CXStartCallAction fixes it. Unfortunately that's not a workaround we can use, as we need the CXProvider delegate to inform us about audio session changes so we can redirect the audio to the speaker (so the proximity sensor doesn't activate and locking the screen doesn't end the call).

Any insights or workarounds would be greatly appreciated.
4 replies · 1 boost · 923 views · Nov ’24