AVFoundation

RSS for tag

Work with audiovisual assets, control device cameras, process audio, and configure system audio interactions using AVFoundation.

Posts under AVFoundation tag

200 Posts

Post

Replies

Boosts

Views

Activity

Massive CoreML latency spike on live AVFoundation camera feed vs. offline inference (CPU+ANE)
Hello, I’m experiencing a severe performance degradation when running CoreML models on a live AVFoundation video feed compared to offline or synthetic inference. This happens across multiple models I've converted (including SCI, RTMPose, and RTMW) and affects multiple devices. The Environment OS: macOS 26.3, iOS 26.3, iPadOS 26.3 Hardware: Mac14,6 (M2 Max), iPad Pro 11 M1, iPhone 13 mini Compute Units: cpuAndNeuralEngine The Numbers When testing my SCI_output_image_int8.mlpackage model, the inference timings are drastically different: Synthetic/Offline Inference: ~1.34 ms Live Camera Inference: ~15.96 ms Preprocessing is completely ruled out as the bottleneck. My profiling shows total preprocessing (nearest-neighbor resize + feature provider creation) takes only ~0.4 ms in camera mode. Furthermore, no frames are being dropped. What I've Tried I am building a latency-critical app and have implemented almost every recommended optimization to try and fix this, but the camera-feed penalty remains: Matched the AVFoundation camera output format exactly to the model input (640x480 at 30/60fps). Used IOSurface-backed pixel buffers for everything (camera output, synthetic buffer, and resize buffer). Enabled outputBackings. Loaded the model once and reused it for all predictions. Configured MLModelConfiguration with reshapeFrequency = .frequent and specializationStrategy = .fastPrediction. Wrapped inference in ProcessInfo.processInfo.beginActivity(options: .latencyCritical, reason: "CoreML_Inference"). Set DispatchQueue to qos: .userInteractive. Disabled the idle timer and enabled iOS Game Mode. Exported models using coremltools 9.0 (deployment target iOS 26) with ImageType inputs/outputs and INT8 quantization. Reproduction To completely rule out UI or rendering overhead, I wrote a standalone Swift CLI script that isolates the AVFoundation and CoreML pipeline. The script clearly demonstrates the ~15ms latency on live camera frames versus the ~1ms latency on synthetic buffers. (I have attached camera_coreml_benchmark.swift and coreml model (very light low light enghancement model) to this repo on github https://github.com/pzoltowski/apple-coreml-camera-latency-repro). My Question: Is this massive overhead expected behavior for AVFoundation + Core ML on live feeds, or is this a framework/runtime bug? If expected, what is the Apple-recommended pattern to bypass this camera-only inference slowdown? One think found interesting when running in debug model was faster (not as fast as in performance benchmark but faster than 16ms. Also somehow if I did some dummy calculation on on different DispatchQueue also seems like model got slightly faster. So maybe its related to ANE Power State issues (Jitter/SoC Wake) and going to fast to sleep and taking a long time to wakeup? Doing dummy calculation in background thought is probably not a solution. Thanks in advance for any insights!
3
0
122
9h
AVSpeechSynthesizer read Mandarin as Cantonese(iOS 26 beta 3))
In iOS 26, AVSpeechSynthesizer read Mandarin into Cantonese pronunciation. No matter how you set the language, and change the settings of my phone system, it doesn't work. let utterance = AVSpeechUtterance(string: "你好啊") //let voice = AVSpeechSynthesisVoice(language: "zh-CN") // not work let voice = AVSpeechSynthesisVoice(language: "zh-Hans") // not work too utterance.voice = voice et synth = AVSpeechSynthesizer() synth.speak(utterance)
3
0
467
9h
On iOS 18, Mandarin is read aloud as Cantonese
Please include the line below in follow-up emails for this request. Case-ID: 11089799 When using AVSpeechUtterance and setting it to play in Mandarin, if Siri is set to Cantonese on iOS 18, it will be played in Cantonese. There is no such issue on iOS 17 and 16. 1.let utterance = AVSpeechUtterance(string: textView.text) let voice = AVSpeechSynthesisVoice(language: "zh-CN") utterance.voice = voice 2.In the phone settings, Siri is set to Cantonese
4
1
699
1d
Async AVAudioPlayerNode.scheduleBuffer stutters
My code that streams buffers into AVAudioPlayerNode is stuttering when the buffer is finished and before the next one is played. while engine.isRunning { let framesToCopy = min(buffer.frameLength - framePosition, Self.BufferSize) let srcRaw = UnsafeRawPointer(srcPtr) let playbackBuffer = AVAudioPCMBuffer(pcmFormat: buffer.format, frameCapacity: Self.BufferSize)! let playbackPtr = playbackBuffer.floatChannelData![0] let destRaw = UnsafeMutableRawPointer(mutating: playbackPtr) memcpy(destRaw, srcRaw, Int(framesToCopy) * MemoryLayout<Float>.stride) srcPtr = srcPtr.advanced(by: Int(framesToCopy)) playbackBuffer.frameLength = framesToCopy await player.scheduleBuffer(playbackBuffer, at: nil, options: [], completionCallbackType: .dataRendered) } I've tried to schedule multiple buffers at once using a combination of both the synchronous and async versions of scheduleBuffer because I thought the delay might be but it still stutters and the data copied into the playbackBuffer matches the source buffer. I've tried all combinations of options and completionCallbackType but no luck. I've tried increasing the buffer size but that just spaces out the stutters because the buffer is larger. What am I missing about this API?
0
0
41
5d
AudioQueueNewOutput blocks indefinitely on iOS 18.3 (hangs during creation)
Hi everyone, We’re encountering an issue where AudioQueueNewOutput blocks indefinitely and never returns, and we’re hoping to get some insight or confirmation if this is a known behavior/regression on newer iOS versions. Issue Description When triggering audio playback, we create an output AudioQueue using AudioQueueNewOutput. On some devices, the call hangs inside AudioQueueNewOutput and never returns, with no OSStatus error and no subsequent logs. This behavior is reproducible mainly on iOS 18.3. Earlier iOS versions do not show this issue under the same code path. if (audioDes) { mAudioDes.mSampleRate = audioDes->mSampleRate; mAudioDes.mBitsPerChannel = audioDes->mBitsPerChannel; mAudioDes.mChannelsPerFrame = audioDes->mChannelsPerFrame; mAudioDes.mFormatID = audioDes->mFormatID; mAudioDes.mFormatFlags = audioDes->mFormatFlags; mAudioDes.mFramesPerPacket = audioDes->mFramesPerPacket; mAudioDes.mBytesPerFrame = audioDes->mBytesPerFrame; mAudioDes.mBytesPerPacket = audioDes->mBytesPerFrame; mAudioDes.mReserved = 0; } // Create AudioQueue for output OSStatus status = AudioQueueNewOutput( &mAudioDes, AQOutputCallback, this, NULL, NULL, 0, &audioQueue ); code-block The thread blocks inside AudioQueueNewOutput, and execution never reaches the next line. Additional Notes / Observations ASBD is confirmed to be valid Standard PCM output Sample rate, channels, bytes per frame/packet all consistent Same ASBD works correctly on earlier iOS versions AudioQueue is created on a background thread Not on the main thread Not inside the AudioQueue callback On first creation, AVAudioSession may not yet be active setCategory and setActive:YES may be called shortly before creating the AudioQueue There may be a timing window where the session is still activating Issue is reported mainly on iOS 18.3 Multiple user reports point to iOS 18.3 devices Same code path works on iOS 17.x and earlier No OSStatus error is returned — the call simply never returns. Questions Is it expected that AudioQueueNewOutput can block indefinitely while waiting for AVAudioSession / audio route / HAL readiness? Have there been any behavior changes in iOS 18.3 regarding AudioQueue creation or AudioSession synchronization? Is it unsafe to call AudioQueueNewOutput before AVAudioSession is fully active on recent iOS versions? Are there recommended patterns (or delays / callbacks) to ensure AudioQueue creation does not hang? Any insight or confirmation would be greatly appreciated. Thanks in advance!
0
0
42
5d
Inquiry regarding CoreMediaErrorDomain Code=-15517 during LL-HLS Live Playback
Hello, I am currently developing a live streaming application using AVPlayer to play LL-HLS (Low-Latency HLS) content. During our testing phase, we consistently encountered the following error in the logs: CoreMediaErrorDomain Code=-15517 The challenge we are facing is that the error description is quite vague. It only provides cryptic messages such as "Key not found" or "No value information," which makes it extremely difficult to identify the root cause or perform a deep-dive analysis. I have searched through the official Apple Developer documentation and technical notes, but I couldn’t find any specific reference to what Code -15517 signifies in the context of LL-HLS or CoreMedia. Regarding this issue, I have the following questions: What is the specific meaning of this error code (-15517)? Does it relate to missing tags in the HLS manifest, or is it an internal state issue within the AVPlayer stack? Specifically, I would like to know if this is a critical error that disrupts playback, or if it is just a warning that can be safely ignored. Is there any additional logging or debugging tool you would recommend to further investigate "Key not found" issues in LL-HLS? Any insights or guidance from the community or Apple engineers would be greatly appreciated. Thank you in advance for your help.
1
0
107
6d
UVC over MFi – Is there official support? Implementation guidance?
Hello everyone, I’m looking for more detailed information regarding UVC (USB Video Class) over MFi within the Apple ecosystem and would appreciate some clarification. I’m interested in developing (or interfacing with) an accessory that transmits video over USB using the UVC standard, and I’d like to better understand how this works within the MFi (Made for iPhone) program. Here are my main questions: 1. Do iOS devices provide native support for UVC over USB-C or Lightning within the MFi framework? 2. Are there any specific firmware or authentication requirements when the accessory is MFi-certified? 3. Does UVC support depend solely on the hardware interface (USB-C vs Lightning), or are there additional software-level requirements? 4. Is there any official documentation outlining the recommended flow for implementing UVC-based video capture accessories on iOS? From what I understand, USB-C iPads appear to offer more direct support for standard UVC devices, but it’s not entirely clear how this integrates with the MFi ecosystem with iOS, especially for commercial product development. If anyone has gone through this process or can point me to relevant technical documentation, I would greatly appreciate the guidance. Thank you!
1
0
144
1w
Monitors from Dell not fully-integrated with MacOS keyboard control
I just bought a monitor S2725QC from Dell Technologies and isn't fully-integrated with MacOS even though it says on the website it is compatible with MacOS. https://www.dell.com/en-us/shop/all-monitors/sac/monitors/all-monitors/macos-compatible?appliedRefinements=51765 The screen brightness and volume control buttons don't work with the monitors (I have two). What can I do in terms of writing code with Dell Monitor SDK and MacOS Frameworks/Technologies?
1
0
83
1w
Video in "Made for iPad" apps on macOS
I'm relatively new to Swift development (and native iOS development for that matter) I've got an iOS app that uses the iPhone / iPad built in cameras, and am looking to make this more compatible with macOS. Using the normal AVCaptureDevice.DiscoverySession I seem to get the iPhone Continuity Camera and the in-built MacBook Pro camera but I don't see other input devices that I see in QuickTime Player (for example) such as connected external cameras or Virtual Inputs provided by NDI Virtual Input and OBS. Is there a way to see access these without a specific Mac build (as the rest of the functionality works great, and I'd rather not diverge the codebase too much as it's easier to update one app than two!
0
0
93
2w
AVAudioSession.outputVolume does not reflect system volume changes made while app is in background
I have a question regarding the behavior of AVAudioSession.sharedInstance().outputVolume. Observed behavior: When the app is in the foreground, I read audioSession.outputVolume (for example, 0.1). The app is then moved to the background. While the app is in the background, the user changes the system volume using the hardware buttons (for example, to 0.5). When the app returns to the foreground, audioSession.outputVolume still reports the previous value (0.1). From my testing, outputVolume only seems to update when the system volume is changed while the app is in the foreground. Volume changes made while the app is in the background are not reflected when the app returns to the foreground. Questions: According to Apple’s documentation for AVAudioSession.outputVolume: “The systemwide output volume set by the user.” https://developer.apple.com/documentation/avfaudio/avaudiosession/outputvolume However, based on our testing on iOS 18.6.2 and iOS 18.1, the observed behavior seems to differ from this description. Questions: The documentation states that outputVolume represents the system-wide volume set by the user. In our testing, the value does not reflect volume changes made while the app is in the background and only updates when the app is in the foreground.Is this the expected behavior of AVAudioSession.outputVolume? Is there any other recommended way in Swift to retrieve the current system volume that reflects user changes made both while the app is in the foreground and while it is in the background? Any clarification on the intended behavior or recommended handling would be greatly appreciated.
1
1
174
2w
Keeping PiP alive during third-party video recording (camera capture)
I’m building a teleprompter-style app that relies on Picture in Picture. PiP starts correctly on device. Everything works — until another app (e.g. TikTok / Instagram) starts active video recording. When camera capture begins in the foreground app, iOS terminates my PiP session. Some teleprompter apps appear to keep PiP active while recording in other apps, so I’m trying to understand the recommended architectural pattern for this scenario. Is there a documented approach or best practice to keep PiP stable during third-party camera capture? Looking specifically for guidance on the correct AVKit / AVAudioSession configuration for this use case.
0
0
171
2w
Crash when trying to get originatingRecipient
According to the documentation (https://developer.apple.com/documentation/avfoundation/avcontentkeyrequest/originatingrecipient?changes=_3&language=objc), starting with ios 18.4, I can get AVContentKeyRecipient from AVContentKeyRequest. But when I try to get it, I get a crash. What could be the issue? I want to note that I add the asset to the AVContentKeySession using the addContentKeyRecipient method (https://developer.apple.com/documentation/avfoundation/avcontentkeysession/addcontentkeyrecipient(_:)?changes=_3&language=objc).
1
0
186
2w
VideoMaterial Black Screen on Vision Pro Device (Works in Simulator)
VideoMaterial Black Screen on Vision Pro Device (Works in Simulator) App Overview App Name: Extn Browser Bundle ID: ai.extn.browser Purpose: A visionOS web browser that plays 360°/180° VR videos in an immersive sphere environment Development Environment & SDK Versions Component Version Xcode 26.2 Swift 6.2 visionOS Deployment Target 26.2 Swift Concurrency MainActor isolation enabled App is released in the TestFlight. Frameworks Used SwiftUI - UI framework RealityKit - 3D rendering, MeshResource, ModelEntity, VideoMaterial AVFoundation - AVPlayer, AVAudioSession WebKit - WKWebView for browser functionality Network - NWListener for local proxy server Sphere Video Mechanism The app creates an immersive 360° video experience using the following approach: // 1. Create sphere mesh (10 meter radius for immersive viewing) let mesh = MeshResource.generateSphere(radius: 10.0) // 2. Create initial transparent material var material = UnlitMaterial() material.color = .init(tint: .clear) // 3. Create entity and invert sphere (negative X scale) let sphere = ModelEntity(mesh: mesh, materials: [material]) sphere.scale = SIMD3<Float>(-1, 1, 1) // Inverts normals for inside-out viewing sphere.position = SIMD3<Float>(0, 1.5, 0) // Eye level // 4. Create AVPlayer with video URL let player = AVPlayer(url: videoURL) // 5. Configure audio session for visionOS let audioSession = AVAudioSession.sharedInstance() try audioSession.setCategory(.playback, mode: .moviePlayback, options: [.mixWithOthers]) try audioSession.setActive(true) // 6. Create VideoMaterial and apply to sphere let videoMaterial = VideoMaterial(avPlayer: player) if var modelComponent = sphere.components[ModelComponent.self] { modelComponent.materials = [videoMaterial] sphere.components.set(modelComponent) } // 7. Start playback player.play() ImmersiveSpace Configuration // browserApp.swift ImmersiveSpace(id: appModel.immersiveSpaceID) { ImmersiveView() .environment(appModel) } .immersionStyle(selection: .constant(.mixed), in: .mixed) Entitlements <!-- browser.entitlements --> <key>com.apple.security.app-sandbox</key> <true/> <key>com.apple.security.network.client</key> <true/> <key>com.apple.security.network.server</key> <true/> Info.plist Network Configuration <key>NSAppTransportSecurity</key> <dict> <key>NSAllowsArbitraryLoads</key> <true/> </dict> The Issue Behavior in Simulator: Video plays correctly on the inverted sphere surface - 360° video is visible and wraps around the user as expected. Behavior on Physical Vision Pro: The sphere displays a black screen. No video content is visible, though the sphere entity itself is present. Important: Not a DRM/Licensing Issue This issue is NOT related to Digital Rights Management (DRM) or FairPlay. I have tested with: Unlicensed raw MP4 video files (no DRM protection) Self-hosted video content with no copy protection Direct MP4 URLs from CDN without any licensing requirements The same black screen behavior occurs with all unprotected video sources, ruling out DRM as the cause. (Plain H.264 MP4, no DRM) Screen Recording: Working in Simulator The following screen recording demonstrates playing a 360° YouTube video in the immersive sphere on the visionOS Simulator: https://cdn.commenda.kr/screen-001.mov This confirms that the VideoMaterial and sphere rendering work correctly in the simulator, but the same setup shows a black screen on the physical Vision Pro device. Observations AVPlayer status reports .readyToPlay - The video appears to load successfully VideoMaterial is created without errors - No exceptions thrown Sphere entity renders - The geometry is visible (black surface) Audio session is configured - No errors during audio session setup Network requests succeed - The video URL is accessible from the device Same result with local/unprotected content - DRM is not a factor Console Logs (Device) The logging shows: Sphere created and added to scene AVPlayer created with correct URL VideoMaterial created and applied Player status transitions to .readyToPlay player.play() called successfully Rate shows 1.0 (playing) Despite all success indicators, the rendered output is black. Questions for Apple Are there known differences in VideoMaterial behavior between the visionOS Simulator and physical Vision Pro hardware? Does VideoMaterial(avPlayer:) require specific video codec/format requirements that differ on device? (The test video is a standard H.264 MP4) Is there a required Metal capability or GPU feature for VideoMaterial that may not be available in certain contexts on device? Does the immersion style (.mixed) affect VideoMaterial rendering on hardware? Are there additional entitlements required for video texture rendering in RealityKit on physical hardware? Attempted Solutions Configured AVAudioSession with .playback category Added delay before player.play() to ensure material is applied Verified sphere scale inversion (-1, 1, 1) Tested multiple video URLs (including raw, unlicensed MP4 files) Confirmed network connectivity on device Ruled out DRM/FairPlay issues by testing unprotected content Environment Details Device: Apple Vision Pro visionOS Version: 26.2 Xcode Version: 26.2 macOS Version: Darwin 25.2.0
0
0
219
3w
Missing "Dolby Vision Profile" Option in Deliver Page - DaVinci Resolve 20 on iPadOS 26
Dear Support Team, ​I am writing to seek technical assistance regarding a persistent issue with Dolby Vision exporting in DaVinci Resolve 20 on my iPad Pro 12.9-inch (2021, M1 chip) running iPadOS 26.0.1. ​The Issue: Despite correctly configuring the project for a Dolby Vision workflow and successfully completing the dynamic metadata analysis, the "Dolby Vision Profile" dropdown menu (and related embedding options) is completely missing from the Advanced Settings in the Deliver page. ​My Current Configuration & Steps Taken: ​Software Version: DaVinci Resolve Studio 20 (Studio features like Dolby Vision analysis are active and functional). ​Project Settings: Color Science: DaVinci YRGB Color Managed. ​Dolby Vision: Enabled (Version 4.0) with Mastering Display set to 1000 nits. ​Output Color Space: Rec.2100 ST2084. ​Color Page: Dynamic metadata analysis has been performed, and "Trim" controls are functional. ​Export Settings: ​Format: QuickTime / MP4. ​Codec: H.265 (HEVC). ​Encoding Profile: Main 10. ​The Problem: Under "Advanced Settings," there is no option to select a Dolby Vision Profile (e.g., Profile 8.4) or to "Embed Dolby Vision Metadata." ​Potential Variables: ​System Version: I am currently running iPadOS 26. ​Apple ID: My iPad is currently not logged into an Apple ID. I suspect this might be preventing the app from accessing certain system-level AVFoundation frameworks or Dolby DRM/licensing certificates required for metadata embedding. ​Could you please clarify if the "Dolby Vision Profile" option is dependent on a signed-in Apple ID for hardware-level encoding authorization, or if this is a known compatibility issue with the current iPadOS 26 build? ​I look forward to your guidance on how to resolve this. ​Best regards, INSOFT_Fred
0
0
122
3w
Inquiry about Low-Latency Frame Interpolation & Super Resolution using VTFrameProcessor
Hello, I have implemented Low-Latency Frame Interpolation using the VTFrameProcessor framework, based on the sample code from https://developer.apple.com/kr/videos/play/wwdc2025/300. It is currently working well for both LIVE and VOD streams. However, I have a few questions regarding the lifecycle management and synchronization of this feature: 1. Common Questions (Applicable to both Frame Interpolation & Super Resolution) 1.1 Dynamic Toggling Do you recommend enabling/disabling these features dynamically during playback? Or is it better practice to configure them only during the initial setup/preparation phase? If dynamic toggling is supported, are there any recommended patterns for managing VTFrameProcessor session lifecycle (e.g., startSession / endSession timing)? 1.2 Synchronization Method I am currently using CADisplayLink to fetch frames from AVPlayerItemVideoOutput and perform processing. Is CADisplayLink the recommended approach for real-time frame acquisition with VTFrameProcessor? If the feature needs to be toggled on/off during active playback, are there any concerns or alternative approaches you would recommend? 1.3 Supported Resolution/Quality Range What are the minimum and maximum video resolutions supported for each feature? Are there any aspect ratio restrictions (e.g., does it support 1:1 square videos)? Is there a recommended resolution range for optimal performance and quality? 2. Frame Interpolation Specific Questions 2.1 LIVE Stream Support Is Low-Latency Frame Interpolation suitable for LIVE streaming scenarios where latency is critical? Are there any special considerations for LIVE vs VOD? 3. Super Resolution Specific Questions 3.1 Adaptive Bitrate (ABR) Stream Support In ABR (HLS/DASH) streams, the video resolution can change dynamically during playback. Is VTLowLatencySuperResolutionScaler compatible with ABR streams where resolution changes mid-playback? If resolution changes occur, should I recreate the VTLowLatencySuperResolutionScalerConfiguration and restart the session, or does the API handle this automatically? 3.2 Small/Square Resolution Issue I observed that 144x144 (1:1 square) videos fail with error:   "VTFrameProcessorErrorDomain Code=-19730: processWithSourceFrame within VCPFrameSuperResolutionProcessor failed" However, 480x270 (16:9) videos work correctly. minimumDimensions reports 96x96, but 144x144 still fails. Is there an undocumented restriction on aspect ratio or a practical minimum resolution? 3.3 Scale Factor Selection supportedScaleFactors returns [2.0, 4.0] for most resolutions. Is there a recommended scale factor for balancing quality and performance? Are there scenarios where 4.0x should be avoided? The documentation on this specific topic seems limited, so I would appreciate any insights or advice. Thank you.
0
0
337
3w
AVCam Sample Code - Undesired "Jump" in Video Recording Image
On iPhone 16 Pro Max (not tested other devices) there's a noticeable jump in the framing of the preview video when you record in the iOS AVCam Sample App. The same jump in camera framing can be observed by switching to the front facing camera and then back to the rear one. It looks roughly consistent with switching between the 0.5x and 1x camera (but not quite a match for the same viewable area in the Camera app) - and it's only when it's initially loaded, once recording is started it retains the 'closer' image no matter how many times it's stopped/started thereafter. I'm relatively new to Swift and haven't done anything with the camera before, so odd 'buggy' behaviour in the sample code isn't helping me understand it! :-) Is there any way to fix this?
0
0
256
Jan ’26
Are White Balance gains applied before or after ADC?
At which point in the image processing pipeline does iOS apply the white balance gains which can be set via AVCaptureDevice.setWhiteBalanceModeLocked(with:completionHandler:)? Are those gains applied in the analog part of the camera pipeline, before the pixel voltage gets converted via the ADC to digital values? Or does the camera first convert the pixel voltages to digital values and then the gains are applied to the digital values? Is this consistent across devices or can the behavior vary from device to device?
1
0
342
Jan ’26