When using AVSampleBufferDisplayLayer to play H.264 and H.265 video encoded with more than 7 B-frames, frame drops occur. The more B-frames there are, the more noticeable the frame drops become (for example, with 15 B-frames).
Use FFmpeg to transcode a video file with visible timestamps and frame numbers (x264 or x265):
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx264 -x264-params "bframes=15:b-adapt=0" -crf 30 -y x264_bf15.mp4
ffmpeg -i test.mp4 -vf "drawtext=fontsize=45:text=%{pts} %{n}:y=400" -c:v libx265 -x265-params "bframes=15:b-adapt=0" -crf 30 -y x265_bf15.mp4
Use the demo player from this repository to reproduce the issue: https://github.com/msfrms/CustomPlayer
Frame drops can be observed, and the following log can be found in the device's console.
mediaserverd <<<< IQ-CA >>>> piqca_gmstats_dump: FIQCA(0x1266f4000) recent frames: enqueued: 184, displayed: 138, dropped: 42, flushed: 0, evicted: 3, >16ms late: 2
P.S. I was using an iPhone 11 running iOS 14.6 to reproduce this issue.
May I ask why frame drops occur in this case?
Is there any configuration or API usage change that could help fix the frame drop issue?
Many thanks!
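For illustration, one thing worth ruling out is how sample buffers are paced into the layer: with long B-frame chains the decoder must hold many frames before it can display anything, so pushing buffers in a tight loop can overrun the layer's queue. A minimal Swift sketch of letting the layer pull data at its own pace, assuming compressed CMSampleBuffers with correct presentation timestamps are supplied by a hypothetical nextSampleBuffer() helper:

import AVFoundation

func startFeeding(_ displayLayer: AVSampleBufferDisplayLayer,
                  nextSampleBuffer: @escaping () -> CMSampleBuffer?) {
    // Drive enqueueing from the layer itself instead of a free-running loop,
    // so buffers are only pushed when the decode/display queue has room.
    let feedQueue = DispatchQueue(label: "sample-buffer-feed")
    displayLayer.requestMediaDataWhenReady(on: feedQueue) {
        while displayLayer.isReadyForMoreMediaData {
            guard let sampleBuffer = nextSampleBuffer() else {
                displayLayer.stopRequestingMediaData()
                return
            }
            displayLayer.enqueue(sampleBuffer)
        }
    }
}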
Hi everyone,
We are working on a prototype app for Apple Vision Pro that is similar in functionality to Omegle or Chatroulette, but exclusively for Vision Pro owners.
The core idea is:
– a matching system where one user connects to another through a virtual persona;
– real-time video and audio transmission;
– time limits for sessions with the ability to extend them;
– users can skip a match and move on to the next one.
We have explored WebRTC and Twilio, but unfortunately, they don’t fit our use case.
Question:
What alternative services or SDKs are available for implementing real-time video/audio communication on Vision Pro that would work with this scenario?
Has anyone encountered a similar challenge and can recommend which technologies or tools to use?
Thanks in advance!
I'm writing some camera functionality that uses AVCaptureVideoDataOutput.
I've set it up so that it calls my AVCaptureVideoDataOutputSampleBufferDelegate on a background thread, by making my own dispatch_queue and configuring the AVCaptureVideoDataOutput.
My question is then, if I configure my AVCaptureSession differently, or even stop it altogether, is this guaranteed to flush all pending jobs on my background thread? For example, does [AVCaptureSession stopRunning] imply a blocking call until all pending frame-callbacks are done?
I have a more practical example below, showing how I access something owned by the foreground thread from the background thread; I wonder when/how it's safe to clean up that resource.
I have setup similar to the following:
// Foreground thread logic
dispatch_queue_t queue = dispatch_queue_create("qt_avf_camera_queue", nullptr);
AVCaptureSession *captureSession = [[AVCaptureSession alloc] init];
setupInputDevice(captureSession); // Connects the AVCaptureDevice...
// Store some arbitrary data to be attached to the frame, stored on the foreground thread
FrameMetaData frameMetaData = ...;
MySampleBufferDelegate *sampleBufferDelegate = [[MySampleBufferDelegate alloc] init];
// Capture frameMetaData by reference in lambda
[sampleBufferDelegate setFrameMetaDataGetter: [&frameMetaData]() { return &frameMetaData; }];
AVCaptureVideoDataOutput *captureVideoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
[captureVideoDataOutput setSampleBufferDelegate:sampleBufferDelegate
queue:queue];
[captureSession addOutput:captureVideoDataOutput];
[captureSession startRunning];
[captureSession stopRunning];
// Is it now safe to destroy frameMetaData, or do we need manual barrier?
And then in MySampleBufferDelegate:
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
fromConnection:(AVCaptureConnection *)connection
{
// Invokes the callback set above
FrameMetaData *frameMetaData = frameMetaDataGetter();
emitSampleBuffer(sampleBuffer, frameMetaData);
}
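For reference, a common way to get a guaranteed flush point, rather than relying on stopRunning alone, is to synchronously drain the delegate queue after stopping the session. A rough sketch (written in Swift for brevity; the same dispatch_sync idea applies to the Objective-C++ code above, and it assumes the delegate queue is serial, as it is here):

import AVFoundation

func stopAndDrain(session: AVCaptureSession, delegateQueue: DispatchQueue) {
    // Stop delivering new frames. Callbacks already enqueued on delegateQueue
    // may still be pending or running at this point.
    session.stopRunning()
    // An empty sync block acts as a barrier: it only returns once every
    // previously enqueued delegate callback on this serial queue has finished.
    delegateQueue.sync { }
    // After this line it should be safe to tear down resources (e.g. frameMetaData)
    // that the delegate callbacks were reading.
}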
Hello,
Environment
macOS 15.6.1 / Xcode 26 beta 7 / iOS 26 Beta 9
In a simple AVFoundation video-playback sample, I’m seeing different behavior between iOS 18 and iOS 26 regarding AVPlayerItem.didPlayToEndTimeNotification.
I’ve attached a minimal sample below. Please replace videoURL with a valid short video URL.
Repro steps
Tap “Play” to start playback and let the video finish.
The AVPlayerItem.didPlayToEndTimeNotification registered with NotificationCenter should fire, and you should see Play finished. in the console.
Without relaunching, tap “Play” again. This is where the issue arises.
Observed behavior
On iOS 18 and earlier: The video does not play again (it does not restart from the beginning), but AVPlayerItem.didPlayToEndTimeNotification is posted and Play finished. appears in the console. The same happens every time you press “Play”.
On iOS 26: Pressing “Play” does not post AVPlayerItem.didPlayToEndTimeNotification. The code path that prints Play finished. is never called (the callback enclosing that line is not invoked again).
Building the same program with Xcode 16.4 and running it on an iOS 26 beta device shows the same phenomenon, which suggests there has been a behavioral change for AVPlayerItem.didPlayToEndTimeNotification on iOS 26. I couldn’t find any mention of this in the release notes or API Reference.
Because the semantics around AVPlayerItem.didPlayToEndTimeNotification appear to differ, we’re forced to adjust our logic. If there is a way to achieve the iOS 18–style behavior on iOS 26, I would appreciate guidance.
Alternatively, if this change is intentional, could you share the reasoning? Is iOS 26 the correct behavior from Apple’s perspective and iOS 18 (and earlier) behavior considered incorrect? Any official clarification would be extremely helpful.
import UIKit
import AVFoundation
final class ViewController: UIViewController {
private let videoURL = URL(string: "https://......mp4")!
private var player: AVPlayer?
private var playerItem: AVPlayerItem?
private var playerLayer: AVPlayerLayer?
private var observeForComplete: NSObjectProtocol?
// UI
private let playerContainerView = UIView()
private let playButton = UIButton(type: .system)
private let stopButton = UIButton(type: .system)
private let replayButton = UIButton(type: .system)
deinit {
if let observeForComplete {
NotificationCenter.default.removeObserver(observeForComplete)
}
}
override func viewDidLoad() {
super.viewDidLoad()
view.backgroundColor = .systemBackground
setupUI()
setupPlayer()
}
override func viewDidLayoutSubviews() {
super.viewDidLayoutSubviews()
playerLayer?.frame = playerContainerView.bounds
}
// MARK: - Setup
private func setupUI() {
playerContainerView.translatesAutoresizingMaskIntoConstraints = false
playerContainerView.backgroundColor = .black
view.addSubview(playerContainerView)
// Buttons
playButton.setTitle("Play", for: .normal)
stopButton.setTitle("Pause", for: .normal)
replayButton.setTitle("RePlay", for: .normal)
[playButton, stopButton, replayButton].forEach {
$0.titleLabel?.font = .systemFont(ofSize: 16, weight: .semibold)
$0.translatesAutoresizingMaskIntoConstraints = false
$0.contentEdgeInsets = UIEdgeInsets(top: 10, left: 16, bottom: 10, right: 16)
}
let stack = UIStackView(arrangedSubviews: [playButton, stopButton, replayButton])
stack.axis = .horizontal
stack.spacing = 16
stack.alignment = .center
stack.distribution = .equalCentering
stack.translatesAutoresizingMaskIntoConstraints = false
view.addSubview(stack)
NSLayoutConstraint.activate([
playerContainerView.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor, constant: 20),
playerContainerView.leadingAnchor.constraint(equalTo: view.leadingAnchor),
playerContainerView.trailingAnchor.constraint(equalTo: view.trailingAnchor),
playerContainerView.heightAnchor.constraint(equalToConstant: 200),
stack.topAnchor.constraint(equalTo: playerContainerView.bottomAnchor, constant: 20),
stack.centerXAnchor.constraint(equalTo: view.centerXAnchor)
])
// Action
playButton.addTarget(self, action: #selector(didTapPlay), for: .touchUpInside)
stopButton.addTarget(self, action: #selector(didTapStop), for: .touchUpInside)
replayButton.addTarget(self, action: #selector(didTapReplayFromStart), for: .touchUpInside)
}
private func setupPlayer() {
// AVURLAsset -> AVPlayerItem → AVPlayer
let asset = AVURLAsset(url: videoURL)
let item = AVPlayerItem(asset: asset)
self.playerItem = item
let player = AVPlayer(playerItem: item)
player.automaticallyWaitsToMinimizeStalling = true
self.player = player
let layer = AVPlayerLayer(player: player)
layer.videoGravity = .resizeAspect
playerContainerView.layer.addSublayer(layer)
layer.frame = playerContainerView.bounds
self.playerLayer = layer
// Notification
if let observeForComplete {
NotificationCenter.default.removeObserver(observeForComplete)
}
if let playerItem {
observeForComplete = NotificationCenter.default.addObserver(
forName: AVPlayerItem.didPlayToEndTimeNotification,
object: playerItem,
queue: .main
) { [weak self] _ in
guard self != nil else { return }
Task { @MainActor in
print("Play finished.")
}
}
}
}
// MARK: - Actions
@objc private func didTapPlay() {
player?.play()
}
@objc private func didTapStop() {
player?.pause()
}
// RePlay
@objc private func didTapReplayFromStart() {
player?.seek(to: .zero, toleranceBefore: .zero, toleranceAfter: .zero) { [weak self] _ in
self?.player?.play()
}
}
}
I would greatly appreciate an official response from Apple engineering on whether this is an intentional change, a regression, or an API contract clarification, and what the recommended approach is going forward. Thank you.
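For illustration, one possible workaround while awaiting clarification is to rewind the item before playing again, so it can actually reach the end a second time and repost the notification. Note that this restarts playback from the beginning, which differs from the iOS 18 behavior described above; a hedged sketch of a modified didTapPlay:

@objc private func didTapPlay() {
    guard let player, let item = player.currentItem else { return }
    let duration = item.duration
    // If playback already sits at the end, rewind first so the item can play
    // to the end again and repost didPlayToEndTimeNotification.
    if duration.isValid, !duration.isIndefinite, item.currentTime() >= duration {
        player.seek(to: .zero, toleranceBefore: .zero, toleranceAfter: .zero) { _ in
            player.play()
        }
    } else {
        player.play()
    }
}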
We have been using passthrough in screen capture with a broadcast upload extension. This was working in visionOS 2.2, but with visionOS 26 it no longer updates: the broadcast fails with "Invalid Broadcast session started" a few seconds after the broadcast session starts.
Is there a bug filed for this, or is it a known issue?
I'm developing an app that sends video captured on an iPhone to a browser app and displays it there.
When this app is used on an iPhone 17 Pro or 17 Pro Max, the video displayed on the browser side becomes solid green, or a colorful image that is mostly green.
Looking into it, the ultra-wide and telephoto cameras on the 17 Pro and 17 Pro Max have gone from 12 MP to 48 MP, so I suspect encoding is failing because of that.
Any information at all would be appreciated.
Environment
WebRTC library: GoogleWebRTC version 1.1 (installed via CocoaPods)
Signaling server: AWS Kinesis Video Streams
Affected devices:
Model: iPhone18,1, OS: 26.0
Model: iPhone18,1, OS: 26.1
Unaffected devices:
Many models up to and including iPhone17,5
Model: iPhone18,1, OS: 26.0
Model: iPhone18,3, OS: 26.0
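For illustration, assuming the 48 MP hypothesis above, one experiment is to pin the capture device to a format whose dimensions the encoder is known to handle before the WebRTC capturer starts (the capturer may still override this, so treat it purely as a diagnostic sketch):

import AVFoundation

func pinFormat(of device: AVCaptureDevice, toMaxWidth maxWidth: Int32 = 1920) throws {
    // Pick the largest 420v format that does not exceed the target width,
    // so the encoder never sees the default dimensions derived from the 48 MP sensor.
    let candidates = device.formats.filter { format in
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        let pixelFormat = CMFormatDescriptionGetMediaSubType(format.formatDescription)
        return dims.width <= maxWidth
            && pixelFormat == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
    }
    guard let best = candidates.max(by: {
        CMVideoFormatDescriptionGetDimensions($0.formatDescription).width <
        CMVideoFormatDescriptionGetDimensions($1.formatDescription).width
    }) else { return }
    try device.lockForConfiguration()
    device.activeFormat = best
    device.unlockForConfiguration()
}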
I’m encountering a consistent crash in WebKit when using WKWebView to play a YouTube playlist in my iOS app. Playback starts successfully, but the web process terminates during the second video in the playlist. This only occurs on physical devices, not in the simulator.
Here’s a simplified Swift example of my setup:
import SwiftUI
import WebKit
struct ContentView: View {
private let playlistID = "PLig2mjpwQBZnghraUKGhCqc9eAy0UbpDN"
var body: some View {
YouTubeWebView(playlistID: playlistID)
.edgesIgnoringSafeArea(.all)
}
}
struct YouTubeWebView: UIViewRepresentable {
let playlistID: String
func makeUIView(context: Context) -> WKWebView {
let config = WKWebViewConfiguration()
config.allowsInlineMediaPlayback = true
let webView = WKWebView(frame: .zero, configuration: config)
webView.scrollView.isScrollEnabled = true
let html = """
<!doctype html>
<html>
<head>
<meta name="viewport" content="initial-scale=1.0, maximum-scale=1.0">
<style>body,html{height:100%;margin:0;background:#000}iframe{width:100%;height:100%;border:0}</style>
</head>
<body>
<iframe
src="https://www.youtube-nocookie.com/embed/videoseries?list=\(playlistID)&controls=1&rel=0&playsinline=1&iv_load_policy=3"
frameborder="0"
allow="encrypted-media; picture-in-picture; fullscreen"
webkit-playsinline
allowfullscreen
></iframe>
</body>
</html>
"""
webView.loadHTMLString(html, baseURL: nil)
return webView
}
func updateUIView(_ uiView: WKWebView, context: Context) {}
}
#Preview {
ContentView()
}
Observed behavior:
First video plays without issue.
Web process crashes when the second video in the playlist starts.
Console logs show WebProcessProxy::didClose and repeated memory status messages.
Using ProcessAssertion or background activity does not prevent the crash.
Only occurs on physical devices; simulators do not reproduce the issue.
Questions:
Is there something I should change or add in my WKWebView setup or HTML/iframe to prevent the crash when playing the second video in a playlist on physical iOS devices?
Is there an officially supported way to limit memory or prevent WebKit from terminating the web process during multi-video playback?
Are there recommended patterns for playing YouTube playlists in a WKWebView on iOS without risking crashes?
Any tips for debugging or configuring WKWebView to make it more stable for continuous playlist playback?
Thanks in advance for any guidance!
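In case it helps, independent of the root cause it may be worth handling web content process termination so the view can recover instead of staying blank. A minimal sketch (it assumes the generated HTML string is kept around so it can be reloaded; in the SwiftUI wrapper above this could live on a Coordinator assigned as the web view's navigationDelegate):

import WebKit

final class WebViewRecoveryDelegate: NSObject, WKNavigationDelegate {
    // The HTML string originally loaded into the web view, kept so it can be reloaded.
    var html: String?

    // Called when WebKit terminates the web content process (e.g. for memory pressure).
    func webViewWebContentProcessDidTerminate(_ webView: WKWebView) {
        if let html {
            webView.loadHTMLString(html, baseURL: nil)
        } else {
            webView.reload()
        }
    }
}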
Hi everyone,
I’m exploring using the iPhone 17 Pro with the Blackmagic ProDock in a custom capture app. The genlock functionality seems accessible via AVExternalSyncDevice and related APIs, which is great.
I’m specifically curious about external timecode coming in from the ProDock:
• Is there a public way to access the timecode feed in a custom app via AVFoundation or another Apple API?
• If so, what is the recommended approach to read or apply that timecode during capture?
• Are there any current limitations or entitlements required to access timecode from ProDock in a third-party app?
I’m excited to start integrating synchronized capture in my app, and any guidance or sample patterns would be greatly appreciated.
Thanks in advance!
— [Artem]
Are there limits on the supported dimensions for VTLowLatencyFrameInterpolationConfiguration? Querying VTLowLatencyFrameInterpolationConfiguration.maximumDimensions and VTLowLatencyFrameInterpolationConfiguration.minimumDimensions returns nil. When I try the WWDC sample project EnhancingYourAppWithMachineLearningBasedVideoEffects with a 4K video, the statement try frameProcessor.startSession(configuration: configuration) executes, but try await frameProcessor.process(parameters: parameters) throws Error Domain=VTFrameProcessorErrorDomain Code=-19730 "Processor is not initialized" UserInfo={NSLocalizedDescription=Processor is not initialized}.
Also, why is VTLowLatencyFrameInterpolationConfiguration able to run while the app is backgrounded, but VTFrameRateConversionParameters can't (due to GPU usage)?
We're distributing a virtual camera with our app that does not profit in the slightest from automatically applied system video effects both to the video going in (physical camera device) or out (virtual camera device). I'm aware of setting NSCameraReactionEffectGesturesEnabledDefault in Info.plist and determining active video effects via AVCaptureDevice API. Those are obviously crutches, because having to tell users to go look for and click around in menu bar apps is the opposite of a great UX.
To make our product's video output more deterministic, I'm looking for a way to tell the CMIO subsystem that our virtual camera does not support any of the system video effects. I'm seeing properties like
AVCaptureDevice.Format.isPortraitEffectSupported and AVCaptureDevice.Format.isStudioLightSupported whose documentation refers to the format's ability to support these effects. Since we're setting a CMFormatDescription via CMIOExtensionStreamSource.formats I was hoping to find something in the extensions, but wasn't successful so far.
Can this be done?
I believe this should work:
CFMutableDictionaryRef attrs = CFDictionaryCreateMutable(kCFAllocatorDefault, 0, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
CFDictionaryAddValue(attrs, kCVImageBufferColorPrimariesKey, kCVImageBufferColorPrimaries_ITU_R_709_2);
CFDictionaryAddValue(attrs, kCVImageBufferTransferFunctionKey, kCVImageBufferTransferFunction_ITU_R_709_2);
CFDictionaryAddValue(attrs, kCVImageBufferYCbCrMatrixKey, kCVImageBufferYCbCrMatrix_ITU_R_709_2);
CVPixelBufferRef pixelBuffer = NULL;
CVPixelBufferCreate(kCFAllocatorDefault, width, height, kCVPixelFormatType_32ARGB, attrs, &pixelBuffer);
assert(CFDictionaryGetCount(CVBufferGetAttachments(pixelBuffer, kCVAttachmentMode_ShouldPropagate)) > 0);
But that last assert fails, so it appears the color info does not get attached.
kCVImageBufferColorPrimariesKey and the others are not among the keys listed under BufferAttributeKeys, but I think they're supposed to be allowed because they're listed by CMVideoFormatDescriptionGetExtensionKeysCommonWithImageBuffers().
I'm hoping that putting the color matrix info in there will control how AVAssetWriter converts the RGB to YCbCr.
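As a point of comparison, attaching the color description to the buffer after creation with CVBufferSetAttachment does make the attachments show up, as opposed to passing them in the creation attributes. A Swift sketch of the same setup (whether AVAssetWriter then honors these for the RGB-to-YCbCr conversion is a separate question):

import CoreVideo

func makeBT709PixelBuffer(width: Int, height: Int) -> CVPixelBuffer? {
    var pixelBuffer: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                        kCVPixelFormatType_32ARGB, nil, &pixelBuffer)
    guard let buffer = pixelBuffer else { return nil }
    // Attach the color description directly to the buffer (propagating mode),
    // rather than passing it via the creation attributes dictionary.
    CVBufferSetAttachment(buffer, kCVImageBufferColorPrimariesKey,
                          kCVImageBufferColorPrimaries_ITU_R_709_2, .shouldPropagate)
    CVBufferSetAttachment(buffer, kCVImageBufferTransferFunctionKey,
                          kCVImageBufferTransferFunction_ITU_R_709_2, .shouldPropagate)
    CVBufferSetAttachment(buffer, kCVImageBufferYCbCrMatrixKey,
                          kCVImageBufferYCbCrMatrix_ITU_R_709_2, .shouldPropagate)
    return buffer
}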
Hello, I'm using the VideoToolbox VTFrameRateConversionConfiguration to perform frame interpolation: https://developer.apple.com/documentation/videotoolbox/vtframerateconversionconfiguration?language=objc. When using 640x480 video input, I get the following errors:
Error ! Invalid configuration
[VEEspressoModel] build failure : flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480
[EpsressoModel] Cannot load Net file flow_adaptation_feature_extractor_rev2.espresso.net. Configuration: landscape640x480
Error: failed to create FRCFlowAdaptationFeatureExtractor for usage 8
Failed to switch (0x12c40e140) [usage:8, 1/4 flow:0, adaptation layer:1, twoStage:0, revision:2, flow size (320x240)].
Could not init FlowAdaptation
initFlowAdaptationWithError fail
I tried 2048x1080 input and it works fine.
My app currently captures video using an AVCaptureSession set with the AVCaptureSessionPreset1920x1080 preset. However, I'd like to update this behavior, such that video can be recorded at a range of different resolutions.
There isn't a preset aligning to each desired resolution, so I thought I'd instead directly set the AVCaptureDeviceFormat. For any desired resolution, I would find the format that is closest without going under the desired resolution, and then crop it down as a post-processing step.
However, what I've observed is that there can be a range of available formats for a device at each resolution, with various differing settings. Presumably there is logic within AVCaptureSession that selects a reasonable default based on all these different settings, but since I am applying the format directly, I think I don't have a way to make use of that default logic? And it is undocumented?
Does this mean that the only way to select a format is to implement a comparison function that considers all different values of all different properties on AVCaptureDeviceFormat, and then sort the formats according to this comparator?
If so, what if some new property is added to AVCaptureDeviceFormat in the future? The sort would not take this new property into account, and the function might select a format with some new undesired property.
Are there any guarantees about what types for formats will be supported on a device? For example, can I take for granted that a '420v' format will exist at each resolution? If so I could filter the formats down only to those with this setting without risking filtering out all of the supported formats.
I suspect I may be missing something obvious. Any help would be greatly appreciated!
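For illustration, rather than defining a total order over every property, one workable approach is to filter the format list by the few properties that matter for the use case (pixel format, minimum dimensions, frame rate) and take the smallest format that qualifies, then crop in post as described above. A hedged sketch:

import AVFoundation

// Pick the smallest 420v format that is at least minWidth x minHeight
// and supports the requested frame rate. Returns nil if no format qualifies.
func selectFormat(for device: AVCaptureDevice,
                  minWidth: Int32, minHeight: Int32, fps: Double) -> AVCaptureDevice.Format? {
    let candidates = device.formats.filter { format in
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        let subType = CMFormatDescriptionGetMediaSubType(format.formatDescription)
        let supportsFPS = format.videoSupportedFrameRateRanges.contains {
            $0.minFrameRate <= fps && fps <= $0.maxFrameRate
        }
        return subType == kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
            && dims.width >= minWidth && dims.height >= minHeight
            && supportsFPS
    }
    return candidates.min { a, b in
        let da = CMVideoFormatDescriptionGetDimensions(a.formatDescription)
        let db = CMVideoFormatDescriptionGetDimensions(b.formatDescription)
        return Int(da.width) * Int(da.height) < Int(db.width) * Int(db.height)
    }
}

Properties added to AVCaptureDevice.Format in future releases simply would not participate in the filter, which keeps the selection stable at the cost of ignoring them.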
Hi all,
I'm trying to diagnose and resolve an issue with stuttering video playback using the standard AVPlayer. The video in question is a 4K, 39-second file in *.mov format, being played on an iOS device. It's served via a local HTTP server that proxies requests to a backend to fetch and process the content. The project uses end-to-end encrypted storage, which necessitates the proxy for handling data processing. While playback in offline scenarios is smooth, we are encountering issues with smooth playback during streaming. The same video streams smoothly on other platforms using the same connection, so network limitations are not a factor.
On iOS, playback is consistently choppy, with pauses every 1-3 seconds. The video does not appear to buffer adequately for smooth playback.
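For illustration, a couple of item-level buffering settings that could be experimented with before digging into the range requests (a sketch of things to try, not a known fix):

import AVFoundation

func configureForProxyStreaming(url: URL) -> AVPlayer {
    let item = AVPlayerItem(url: url)
    // Ask AVPlayer to buffer further ahead than its default heuristic;
    // potentially useful when each range request goes through a local decrypting proxy.
    item.preferredForwardBufferDuration = 10
    let player = AVPlayer(playerItem: item)
    // Keep automatic stall management on so playback waits for enough buffer
    // instead of starting and immediately pausing.
    player.automaticallyWaitsToMinimizeStalling = true
    return player
}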
One particularly curious aspect is the seemingly random pattern of Content-Range requests made by the AVPlayer when streaming the video. Below is an example of the range requests:
I’ve tried both AVCaptureVideoDataOutputSampleBufferDelegate (captureOutput) and AVCaptureDataOutputSynchronizerDelegate (dataOutputSynchronizer), but the number of depth frames and saved timestamps is significantly lower than the number of frames in the .mp4 file written by AVAssetWriter.
In my code, I save:
Timestamps for each frame to a metadata file
Depth frames to a binary file
Video to an .mp4 file
If I record a 4-second video at 30fps, the .mp4 file correctly plays for 4 seconds, but the number of stored timestamps and depth frames is much lower—around 70 frames instead of the expected 120.
Does anyone know why this mismatch happens?
func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
// Read all outputs
guard let syncedDepthData: AVCaptureSynchronizedDepthData =
synchronizedDataCollection.synchronizedData(for: depthDataOutput) as? AVCaptureSynchronizedDepthData,
let syncedVideoData: AVCaptureSynchronizedSampleBufferData =
synchronizedDataCollection.synchronizedData(for: videoDataOutput) as? AVCaptureSynchronizedSampleBufferData else {
// only work on synced pairs
return
}
if syncedDepthData.depthDataWasDropped || syncedVideoData.sampleBufferWasDropped {
return
}
let depthData = syncedDepthData.depthData
let depthPixelBuffer = depthData.depthDataMap
let sampleBuffer = syncedVideoData.sampleBuffer
guard let videoPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer),
let formatDescription = CMSampleBufferGetFormatDescription(sampleBuffer) else {
return
}
addToPreviewStream?(CIImage(cvPixelBuffer: videoPixelBuffer))
if !canWrite() {
return
}
// Extract the presentation timestamp (PTS) from the sample buffer
let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
//sessionAtSourceTime is the first buffer we will write to the file
if self.sessionAtSourceTime == nil {
//Make sure we don't start recording until the buffer reaches the correct time (buffer is always behind, this will fix the difference in time)
guard sampleBuffer.presentationTimeStamp >= self.recordFromTime! else { return }
self.sessionAtSourceTime = sampleBuffer.presentationTimeStamp
self.videoWriter!.startSession(atSourceTime: sampleBuffer.presentationTimeStamp)
}
if self.videoWriterInput!.isReadyForMoreMediaData {
self.videoWriterInput!.append(sampleBuffer)
self.videoTimestamps.append(
Timestamp(
frame: videoTimestamps.count,
value: timestamp.value,
timescale: timestamp.timescale
)
)
let ddm = depthData.depthDataMap
depthCapture.addDepthData(pixelBuffer: ddm, timestamp: timestamp)
}
}
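For illustration, one likely contributor is that the guard in dataOutputSynchronizer only proceeds when both depth and video are present in the collection; the depth output generally runs at a lower rate than the video output, so every video frame without a depth partner is skipped. A sketch of a variant that always appends video and treats depth as optional (it reuses the properties from the snippet above and omits the session-start and preview logic for brevity):

func dataOutputSynchronizer(_ synchronizer: AVCaptureDataOutputSynchronizer,
                            didOutput synchronizedDataCollection: AVCaptureSynchronizedDataCollection) {
    // Video drives the writer; depth is appended only when it arrives.
    guard let syncedVideoData = synchronizedDataCollection.synchronizedData(for: videoDataOutput)
            as? AVCaptureSynchronizedSampleBufferData,
          !syncedVideoData.sampleBufferWasDropped else { return }
    let sampleBuffer = syncedVideoData.sampleBuffer
    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    if videoWriterInput!.isReadyForMoreMediaData {
        videoWriterInput!.append(sampleBuffer)
        videoTimestamps.append(Timestamp(frame: videoTimestamps.count,
                                         value: timestamp.value,
                                         timescale: timestamp.timescale))
    }
    // Depth frames are inherently sparser; record them when present instead of
    // requiring one for every video frame.
    if let syncedDepthData = synchronizedDataCollection.synchronizedData(for: depthDataOutput)
            as? AVCaptureSynchronizedDepthData,
       !syncedDepthData.depthDataWasDropped {
        depthCapture.addDepthData(pixelBuffer: syncedDepthData.depthData.depthDataMap,
                                  timestamp: timestamp)
    }
}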
I'm working on an application that uses the iPhone camera for scientific purposes - and, as a result would like to receive video in as unprocessed format as possible.
In particular, I'm interested in getting pixel buffers that contain pretty much the bayer data as the sensor sees it - with the minimum processing of color possible.
Currently we configure the AVCaptureDevice to fix the focus and exposure, use a low ISO with no gain and set the white balance gains to 1. AVCaptureVideoDataOutput is using 32BGRA.
What I'd like to do is remove any additional color and brightness processing such that the data is effectively processed with a linear transfer function (i.e. gamma function is 1).
I thought that this might be down to using the AVCaptureDevice activeColorSpace - we currently use P3_D65 for this. But there only seems to be a few choices (e.g. sRGB, HLG_BT2020) all of which I think affect the gamma.
So:
is it possible to control or specify the gamma / transfer function when using CaptureVideoDelegate?
if not, does one of the color space settings have a defined gamma function that I can effectively reverse it from the pixel data without losing too much information?
or is there a better way to capture video-ish speed images (15-30fps) from the camera sensor that skips processing like this?
Many thanks for any suggestions.
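For reference, it is straightforward to dump which color spaces each format actually offers before deciding whether any of them can be inverted back toward linear. A small diagnostic sketch:

import AVFoundation

// Print every format of a device together with the color spaces it supports,
// to see whether any combination avoids the default sRGB/P3 tone curve.
func dumpColorSpaces(for device: AVCaptureDevice) {
    for format in device.formats {
        let dims = CMVideoFormatDescriptionGetDimensions(format.formatDescription)
        let spaces = format.supportedColorSpaces.map { space -> String in
            switch space {
            case .sRGB: return "sRGB"
            case .P3_D65: return "P3_D65"
            case .HLG_BT2020: return "HLG_BT2020"
            case .appleLog: return "appleLog"
            @unknown default: return "unknown(\(space.rawValue))"
            }
        }
        print("\(dims.width)x\(dims.height): \(spaces.joined(separator: ", "))")
    }
}

If .appleLog appears in the list for a format, its published log transfer curve may be easier to invert faithfully than the sRGB or HLG curves, though that is only a possibility to evaluate, not a guarantee.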
I am working on a project for macOS where I am taking an AVCaptureSession's CVPixelBuffer and I need to convert it into a MTLTexture for rendering. On macOS the pixel format is 2vuy, and there does not seem to be an obvious format mapping when converting to a Metal texture. I have been able to convert it to a texture, but the color space seems to be off, as it is rendering distorted colors with a double image.
I believe 2vuy is a single-plane pixel format and I have tried to account for that, but I am unaware of what is off.
I have attached the CVPixelBuffer and the distorted MTLTexture along with a laundry list of errors.
On iOS my conversions are fine, it is only the macOS 2vuy pixel format that seems to have issues.
My code for the conversion is also attached.
If there are any suggestions or guidance on how to properly convert a 2vuy CVPixelBuffer to a MTLTexture I would greatly appreciate it.
Many Thanks
Conversion_Logs.txt
ConversionCode.swift
I'm using an AVCaptureSession to send video and audio samples to an AVAssetWriter. When I play back the resultant video, sometimes there is a significant lag between the audio compared with the video, so they're just not in sync. But sometimes they are, with the same code.
If I look at the very first presentation time stamps of the buffers being sent to the delegate, via
func captureOutput(_: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)
I see something like this:
Adding audio samples for pts time 227711.0855328798,
Adding video samples for pts time 227710.778785374
That is, the clock for audio vs. video is behind: the first audio sample I receive is at 11.08 something, while the first video sample is earlier in time, at 10.778 something. The times are the presentation time stamps of the buffer, and the outputPresentationTimeStamp is the exact same number.
It feels like "video" vs the "audio" clock are just mismatched.
This doesn't always happen: sometimes they're synced. Sometimes they're not.
Any ideas? The device I'm recording is a webcam, on iPadOS, connected via the usb-c port.
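For illustration, differing first timestamps on their own are normal when both outputs share the session's clock (an audio buffer spans more time than a video frame); what can introduce a fixed offset in the written file is starting the AVAssetWriter session at one track's first timestamp, since samples earlier than the session start may be trimmed. A hedged sketch that starts the timeline at whichever buffer arrives first:

import AVFoundation

func append(_ sampleBuffer: CMSampleBuffer,
            to input: AVAssetWriterInput,
            writer: AVAssetWriter,
            sessionStarted: inout Bool) {
    let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    // Start the timeline at the first buffer that arrives on either track so
    // neither audio nor video is shifted relative to the other.
    if !sessionStarted {
        writer.startSession(atSourceTime: pts)
        sessionStarted = true
    }
    if input.isReadyForMoreMediaData {
        input.append(sampleBuffer)
    }
}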
I'm capturing video stream from GoPro camera (I demux UDP MPEG-TS packets) and create CMSampleBuffers from them, this works fine when I display them using CMSampleBufferLayer.
However when I dump them to disk using AVAssetWriter and then playback it with AVPlayer, AVPlayer has problems with scrubbing, it also cannot render previous frames, it needs to go back to key frames. Also thumbnails generated with AVAssetImageGenerator are mostly distorted and green, even though I set the requestedTimeToleranceAfter longer than the key frames frequency.
When I re-encode saved video once again with AVAssetExportSession and play it back then I can scrub the video just fine.
Is it because re-transcoding adds additional metadata to enable generating frames when rewinding the video and scrubbing?
If so is there a way to achieve it with AVAssetWriter without much time penalty? I need the dump/save operation to be very fast.
I also considered the following: Instead of de-muxing video and creating CMSampleBuffers, maybe I could directly dump the stream to disk and somehow add moov atoms with timing information. Would this approach work? If so where I can find information how to do it?
Thank you!
I'm working on an app where a user needs to select a video from their Photos library, and I need to get the original, unmodified HEVC (H.265) data stream to preserve its encoding.
The Problem
I have confirmed that my source videos are HEVC. I can record a new video with my iPhone 15 Pro Max camera set to "High Efficiency," export the "Unmodified Original" from Photos on my Mac, and verify that the codec is MPEG-H Part2/HEVC (H.265).
However, when I select that exact same video in my app using PHPickerViewController, the itemProvider does not list public.hevc as an available type identifier. This forces me to fall back to a generic movie type, which results in the system providing me with a transcoded H.264 version of the video.
Here is the debug output from my app after selecting a known HEVC video:
⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).
What I've Tried
My code explicitly checks for the public.hevc identifier in the registeredTypeIdentifiers array. Since it's not found, my HEVC-specific logic is never triggered.
Here is a minimal version of my PHPickerViewControllerDelegate implementation:
import UniformTypeIdentifiers
// ... inside the Coordinator class ...
func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
picker.dismiss(animated: true)
guard let result = results.first else { return }
let itemProvider = result.itemProvider
let hevcIdentifier = "public.hevc"
let identifiers = itemProvider.registeredTypeIdentifiers
print("Available formats from itemProvider: \(identifiers)")
if identifiers.contains(hevcIdentifier) {
print("✅ HEVC format found, requesting raw data...")
itemProvider.loadDataRepresentation(forTypeIdentifier: hevcIdentifier) { (data, error) in
// ... process H.265 data ...
}
} else {
print("⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).")
itemProvider.loadFileRepresentation(forTypeIdentifier: UTType.movie.identifier) { url, error in
// ... process H.264 fallback ...
}
}
}
My Environment
Device: iPhone 15 Pro Max
iOS Version: iOS 18.5
Xcode Version: 16.2
My Questions
Are there specific conditions (e.g., the video being HDR/Dolby Vision, Cinematic, or stored in iCloud) under which PHPickerViewController's itemProvider would intentionally not offer the public.hevc type identifier, even for an HEVC video?
What is the definitive, recommended API sequence to guarantee that I receive the original, unmodified data stream for a video asset, ensuring that no transcoding to H.264 occurs during the process?
Any insight into why public.hevc might be missing from the registeredTypeIdentifiers for a known HEVC asset would be greatly appreciated. Thank you.
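For reference, when the item provider will not expose the original encoding, one path that can preserve it is to create the picker with a photo-library-backed configuration so results carry an assetIdentifier, then fetch the original resource through PhotoKit instead of the item provider. A hedged sketch (this requires photo library read authorization, unlike the plain picker):

import Photos
import PhotosUI

// Configure the picker so results include asset identifiers.
func makePickerConfiguration() -> PHPickerConfiguration {
    var config = PHPickerConfiguration(photoLibrary: .shared())
    config.filter = .videos
    return config
}

// Fetch the original, untranscoded video resource for a picked result.
func loadOriginalVideoData(for result: PHPickerResult,
                           completion: @escaping (Data?) -> Void) {
    guard let identifier = result.assetIdentifier,
          let asset = PHAsset.fetchAssets(withLocalIdentifiers: [identifier],
                                          options: nil).firstObject,
          let resource = PHAssetResource.assetResources(for: asset)
              .first(where: { $0.type == .video }) else {
        completion(nil)
        return
    }
    var data = Data()
    let options = PHAssetResourceRequestOptions()
    options.isNetworkAccessAllowed = true  // allow iCloud originals to download
    PHAssetResourceManager.default().requestData(for: resource,
                                                 options: options,
                                                 dataReceivedHandler: { chunk in
        data.append(chunk)
    }, completionHandler: { error in
        completion(error == nil ? data : nil)
    })
}

The .video resource type is the original capture; if the asset has been edited, a .fullSizeVideo resource representing the rendered edit may also be present.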