I'm working on an application that uses the iPhone camera for scientific purposes and, as a result, would like to receive video in as unprocessed a format as possible.
In particular, I'm interested in getting pixel buffers that contain pretty much the Bayer data as the sensor sees it, with the minimum possible color processing.
Currently we configure the AVCaptureDevice to fix the focus and exposure, use a low ISO with no gain and set the white balance gains to 1. AVCaptureVideoDataOutput is using 32BGRA.
What I'd like to do is remove any additional color and brightness processing such that the data is effectively processed with a linear transfer function (i.e. gamma function is 1).
I thought that this might be down to the AVCaptureDevice activeColorSpace; we currently use P3_D65 for this. But there seem to be only a few choices (e.g. sRGB, HLG_BT2020), all of which I think affect the gamma.
So:
Is it possible to control or specify the gamma / transfer function when using CaptureVideoDelegate?
If not, does one of the color space settings have a defined gamma function that I can effectively reverse from the pixel data without losing too much information?
Or is there a better way to capture images at video-like rates (15-30 fps) from the camera sensor that skips processing like this?
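On the second question: if the capture color space is sRGB, the encoding curve is the piecewise function published in IEC 61966-2-1, and Display P3 (P3_D65) is specified with the same curve, so it can be inverted per channel. The sketch below assumes the 8-bit BGRA values really follow that curve, which any additional tone mapping in the camera pipeline would break; `srgbToLinear` and `linearizeBGRA` are hypothetical helper names.

```swift
import Foundation

/// Inverts the sRGB encoding curve for one channel value in 0...1.
/// Assumption: the pixel data was encoded with the standard sRGB curve.
func srgbToLinear(_ encoded: Double) -> Double {
    if encoded <= 0.04045 {
        return encoded / 12.92
    }
    return pow((encoded + 0.055) / 1.055, 2.4)
}

/// Converts one 8-bit BGRA pixel to linear (r, g, b) values in 0...1.
func linearizeBGRA(_ pixel: [UInt8]) -> (r: Double, g: Double, b: Double) {
    precondition(pixel.count == 4, "expected B, G, R, A bytes")
    let channel = { (byte: UInt8) in srgbToLinear(Double(byte) / 255.0) }
    return (r: channel(pixel[2]), g: channel(pixel[1]), b: channel(pixel[0]))
}
```

Because the 8-bit quantization happens before the inversion, the recovered linear values are unevenly spaced; a wider pixel format, if one is available for the device, would lose less.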
Many thanks for any suggestions.
I am working on a project for macOS where I take an AVCaptureSession's CVPixelBuffer and need to convert it into an MTLTexture for rendering. On macOS the pixel format is 2vuy, and there does not seem to be a clear matching format when converting to a Metal texture. I have been able to convert it to a texture, but the color space seems to be off: it renders distorted colors with a double image.
I believe 2vuy is a single-plane format and I have tried to account for that, but I don't know what is off.
I have attached the CVPixelBuffer and the distorted MTLTexture, along with a laundry list of errors.
On iOS my conversions are fine, it is only the macOS 2vuy pixel format that seems to have issues.
My code for the conversion is also attached.
If there are any suggestions or guidance on how to properly convert a 2vuy CVPixelBuffer to a MTLTexture I would greatly appreciate it.
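For reference, 2vuy (kCVPixelFormatType_422YpCbCr8) is a packed, single-plane 4:2:2 format: every 4 bytes hold Cb, Y0, Cr, Y1 and describe two horizontally adjacent pixels. Reading it as a 4-byte-per-pixel texture at full width is one way to end up with wrong colors and a doubled image. The sketch below unpacks one such group on the CPU; the BT.709 video-range coefficients are an assumption (the buffer's attachments say which matrix applies), and `RGB` and `decode2vuyPair` are hypothetical names.

```swift
struct RGB: Equatable {
    var r: UInt8
    var g: UInt8
    var b: UInt8
}

/// Decodes one 4-byte 2vuy group (Cb, Y0, Cr, Y1) into two RGB pixels.
/// Assumption: BT.709 matrix, video range (Y 16...235, chroma 16...240).
func decode2vuyPair(_ bytes: [UInt8]) -> (RGB, RGB) {
    precondition(bytes.count == 4, "expected Cb, Y0, Cr, Y1")
    let cb = Double(bytes[0]) - 128.0
    let cr = Double(bytes[2]) - 128.0

    func clamp(_ value: Double) -> UInt8 {
        UInt8(min(255.0, max(0.0, value.rounded())))
    }
    func pixel(luma: UInt8) -> RGB {
        // Expand video-range luma to 0...255 before applying chroma.
        let y = (Double(luma) - 16.0) * 255.0 / 219.0
        return RGB(
            r: clamp(y + 1.793 * cr),
            g: clamp(y - 0.213 * cb - 0.533 * cr),
            b: clamp(y + 2.112 * cb)
        )
    }
    // Both pixels share the same chroma sample.
    return (pixel(luma: bytes[1]), pixel(luma: bytes[3]))
}
```

On the GPU, one option is to treat each 4-byte group as a single texel of a texture half the image width and run the same arithmetic in a shader.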
Many Thanks
Conversion_Logs.txt
ConversionCode.swift
Hi all,
I'm trying to diagnose and resolve an issue with stuttering video playback using the standard AVPlayer. The video in question is a 4K, 39-second file in *.mov format, being played on an iOS device. It's served via a local HTTP server that proxies requests to a backend to fetch and process the content. The project uses end-to-end encrypted storage, which necessitates the proxy for handling data processing. While playback in offline scenarios is smooth, we are encountering issues with smooth playback during streaming. The same video streams smoothly on other platforms using the same connection, so network limitations are not a factor.
On iOS, playback is consistently choppy, with pauses every 1-3 seconds. The video does not appear to buffer adequately for smooth playback.
One particularly curious aspect is the seemingly random pattern of Content-Range requests made by the AVPlayer when streaming the video. Below is an example of the range requests:
Topic:
Media Technologies
SubTopic:
Video
I'm using an AVCaptureSession to send video and audio samples to an AVAssetWriter. When I play back the resultant video, sometimes there is a significant lag between the audio compared with the video, so they're just not in sync. But sometimes they are, with the same code.
If I look at the very first presentation time stamps of the buffers being sent to the delegate, via
func captureOutput(_: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection)
I see something like this:
Adding audio samples for pts time 227711.0855328798,
Adding video samples for pts time 227710.778785374
That is, the audio and video clocks disagree: the first audio sample I receive is at 11.08 something, while the first video sample is earlier in time, at 10.778 something. The times are the presentation time stamps of the buffers, and the outputPresentationTimeStamp is the exact same number.
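The size of the gap follows directly from the two logged values. A minimal sketch of that arithmetic, using the timestamps from the log above; `firstAudioPTS`, `firstVideoPTS` and `audioLeadSeconds` are illustrative names:

```swift
// First presentation timestamps from the log, in seconds.
let firstAudioPTS = 227711.0855328798
let firstVideoPTS = 227710.778785374

// Positive means the first audio sample is stamped later than the first video frame.
let audioLeadSeconds = firstAudioPTS - firstVideoPTS
print(audioLeadSeconds)
```

Roughly 0.307 s separates the first samples of the two streams, so one thing worth checking is which of the two values ends up being passed to AVAssetWriter's startSession(atSourceTime:).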
It feels like "video" vs the "audio" clock are just mismatched.
This doesn't always happen: sometimes they're synced. Sometimes they're not.
Any ideas? The device I'm recording from is a webcam, on iPadOS, connected via the USB-C port.
I'm capturing a video stream from a GoPro camera (I demux UDP MPEG-TS packets) and creating CMSampleBuffers from it. This works fine when I display them using CMSampleBufferLayer.
However, when I dump them to disk using AVAssetWriter and then play the file back with AVPlayer, AVPlayer has problems with scrubbing: it cannot render previous frames and has to go back to key frames. Thumbnails generated with AVAssetImageGenerator are also mostly distorted and green, even though I set requestedTimeToleranceAfter to be longer than the key frame interval.
When I re-encode saved video once again with AVAssetExportSession and play it back then I can scrub the video just fine.
Is it because re-transcoding adds additional metadata to enable generating frames when rewinding the video and scrubbing?
If so is there a way to achieve it with AVAssetWriter without much time penalty? I need the dump/save operation to be very fast.
I also considered the following: instead of demuxing the video and creating CMSampleBuffers, maybe I could dump the stream directly to disk and somehow add moov atoms with timing information. Would this approach work? If so, where can I find information on how to do it?
Thank you!
I'm working on an app where a user needs to select a video from their Photos library, and I need to get the original, unmodified HEVC (H.265) data stream to preserve its encoding.
The Problem
I have confirmed that my source videos are HEVC. I can record a new video with my iPhone 15 Pro Max camera set to "High Efficiency," export the "Unmodified Original" from Photos on my Mac, and verify that the codec is MPEG-H Part2/HEVC (H.265).
However, when I select that exact same video in my app using PHPickerViewController, the itemProvider does not list public.hevc as an available type identifier. This forces me to fall back to a generic movie type, which results in the system providing me with a transcoded H.264 version of the video.
Here is the debug output from my app after selecting a known HEVC video:
⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).
What I've Tried
My code explicitly checks for the public.hevc identifier in the registeredTypeIdentifiers array. Since it's not found, my HEVC-specific logic is never triggered.
Here is a minimal version of my PHPickerViewControllerDelegate implementation:
import PhotosUI
import UniformTypeIdentifiers

// ... inside the Coordinator class ...
func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
    picker.dismiss(animated: true)
    guard let result = results.first else { return }

    let itemProvider = result.itemProvider
    let hevcIdentifier = "public.hevc"
    let identifiers = itemProvider.registeredTypeIdentifiers
    print("Available formats from itemProvider: \(identifiers)")

    if identifiers.contains(hevcIdentifier) {
        print("✅ HEVC format found, requesting raw data...")
        itemProvider.loadDataRepresentation(forTypeIdentifier: hevcIdentifier) { (data, error) in
            // ... process H.265 data ...
        }
    } else {
        print("⚠️ 'public.hevc' not found. Falling back to generic movie type (likely H.264).")
        itemProvider.loadFileRepresentation(forTypeIdentifier: UTType.movie.identifier) { url, error in
            // ... process H.264 fallback ...
        }
    }
}
My Environment
Device: iPhone 15 Pro Max
iOS Version: iOS 18.5
Xcode Version: 16.2
My Questions
Are there specific conditions (e.g., the video being HDR/Dolby Vision, Cinematic, or stored in iCloud) under which PHPickerViewController's itemProvider would intentionally not offer the public.hevc type identifier, even for an HEVC video?
What is the definitive, recommended API sequence to guarantee that I receive the original, unmodified data stream for a video asset, ensuring that no transcoding to H.264 occurs during the process?
Any insight into why public.hevc might be missing from the registeredTypeIdentifiers for a known HEVC asset would be greatly appreciated. Thank you.
Because I want to control the grid size and number of HEIC images myself, I decided to perform HEVC encoding manually and then generate the HEIC image. Previously, I used VTCompressionSession to accomplish this task, and the results were satisfactory. It worked perfectly on iOS 16 through iOS 18 — in other words, it was able to generate correct HEVC encoding, and its CMFormatDescription should also have been correct, since I relied on it to generate the decoderConfig; otherwise, the final image would have decoding issues.
However, it can no longer generate a valid HEIC image on a physical device running iOS 26. Interestingly, it still works fine on the iOS 26 simulator — it only fails on real hardware. The abnormal result is that the image becomes completely black, although the image dimensions are still correct.
After my troubleshooting, I suspect that the encoding behavior of VTCompressionSession has been modified on iOS 26, which causes the final hvc1 encoding I pass in to be incorrect.
I created a VTCompressionSession using the following configuration.
var newSession: VTCompressionSession!
var status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: Int32(frameSize.width),
    height: Int32(frameSize.height),
    codecType: kCMVideoCodecType_HEVC,
    encoderSpecification: nil,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,
    refcon: nil,
    compressionSessionOut: &newSession
)
try check(status, VideoToolboxErrorDomain)

let properties: [CFString: Any] = [
    kVTCompressionPropertyKey_AllowFrameReordering: false,
    kVTCompressionPropertyKey_AllowTemporalCompression: false,
    kVTCompressionPropertyKey_RealTime: false,
    kVTCompressionPropertyKey_MaximizePowerEfficiency: false,
    kVTCompressionPropertyKey_ProfileLevel: profileLevel,
    kVTCompressionPropertyKey_Quality: quality.rawValue,
]
status = VTSessionSetProperties(newSession, propertyDictionary: properties as CFDictionary)
try check(status, VideoToolboxErrorDomain) {
    VTCompressionSessionInvalidate(newSession)
}
I then use the following code to encode each grid tile of the image.
let status = VTCompressionSessionEncodeFrame(
    session,
    imageBuffer: buffer,
    presentationTimeStamp: presentationTimeStamp,
    duration: frameDuration,
    frameProperties: nil,
    infoFlagsOut: nil
) { [weak self] status, _, sampleBuffer in
    // The output handler cannot throw, so errors are handled inside it.
    guard let self else { return }
    do {
        try check(status, VideoToolboxErrorDomain)
        guard let sampleBuffer else { return }
        let encodedImage = try self.encodedImage(from: sampleBuffer)
        // handle encodedImage
    } catch {
        // handle the encoding error
    }
}
try check(status, VideoToolboxErrorDomain)
If I try to display this abnormal image in the App, my console outputs the following error, so it can be inferred that the issue probably occurred during decoding.
createImageBlock:3029: *** ERROR: CGImageBlockCreate {0, 0, 2316, 6176} - data is NULL
callDecodeImage:2411: *** ERROR: decodeImageImp failed - NULL _blockArray
createImageBlock:3029: *** ERROR: CGImageBlockCreate {0, 0, 2316, 6176} - data is NULL
callDecodeImage:2411: *** ERROR: decodeImageImp failed - NULL _blockArray
createImageBlock:3029: *** ERROR: CGImageBlockCreate {0, 0, 2316, 6176} - data is NULL
callDecodeImage:2411: *** ERROR: decodeImageImp failed - NULL _blockArray
It needs to be emphasized again that this code used to work fine in the past, and the issue only occurs on an iOS 26 physical device. I noticed that iOS 26 has introduced many new properties, but I’m not sure whether some of these new properties must be set in the new system, and there’s no information about this in the official documentation.
I am developing an iOS app that uses YOLOv8 for object detection and aims to detect objects at 60 FPS using the UltraWide camera. My goal is to process every frame within captureOutput and utilize the detected data (such as coordinates) for each one.
I have a question regarding how background thread processing behaves in this scenario. Does the size of the YOLO model (n, s, m, etc.) or the weight of the operations inside captureOutput affect the number of frames that can be successfully processed?
Specifically, I would like to know if all frames will be processed sequentially with a delay due to heavy processing in the background, or if some frames will be dropped and not processed at all. Any insights on how to handle this would be greatly appreciated.
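Whether frames queue up or get dropped depends on AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames property, which defaults to true: frames that arrive while the delegate is still busy are discarded rather than queued. The sketch below is a simplified model of that behavior (no internal buffering, constant processing time), not a measurement; `processedFrameCount` is a hypothetical helper.

```swift
/// Counts how many frames a serial delegate handles when frames that arrive
/// while it is busy are dropped. Simplified model: no buffering, fixed cost.
func processedFrameCount(fps: Int, seconds: Int, processingTime: Double) -> Int {
    let frameInterval = 1.0 / Double(fps)
    var busyUntil = 0.0
    var processed = 0
    for index in 0..<(fps * seconds) {
        let arrival = Double(index) * frameInterval
        if arrival >= busyUntil {
            processed += 1
            busyUntil = arrival + processingTime
        }
        // Otherwise the frame arrived while the previous one was in flight and is dropped.
    }
    return processed
}

print(processedFrameCount(fps: 60, seconds: 1, processingTime: 0.010))
print(processedFrameCount(fps: 60, seconds: 1, processingTime: 0.025))
```

In this model a 10 ms detector keeps all 60 frames of a second, while a 25 ms one handles only every other frame (30), so a larger model would be expected to lower the number of frames actually processed rather than build up a growing delay.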
Thank you!
I have generated an FCPXML file, but I can't figure out what is wrong with it:
<?xml version="1.0"?>
<fcpxml version="1.11">
  <resources>
    <format id="r1" name="FFVideoFormat3840x2160p2997" frameDuration="1001/30000s" width="3840" height="2160" colorSpace="1-1-1 (Rec. 709)"/>
    <asset id="video0" name="11a(1-5).mp4" start="0s" hasVideo="1" videoSources="1" duration="6.81s">
      <media-rep kind="original-media" src="file:///Volumes/Dropbox/RealMedia Dropbox/Real Media/Media/Test/Test AE videos, City, testOLOLO/video/11a(1-5).mp4"/>
    </asset>
    <asset id="video1" name="12(4)r8 mute.mp4" start="0s" hasVideo="1" videoSources="1" duration="9.94s">
      <media-rep kind="original-media" src="file:///Volumes/Dropbox/RealMedia Dropbox/Real Media/Media/Test/Test AE videos, City, testOLOLO/video/12(4)r8 mute.mp4"/>
    </asset>
    <asset id="video2" name="13 mute.mp4" start="0s" hasVideo="1" videoSources="1" duration="6.51s">
      <media-rep kind="original-media" src="file:///Volumes/Dropbox/RealMedia Dropbox/Real Media/Media/Test/Test AE videos, City, testOLOLO/video/13 mute.mp4"/>
    </asset>
    <asset id="video3" name="13x (8,14,24,29,38).mp4" start="0s" hasVideo="1" videoSources="1" duration="45.55s">
      <media-rep kind="original-media" src="file:///Volumes/Dropbox/RealMedia Dropbox/Real Media/Media/Test/Test AE videos, City, testOLOLO/video/13x (8,14,24,29,38).mp4"/>
    </asset>
  </resources>
  <library>
    <event name="Untitled">
      <project name="Untitled Project" uid="28B2D4F3-05C4-44E7-8D0B-70A326135EDD" modDate="2024-04-17 15:44:26 -0400">
        <sequence format="r1" duration="4802798/30000s" tcStart="0s" tcFormat="NDF" audioLayout="stereo" audioRate="48k">
          <spine>
            <asset-clip ref="video0" offset="0/10000s" name="11a(1-5).mp4" duration="0/10000s" format="r1" tcFormat="NDF"/>
            <asset-clip ref="video1" offset="12119/10000s" name="12(4)r8 mute.mp4" duration="0/10000s" format="r1" tcFormat="NDF"/>
            <asset-clip ref="video2" offset="22784/10000s" name="13 mute.mp4" duration="0/10000s" format="r1" tcFormat="NDF"/>
            <asset-clip ref="video3" offset="34544/10000s" name="13x (8,14,24,29,38).mp4" duration="0/10000s" format="r1" tcFormat="NDF"/>
          </spine>
        </sequence>
      </project>
    </event>
  </library>
</fcpxml>
Any ideas?
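One thing that stands out in the file above is that every asset-clip has duration="0/10000s", and the offsets use a 10000 timescale rather than multiples of the format's frameDuration (1001/30000s). As a hedged sketch, and assuming Final Cut Pro expects times that land on frame boundaries, a value in seconds can be snapped to a whole number of frames like this; `fcpxmlTime` is a hypothetical helper:

```swift
/// Formats a duration in seconds as an FCPXML rational time string,
/// rounded to the nearest whole frame of the given frame duration.
func fcpxmlTime(seconds: Double, frameNumerator: Int = 1001, timescale: Int = 30000) -> String {
    let frames = Int((seconds * Double(timescale) / Double(frameNumerator)).rounded())
    return "\(frames * frameNumerator)/\(timescale)s"
}

print(fcpxmlTime(seconds: 6.81))
```

For the first asset (6.81 s) this gives 204 frames, i.e. 204204/30000s, which could replace the zero duration on the corresponding asset-clip.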
Hi,
I'm trying to wrap my head around Xcode's FxPlug. I already sell Final Cut Pro titles for a company. These titles were built in Motion.
However, they want me to move them into an app, and I'm looking for any help on how to accomplish this.
What the app should do:
Give users with an active subscription to our website access to the titles within FCPX, and deny access to anyone who is not an active subscriber.
Topic:
Media Technologies
SubTopic:
Video
Tags:
Professional Video Applications
MetalFX
wwdc2022-10103
I think I have the simplest possible Mac app trying to see if I can have VideoPlayer work in an Xcode Preview. It works in an iOS app project. In a Mac app project it builds and runs. But if I preview in Xcode it crashes.
The diagnostic says:
| [Remote] Unknown Error: The operation couldn’t be completed. XPC error received on message reply handler
|
| BSServiceConnectionErrorDomain (3):
| ==NSLocalizedFailureReason: XPC error received on message reply handler
| ==BSErrorCodeDescription: OperationFailed
The code I'm using is the exact code from the VideoPlayer documentation page. See this link.
Any ideas about this XPC error, and how to work around it?
I'm using Xcode 16.0 on macOS 14.6.1
The media services used for HLS streaming in an AVPlayer seem to crash if your segments are too large.
Anything over 20Mbps seems to cause a crash. I have tried adjusting the segment length to 1 second also and it didn't help.
I am remuxing Dolby Vision and HDR video and want to avoid transcoding and losing any metadata. However the segments are too large.
Is there a workaround for this? Otherwise it seems AVFoundation is not suited to high bitrate HLS and I should be using MPV or similar.
We have had the same video player in our app for at least 5 years with few issues, but since the iOS 18 update, videos that users have downloaded for offline viewing play back at 2x speed.
Hi, I'm working on an app with an infinitely scrollable video feed similar to TikTok or Instagram Reels. I initially thought it would be a good idea to cache videos in the file system, but after reading this post it seems that caching videos on the file system is not recommended: https://forums.developer.apple.com/forums/thread/649810#:~:text=If%20the%20videos%20can%20be%20reasonably%20cached%20in%20RAM%20then%20we%20would%20recommend%20that.%20Regularly%20caching%20video%20to%20disk%20contributes%20to%20NAND%20wear
The reason I am hesitant to cache videos to memory is because this will add up pretty quickly and increase memory pressure for my app.
After seeing the amount of documents and data storage that Instagram uses, it's obvious they are caching videos on the file system. So I was wondering what the current best practice is for caching in these kinds of apps?
When I play a local video (downloaded to the app sandbox), KVO on the AVPlayerItem reports the status AVPlayerItemStatusFailed with this error: Error Domain=AVFoundationErrorDomain Code=-11800 "The operation could not be completed" UserInfo={NSLocalizedFailureReason=An unknown error occurred (24), NSLocalizedDescription=The operation could not be completed, NSUnderlyingError=0x3004137e0 {Error Domain=NSPOSIXErrorDomain Code=24 "Too many open files"}}
Why?
I am creating an AVComposition and using it with an AVPlayer. The player works fine and doesn't consume much memory when I do not set playerItem.videoComposition. Here is the code that works without excessive memory usage:
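func configurePlayer(composition: AVMutableComposition, videoComposition: AVVideoComposition) {
    player.pause()
    player.replaceCurrentItem(with: nil)
    let playerItem = AVPlayerItem(asset: composition)
    // Assumed step (missing from the posted snippet): hand the new item to the player.
    player.replaceCurrentItem(with: playerItem)
    player.play()
}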
However, when I add playerItem.videoComposition = videoComposition, as in the code below, the memory usage becomes excessive:
func configurePlayer(composition: AVMutableComposition, videoComposition: AVVideoComposition) {
    player.pause()
    player.replaceCurrentItem(with: nil)
    let playerItem = AVPlayerItem(asset: composition)
    playerItem.videoComposition = videoComposition
    // Assumed step (missing from the posted snippet): hand the new item to the player.
    player.replaceCurrentItem(with: playerItem)
    player.play()
}
Issue Details:
The memory usage seems to depend on the number of video tracks in the composition, rather than their duration. For instance, two videos of 30 minutes each consume less memory than 40 videos of just 2 seconds each.
The excessive memory usage is showing up in the Other Processes section of Xcode's debug panel.
For reference, 42 videos, each less than 30 seconds, are using around 1.4 GB of memory.
I'm struggling to understand why adding videoComposition causes such high memory consumption, especially since it happens even when no layer instructions are applied. Any insights on how to address this would be greatly appreciated.
I initially thought the problem might be due to having too many layer instructions in the video composition, but this doesn't seem to be the case. Even when I set a videoComposition without any layer instructions, the memory consumption remains high.
Safari is supposed to support animated AVIF images since version 16, but the ones I've tested perform very poorly, even on an M4 Mac Mini running Sequoia 15.1.1.
I believe Safari delegates decoding to the operating system itself, so this issue also happens in Live Preview in the Finder when I try to preview a file.
Sample file here: https://s3.us-west-2.amazonaws.com/cdn.paintera.org/test/sample.avif
322KB file, 5 seconds long, 12fps
This plays perfectly in Chrome on macOS, but is slow and laggy in Safari and Live Preview (it takes about 6.5 seconds to finish the 5-second video).
Does anyone know how to fix or work around this issue?
Hello there,
I need to move through a video loaded in an AVPlayer one frame at a time, backward or forward. For that I tried AVPlayerItem's step(byCount:) method, and it works just fine.
However, I need to know when the step has actually happened, and as far as I have observed it is not immediate. If I check currentTime() just after calling the method it is unchanged, and if I check slightly later (how much later depends on the video itself) it shows the correct "jumped" time.
To achieve my goal I tried subclassing AVPlayerItem and implementing my own async method using NotificationCenter and the timeJumpedNotification, assuming it would be delivered when the time actually jumps, but that is not the case.
Here is my "stripped" and simplified version of the custom Player Item:
import AVFoundation

final class PlayerItem: AVPlayerItem {
    private var jumpCompletion: ((CMTime) -> Void)?

    override init(asset: AVAsset, automaticallyLoadedAssetKeys: [String]?) {
        super.init(asset: asset, automaticallyLoadedAssetKeys: automaticallyLoadedAssetKeys)
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(timeDidChange(_:)),
            name: AVPlayerItem.timeJumpedNotification,
            object: self
        )
    }

    deinit {
        NotificationCenter.default.removeObserver(self, name: AVPlayerItem.timeJumpedNotification, object: self)
        jumpCompletion = nil
    }

    @discardableResult func step(by count: Int) async -> CMTime {
        await withCheckedContinuation { continuation in
            step(by: count) { time in
                continuation.resume(returning: time)
            }
        }
    }

    func step(by count: Int, completion: @escaping (CMTime) -> Void) {
        // Only one pending step at a time; otherwise report the current time immediately.
        guard jumpCompletion == nil else {
            completion(currentTime())
            return
        }
        jumpCompletion = completion
        step(byCount: count)
    }

    @objc private func timeDidChange(_ notification: Notification) {
        switch notification.name {
        case AVPlayerItem.timeJumpedNotification where notification.object as? AVPlayerItem == self:
            jumpCompletion?(currentTime())
            jumpCompletion = nil
        default:
            return
        }
    }
}
In short, the notification is never delivered, so the above does not work.
I guess the key is what the docs say about timeJumpedNotification:
"A notification the system posts when a player item’s time changes discontinuously."
So step(byCount:) is apparently not considered a discontinuous operation and doesn't trigger it.
It would be really helpful if somebody could help, as I don't want to use seek(to:toleranceBefore:toleranceAfter:), mainly because it is not accurate for landing on the exact next/previous frame: the video might be VFR, which sometimes causes repeated frames or even skipped ones.
Thanks a lot
Hello, I am trying to get the new iPhone 16 Pro to achieve 4K 120 fps encoding with the video feed coming from the default wide-angle camera on the back. We are using the Apple API to capture the individual frames from the camera as they are processed, and we receive them in this callback:
// this is the main callback function to handle video frames captured
func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
We then take these frames as they come in and encode them using VideoToolbox. After they are encoded, they are added to a ring buffer so we can access them later.
The problem is that when we are encoding these frames on an iPhone 16 Pro, we are only reaching 80-90fps instead of 120fps. We have removed as much processing as we can. We get some small attributes about the frame when it comes in, encode the frame, and then add it to our ring buffer.
I have attached a sample project that is stripped down as far as possible to the basic task of encoding 4K 120 fps footage. Inside the sample app, there is an FPS and PPS display. FPS represents how many frames are coming in per second from the camera, and PPS represents how many frames we are processing (encoding) per second.
Link to sample project: https://github.com/jake-fishtech/EncoderPerformance
Thank you for any help or suggestions.
Capturing more than one display is no longer working with macOS Sequoia.
We have a product that allows users to capture up to 2 displays/screens. Our application is using gstreamer which in turn is based on AVFoundation.
I found a quick way to replicate the issue by just running 2 captures from separate terminals. Assuming display 1 has device index 0, and display 2 has device index 1, here are the steps:
install gstreamer with
brew install gstreamer
Then open 2 terminal windows and launch the following processes:
terminal 1 (device-index:0):
gst-launch-1.0 avfvideosrc -e device-index=0 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
terminal 2 (device-index:1):
gst-launch-1.0 avfvideosrc -e device-index=1 capture-screen=true ! queue ! videoscale ! video/x-raw,width=640,height=360 ! videoconvert ! osxvideosink
The first process that is launched will show the screen, the second process launched will not.
Testing this on macOS Ventura and Sonoma works as expected, showing both screens.
I submitted the same issue on Feedback Assistant: FB15900976