VideoToolbox

RSS for tag

Work directly with hardware-accelerated video encoding and decoding capabilities using VideoToolbox.

Posts under VideoToolbox tag

122 Posts

Post

Replies

Boosts

Views

Activity

Reducing storage of similar PNGs by compressing them into a video and retrieving them losslessly--possibility or dumb idea?
My app stores and transports lots of groups of similar PNGs. These aren't compressed well by official algorithms like .lzfse, .lz4, .lzbitmap... not even bz2, but I realized that they are well-suited for compression by video codecs since they're highly similar to one another. I ran an experiment where I compressed a dozen images into an HEVCWithAlpha .mov via AVAssetWriter, and the compression ratio was fantastic, but when I retrieved the PNGs via AVAssetImageGenerator there were lots of artifacts which simply wasn't acceptable. Maybe I'm doing something wrong, or maybe I'm chasing something that doesn't exist. Is there a way to use video compression like a specialized archive to store and retrieve PNGs losslessly while retaining alpha? I have no intention of using the videos except as condensed storage. Any suggestions on how to reduce storage size of many large PNGs are also welcome. I also tried using HEVC instead of PNG via the new UIImage.hevcData(), but the decompression/processing times were just insane (5000%+ increase), on top of there being fatal errors when using async.
18
0
2.2k
Jun ’24
ProRes 4444 blocky compression artifacts
I’m creating a objective C command-line utility to encode RAW image sequences to ProRes 4444, but I’m encountering, blocky compression artifacts in the ProRes 4444 video output. To test the integrity of the image data before encoding to ProRes, I added a snippet in my encoding function that saves a 16-bit PNG before encoding to ProRes and the PNG looks perfect, I can see all detail in every part of the image dynamic range. Here’s a comparison between the 16-bit PNG(on the right) and the ProRes 4444 output. (on the left) As a further test, I re-encoded the ‘test PNG’ to ProRes 4444 using DaVinci Resolve, and the ProRes4444 output video from Resolve doesn’t have any blocky compression artifacts. Looks identical. In short, this is what the utility does: Unpacks the 12-bit raw data into 16-bit values. After unpacking, the raw data is debayered to convert it into a standard color image format (BGR) using OpenCV. Scale the debayered pixel values from their original 12-bit depth to fit into a 16-bit range. Up to this point everything is fine and confirmed by saving 16bit PNGs. The images are encoded to ProRes 4444 using the AVFoundation framework. The pixel buffers are created and managed using dictionary method with ‘kCVPixelFormatType_64RGBALE’. I need help figuring this out, I’m a real novice when it comes to AVfoundation/encoding to ProRes. See relevant parts of my 'encodeToProRes' function: void encodeToProRes(const std::string &outputPath, const std::vector<std::string> &rawPaths, const std::string &proResFlavor) { NSError *error = nil; NSURL *url = [NSURL fileURLWithPath:[NSString stringWithUTF8String:outputPath.c_str()]]; AVAssetWriter *assetWriter = [AVAssetWriter assetWriterWithURL:url fileType:AVFileTypeQuickTimeMovie error:&error]; if (error) { std::cerr << "Error creating AVAssetWriter: " << error.localizedDescription.UTF8String << std::endl; return; } // Load the first image to get the dimensions std::cout << "Debayering the first image to get dimensions..." << std::endl; Mat firstImage; int width = 5320; int height = 3900; if (!debayer_image(rawPaths[0], firstImage, width, height)) { std::cerr << "Error debayering the first image" << std::endl; return; } width = firstImage.cols; height = firstImage.rows; // Save the first frame as a PNG 16-bit image for validation std::string pngFilePath = outputPath + "_frame1.png"; if (!imwrite(pngFilePath, firstImage)) { std::cerr << "Error: Failed to save the first frame as a PNG image" << std::endl; } else { std::cout << "First frame saved as PNG: " << pngFilePath << std::endl; } NSString *codecKey = nil; if (proResFlavor == "4444") { codecKey = AVVideoCodecTypeAppleProRes4444; } else if (proResFlavor == "422HQ") { codecKey = AVVideoCodecTypeAppleProRes422HQ; } else if (proResFlavor == "422") { codecKey = AVVideoCodecTypeAppleProRes422; } else if (proResFlavor == "LT") { codecKey = AVVideoCodecTypeAppleProRes422LT; } else { std::cerr << "Error: Invalid ProRes flavor specified: " << proResFlavor << std::endl; return; } NSDictionary *outputSettings = @{ AVVideoCodecKey: codecKey, AVVideoWidthKey: @(width), AVVideoHeightKey: @(height) }; AVAssetWriterInput *videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:outputSettings]; videoInput.expectsMediaDataInRealTime = YES; NSDictionary *pixelBufferAttributes = @{ (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_64RGBALE), (id)kCVPixelBufferWidthKey: @(width), (id)kCVPixelBufferHeightKey: @(height) }; AVAssetWriterInputPixelBufferAdaptor *adaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoInput sourcePixelBufferAttributes:pixelBufferAttributes]; ... [assetWriter startSessionAtSourceTime:kCMTimeZero]; CMTime frameDuration = CMTimeMake(1, 24); // Frame rate of 24 fps int numFrames = static_cast<int>(rawPaths.size()); ... // Encoding thread std::thread encoderThread([&]() { int frameIndex = 0; std::vector<CVPixelBufferRef> pixelBufferBuffer; while (frameIndex < numFrames) { std::unique_lock<std::mutex> lock(queueMutex); queueCondVar.wait(lock, [&]() { return !frameQueue.empty() || debayeringFinished; }); if (!frameQueue.empty()) { auto [index, debayeredImage] = frameQueue.front(); frameQueue.pop(); lock.unlock(); if (index == frameIndex) { cv::Mat rgbaImage; cv::cvtColor(debayeredImage, rgbaImage, cv::COLOR_BGR2RGBA); CVPixelBufferRef pixelBuffer = NULL; CVReturn result = CVPixelBufferPoolCreatePixelBuffer(NULL, adaptor.pixelBufferPool, &pixelBuffer); if (result != kCVReturnSuccess) { std::cerr << "Error: Could not create pixel buffer" << std::endl; dispatch_group_leave(dispatchGroup); return; } CVPixelBufferLockBaseAddress(pixelBuffer, 0); void *pxdata = CVPixelBufferGetBaseAddress(pixelBuffer); for (int row = 0; row < height; ++row) { memcpy(static_cast<uint8_t*>(pxdata) + row * CVPixelBufferGetBytesPerRow(pixelBuffer), rgbaImage.ptr(row), width * 8); } CVPixelBufferUnlockBaseAddress(pixelBuffer, 0); pixelBufferBuffer.push_back(pixelBuffer); ... Thanks very much!
1
0
1.3k
Jun ’24
VideoToolBox on OSX15.0
Hello, Notice that when using h.265 VideoToolBox encoding with Handbrake (latest snapshots) the resulting output is larger than the 264 source. Obviously this was the case for OSX 14.x and below. Is this a known "regression" ? Thanks.
1
0
835
Jun ’24
iOS to Android H264 encoding issue.
I'm trying to cast the screen from an iOS device to an Android device. I'm leveraging ReplayKit on iOS to capture the screen and VideoToolbox for compressing the captured video data into H.264 format using CMSampleBuffers. Both iOS and Android are configured for H.264 compression and decompression. While screen casting works flawlessly within the same platform (iOS to iOS or Android to Android), I'm encountering an error ("not in avi mode") on the Android receiver when casting from iOS. My research suggests that the underlying container formats for H.264 might differ between iOS and Android. Data transmission over the TCP socket seems to be functioning correctly. My question is: Is there a way to ensure a common container format for H.264 compression and decompression across iOS and Android platforms? Here's a breakdown of the iOS sender details: Device: iPhone 13 mini running iOS 17 Development Environment: Xcode 15 with a minimum deployment target of iOS 16 Screen Capture: ReplayKit for capturing the screen and obtaining CMSampleBuffers Video Compression: VideoToolbox for H.264 compression Compression Properties: kVTCompressionPropertyKey_ConstantBitRate: 6144000 (bitrate) kVTCompressionPropertyKey_ProfileLevel: kVTProfileLevel_H264_Main_AutoLevel (profile and level) kVTCompressionPropertyKey_MaxKeyFrameInterval: 60 (maximum keyframe interval) kVTCompressionPropertyKey_RealTime: true (real-time encoding) kVTCompressionPropertyKey_Quality: 1 (lowest quality) NAL Unit Handling: Custom header is added to NAL units Android Receiver Details: Device: RedMi 7A running Android 10 Video Decoding: MediaCodec API for receiving and decoding the H.264 stream
0
0
1.1k
May ’24
Question regarding the kVTVideoEncoderList_IsHardwareAccelerated flag
I am a bit confused on whether certain Video Toolbox (VT) encoders support hardware acceleration or not. When I query the list of VT encoders (VTCopyVideoEncoderList(nil,&encoderList)) on an iPhone 14 Pro device, for avc1 (AVC / H.264) and hevc1 (HEVC / H.265) encoders, the kVTVideoEncoderList_IsHardwareAccelerated flag is not there, which -based on the documentation found on the VTVideoEncoderList.h- means that the encoders do not support hardware acceleration: optional. CFBoolean. If present and set to kCFBooleanTrue, indicates that the encoder is hardware accelerated. In fact, no encoders from this list return this flag as true and most of them do not include the flag at all on their dictionaries. On the other hand, when I create a compression session using the VTCompressionSessionCreate() and pass the kVTVideoEncoderSpecification_EnableHardwareAcceleratedVideoEncoder as true in the encoder specifications, after querying the kVTCompressionPropertyKey_UsingHardwareAcceleratedVideoEncoder using the following code, I get a CFBoolean value of true for both H.264 and H.265 encoder. In fact, I get a true value (for both of the aforementioned encoders) even if I don't specify the kVTVideoEncoderSpecification_EnableHardwareAcceleratedVideoEncoder during the creation of the compression session (note here that this flag was introduced in iOS 17.4 ^1). So the question is: Are those encoders actually hardware accelerated on my device, and if so, why isn't that reflected on the VTCopyVideoEncoderList() call?
3
1
1.2k
May ’24
iOS App crashes while converting CVPixelbuffer to JPG
I have been seeing some crash reports for my app on some devices (not all of them). The crash occurs while converting a CVPixelBuffer captured from Video to a JPG using VTCreateCGImageFromCVPixelBuffer from VideoToolBox. I have not been able to reproduce the crash on local devices, even under adverse memory conditions (many apps running in the background). The field crash reports show that VTCreateCGImageFromCVPixelBuffer does the conversion in another thread and that thread crashed at call to vConvert_420Yp8_CbCr8ToARGB8888_vec. Any suggestions on how to debug this further would be helpful.
2
0
1.3k
May ’24
Does CVE-2024-1580 affect my app?
I have an image viewing app with support for avif (and avis) images. I'm trying to figure out if the recent bug in CoreMedia (dav1d) affects my app. The apple security update: https://support.apple.com/en-gb/HT214097 The vulnerable code path in dav1d is only reached when c->n_fc > 1 (https://code.videolan.org/videolan/dav1d/-/blob/2b475307dc11be9a1c3cc4358102c76a7f386a51/src/decode.c#L2845), where c is the dav1d context. With some reverse engineering, the way I see CMPhoto calling into VideoToolBox (which internally calls into AV1SW.videodecoder, which is a wrapper around dav1d), the max frame delay is hardcoded to 1 in the dav1d settings which intern means that c->n_fc in dav1d is always 1. The vulnerable code path in dav1d is only reached when c->n_fc > 1 (https://code.videolan.org/videolan/dav1d/-/blob/2b475307dc11be9a1c3cc4358102c76a7f386a51/src/decode.c#L2845). From my understand, this should mean that my app isn't affected. The apple security update however clearly mentions that "Processing an image may lead to arbitrary code execution". Surely I'm missing something?
0
0
927
Apr ’24
Sometimes when video is encoded using the Video Toolbox Encoder for web live streaming, the decoder output always has a 4 frame latency
I am Using VideoToolbox VTCompressionSession For Encoding The Frame in H264 Format Which I will send through web socket to a browser. The Received Frames Will be Decoded and Output Will be rendered in the Website. Now, when using Some encoders the video is rendered always with four frame latency. How Frame is sent to server : start>------------ f1 ------------ f2 ------------ f3 ------------ f4 ------------- f5 ... How rendering is happening : start>-------------------------------------------------------------------------- f1 ------------ f2 ------------ f3 ------------ f4 ----------- ... This Sometime becomes two frame latency and Sometime it becomes sixteen frame latency so the usability is getting affected. Im using this configuration in videotoolbox's VTCompressionSession: kVTCompressionPropertyKey_AverageBitRate=3MB kVTCompressionPropertyKey_ExpectedFrameRate=24 kVTCompressionPropertyKey_RealTime=true kVTCompressionPropertyKey_ProfileLevel=kVTProfileLevel_H264_High_AutoLevel kVTCompressionPropertyKey_AllowFrameReordering = false kVTCompressionPropertyKey_MaxKeyFrameInterval=1000 With Same Configuration i am able to achieve 1 in - 1 out with com.apple.videotoolbox.videoencoder.h264.gva. This Issue Is replication with Encoder com.apple.videotoolbox.videoencoder.ave.avc Not Sure if its Encoder Specific. I have also seen that there are difference in VUI Parameters between encoded output of both encoders. I want to know if there is something i could do to solve this issue from the Encoder Configuration or another API which is provided by the VideoToolBox to ensure that frames are decoded and rendered at the same time by Decoder. Thanks in Advance....
0
0
855
Mar ’24
How on earth do you compress an MV-HEVC video file in-app without stripping the video of its 3D properties?
So I've been trying for weeks now to implement a compression mechanism into my app project that compresses MV-HEVC video files in-app without stripping videos of their 3D properties, but every single implementation I have tried has either stripped the encoded MV-HEVC video file of its 3D properties (making the video monoscopic), or has crashed with a fatal error. I've read the Reading multiview 3D video files and Converting side-by-side 3D video to multiview HEVC documentation files, but was unable to myself come out with anything useful. My question therefore is: How do you go about compressing/encoding an MV-HEVC video file in-app whilst preserving the stereoscopic/3D properties of that MV-HEVC video file? Below is the best implementation I was able to come up with (which simply compresses uploaded MV-HEVC videos with an arbitrary bit rate). With this implementation (my compressVideo function), the MV-HEVC files that go through it are compressed fine, but the final result is the loss of that MV-HEVC video file's stereoscopic/3D properties. If anyone could point me in the right direction with anything it would be greatly, greatly appreciated. My current implementation (that strips MV-HEVC videos of their stereoscopic/3D properties): static func compressVideo(sourceUrl: URL, bitrate: Int, completion: @escaping (Result<URL, Error>) -> Void) { let asset = AVAsset(url: sourceUrl) asset.loadTracks(withMediaType: .video) { videoTracks, videoError in guard let videoTrack = videoTracks?.first, videoError == nil else { completion(.failure(videoError ?? NSError(domain: "VideoUploader", code: -1, userInfo: [NSLocalizedDescriptionKey: "Failed to load video track"]))) return } asset.loadTracks(withMediaType: .audio) { audioTracks, audioError in guard let audioTrack = audioTracks?.first, audioError == nil else { completion(.failure(audioError ?? NSError(domain: "VideoUploader", code: -2, userInfo: [NSLocalizedDescriptionKey: "Failed to load audio track"]))) return } let outputUrl = sourceUrl.deletingLastPathComponent().appendingPathComponent(UUID().uuidString).appendingPathExtension("mov") guard let assetReader = try? AVAssetReader(asset: asset), let assetWriter = try? AVAssetWriter(outputURL: outputUrl, fileType: .mov) else { completion(.failure(NSError(domain: "VideoUploader", code: -3, userInfo: [NSLocalizedDescriptionKey: "AssetReader/Writer initialization failed"]))) return } let videoReaderSettings: [String: Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32ARGB] let videoSettings: [String: Any] = [ AVVideoCompressionPropertiesKey: [AVVideoAverageBitRateKey: bitrate], AVVideoCodecKey: AVVideoCodecType.hevc, AVVideoHeightKey: videoTrack.naturalSize.height, AVVideoWidthKey: videoTrack.naturalSize.width ] let assetReaderVideoOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: videoReaderSettings) let assetReaderAudioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: nil) if assetReader.canAdd(assetReaderVideoOutput) { assetReader.add(assetReaderVideoOutput) } else { completion(.failure(NSError(domain: "VideoUploader", code: -4, userInfo: [NSLocalizedDescriptionKey: "Couldn't add video output reader"]))) return } if assetReader.canAdd(assetReaderAudioOutput) { assetReader.add(assetReaderAudioOutput) } else { completion(.failure(NSError(domain: "VideoUploader", code: -5, userInfo: [NSLocalizedDescriptionKey: "Couldn't add audio output reader"]))) return } let audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: nil) let videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings) videoInput.transform = videoTrack.preferredTransform assetWriter.shouldOptimizeForNetworkUse = true assetWriter.add(videoInput) assetWriter.add(audioInput) assetReader.startReading() assetWriter.startWriting() assetWriter.startSession(atSourceTime: CMTime.zero) let videoInputQueue = DispatchQueue(label: "videoQueue") let audioInputQueue = DispatchQueue(label: "audioQueue") videoInput.requestMediaDataWhenReady(on: videoInputQueue) { while videoInput.isReadyForMoreMediaData { if let sample = assetReaderVideoOutput.copyNextSampleBuffer() { videoInput.append(sample) } else { videoInput.markAsFinished() if assetReader.status == .completed { assetWriter.finishWriting { completion(.success(outputUrl)) } } break } } } audioInput.requestMediaDataWhenReady(on: audioInputQueue) { while audioInput.isReadyForMoreMediaData { if let sample = assetReaderAudioOutput.copyNextSampleBuffer() { audioInput.append(sample) } else { audioInput.markAsFinished() break } } } } } }
9
1
2.0k
Mar ’24
Objective C implementation of Spatial Video(MV-HEVC) Maker
Hi everyone, I need to add spatial video maker in my app which was wrote in objective-c. I found some reference code by swift, can you help me with converting the code to objective -c? let left = CMTaggedBuffer( tags: [.stereoView(.leftEye), .videoLayerID(leftEyeLayerIndex)], pixelBuffer: leftEyeBuffer) let right = CMTaggedBuffer( tags: [.stereoView(.rightEye), .videoLayerID(rightEyeLayerIndex)], pixelBuffer: rightEyeBuffer) let result = adaptor.appendTaggedBuffers( [left, right], withPresentationTime: leftPresentationTs)
2
1
1.1k
Mar ’24
Decompress Video toolbox video on non-Apple hardware?
Does Video Toolbox’s compression session yield data I can decompress on a different device that doesn’t have Apple’s decompression? i.e. so I can network data to other devices that aren’t necessarily Apple? or is the format proprietary rather than just regular h.264 (for example)? If I can decompress without video toolbox, may I have reference to some examples for how to do this using cross-platform APIs? Maybe FFMPEG has something?
1
0
1.1k
Feb ’24
Seeking Advice: Best Apple Laptop for Coding?
Hey Developers, I'm on the hunt for a new Apple laptop geared towards coding, and I'd love to tap into your collective wisdom. If you have recommendations or personal experiences with a specific model that excels in the coding realm, please share your insights. Looking for optimal performance and a seamless coding experience. Your input is gold – thanks a bunch!
2
1
6.2k
Feb ’24
Stereo video HLS
I am trying to set up HLS with MV HEVC. I have an MV HEVC MP4 converted with AVAssetWriter that plays as a "spatial video" in Photos in the simulator. I've used ffmpeg to fragment the video for HLS (sample m3u8 file below). The HLS of the mp4 plays on a VideoMaterial with an AVPlayer in the simulator, but it is hard to determine if the streamed video is stereo. Is there any guidance on confirming that the streamed mp4 video is properly being read as stereo? Additionally, I see that REQ-VIDEO-LAYOUT is required for multivariant HLS. However if there is ONLY stereo video in the playlist is it needed? Are there any other configurations need to make the device read as stereo? Sample m3u8 playlist #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:13 #EXT-X-MEDIA-SEQUENCE:0 #EXTINF:12.512500, sample_video0.ts #EXTINF:8.341667, sample_video1.ts #EXTINF:12.512500, sample_video2.ts #EXTINF:8.341667, sample_video3.ts #EXTINF:8.341667, sample_video4.ts #EXTINF:12.433222, sample_video5.ts #EXT-X-ENDLIST
5
1
2.4k
Dec ’23
使用VideoTool库进行h264/h265编码
我使用VideoToolBox库进行编码后再通过NDI发送到网络上,是可以成功再苹果电脑上接收到ndi源屏显示画面的,但是在windows上只能ndi源名称,并没有画面显示。 我想知道是不是使用VideoToolBox库无法在windows上进行正确编码,这个问题需要如何解决
0
0
802
Dec ’23
iOS17 the encoder sets an average bit rate of 5Mbps. After one hour of live broadcast, the bit rate will suddenly drop to 1Mbps and cannot be restored.
iOS17, the encoder sets an average bit rate of 5Mbps. in the first 25 minutes:the encoding rate is normal. 25 minutes-30minutes:The encoding bit rate will be reduced to 4M but can be restoredl. 30 minutes-70minutes:the encoding rate is normal. 70minutes-late:the bit rate will suddenly drop to 1Mbps and cannot be restored. As shown in the figure below, the yellow line is the frame rate and the green line is the code rate. The Code show as below - (void)_setBitrate:(NSUInteger)bitrate forSession:(VTCompressionSessionRef)session { NSParameterAssert(session && bitrate); OSStatus status = VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate, (__bridge CFTypeRef)@(bitrate)); if (status != noErr) NSLog(@"set AverageBitRate error"); NSArray *limit = @[@(bitrate * 1.5/8), @(1)]; status = VTSessionSetProperty(session, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)limit); if (status != noErr) NSLog(@"set DataRateLimits error"); } The problem only occurs in iOS17. Does anyone know what the reason is?
2
0
1k
Dec ’23
Request for Videotoolbox Encoder to Support External Control for Frame Dropping
We are currently working on a real-time, low-latency solution for video conferencing scenarios and have encountered some issues with the current implementation of the encoder. We need a feature enhancement for the Videotoolbox encoder. In our use case, we need to control the encoding quality, which requires setting the maximum encoding QP. However, the kVTCompressionPropertyKey_MaxAllowedFrameQP only takes effect in the kVTVideoEncoderSpecification_EnableLowLatencyRateControl mode. In this mode, when the maximum QP is limited and the bitrate is insufficient, the encoder will drop frames. Our desired scenario is for the encoder to not actively drop frames when the maximum QP is limited. Instead, when the bitrate is insufficient, the encoder should be able to encode the frame with the maximum QP, allowing the frame size to be larger. This would provide a more seamless experience for users in video conferencing situations, where maintaining consistent video quality is crucial. It is worth noting that Android has already implemented this feature in Android 12, which demonstrates the value and feasibility of this enhancement. We kindly request that you consider adding support for external control of frame dropping in the Videotoolbox encoder to accommodate our needs. This enhancement would greatly benefit our project and others that require real-time, low-latency video encoding solutions.
0
0
721
Nov ’23
Unable to manually parse the HEVC video with alpha format
I wish to parse the bitstream of HEVC video with alpha (specific video format reference WWDC2019: https://developer.apple.com/videos/play/wwdc2019/506). Taking the 'puppets_with_alpha_hevc.mov' file from 'Using HEVC Video with Alpha' as an example, I would first extract the HEVC bitstream, then parse its fields. When it comes to the VPS field, as I reach the vps_extension, I find that the bitstream in 'puppets_with_alpha_hevc.mov' does not conform to the HEVC standard document, preventing further parsing. Besides the 'HEVC Video with Alpha Interoperability Profile.pdf', are there any more detailed documents describing the HEVC video with alpha format? Also, is there anyone who can encode or decode HEVC with alpha videos on systems other than macOS?
0
0
782
Nov ’23
Linker Error when using VTCompressionSessionEncodeMultiImageFrame in Swift
I'm working on a MV-HEVC transcoder, based on the VTEncoderForTranscoding sample code. In swift the following code snippet generates a linker error on macOS 14.0 and 14.1. let err = VTCompressionSessionEncodeMultiImageFrame(compressionSession, taggedBuffers: taggedBuffers, presentationTimeStamp: pts, duration: .invalid, frameProperties: nil, infoFlagsOut: nil) { (status: OSStatus, infoFlags: VTEncodeInfoFlags, sbuf: CMSampleBuffer?) -> Void in outputHandler(status, infoFlags, sbuf, thisFrameNumber) } Error: ld: Undefined symbols: VideoToolbox.VTCompressionSessionEncodeMultiImageFrame(_: __C.VTCompressionSessionRef, taggedBuffers: [CoreMedia.CMTaggedBuffer], presentationTimeStamp: __C.CMTime, duration: __C.CMTime, frameProperties: __C.CFDictionaryRef?, infoFlagsOut: Swift.UnsafeMutablePointer<__C.VTEncodeInfoFlags>?, outputHandler: (Swift.Int32, __C.VTEncodeInfoFlags, __C.CMSampleBufferRef?) -> ()) -> Swift.Int32, referenced from: (3) suspend resume partial function for VTEncoderForTranscoding_Swift.(compressFrames in _FE7277D5F28D8DABDFC10EA0164D825D)(from: VTEncoderForTranscoding_Swift.VideoSource, options: VTEncoderForTranscoding_Swift.Options, expectedFrameRate: Swift.Float, outputHandler: @Sendable (Swift.Int32, __C.VTEncodeInfoFlags, __C.CMSampleBufferRef?, Swift.Int) -> ()) async throws -> () in VTEncoderForTranscoding.o Using VTCompressionSessionEncodeMultiImageFrameWithOutputHandler in ObjC doesn't trigger a linker error. Anybody knows how to get it to work in Swift?
0
0
849
Nov ’23
Reducing storage of similar PNGs by compressing them into a video and retrieving them losslessly--possibility or dumb idea?
My app stores and transports lots of groups of similar PNGs. These aren't compressed well by official algorithms like .lzfse, .lz4, .lzbitmap... not even bz2, but I realized that they are well-suited for compression by video codecs since they're highly similar to one another. I ran an experiment where I compressed a dozen images into an HEVCWithAlpha .mov via AVAssetWriter, and the compression ratio was fantastic, but when I retrieved the PNGs via AVAssetImageGenerator there were lots of artifacts which simply wasn't acceptable. Maybe I'm doing something wrong, or maybe I'm chasing something that doesn't exist. Is there a way to use video compression like a specialized archive to store and retrieve PNGs losslessly while retaining alpha? I have no intention of using the videos except as condensed storage. Any suggestions on how to reduce storage size of many large PNGs are also welcome. I also tried using HEVC instead of PNG via the new UIImage.hevcData(), but the decompression/processing times were just insane (5000%+ increase), on top of there being fatal errors when using async.
Replies
18
Boosts
0
Views
2.2k
Activity
Jun ’24
ProRes 4444 blocky compression artifacts
I’m creating a objective C command-line utility to encode RAW image sequences to ProRes 4444, but I’m encountering, blocky compression artifacts in the ProRes 4444 video output. To test the integrity of the image data before encoding to ProRes, I added a snippet in my encoding function that saves a 16-bit PNG before encoding to ProRes and the PNG looks perfect, I can see all detail in every part of the image dynamic range. Here’s a comparison between the 16-bit PNG(on the right) and the ProRes 4444 output. (on the left) As a further test, I re-encoded the ‘test PNG’ to ProRes 4444 using DaVinci Resolve, and the ProRes4444 output video from Resolve doesn’t have any blocky compression artifacts. Looks identical. In short, this is what the utility does: Unpacks the 12-bit raw data into 16-bit values. After unpacking, the raw data is debayered to convert it into a standard color image format (BGR) using OpenCV. Scale the debayered pixel values from their original 12-bit depth to fit into a 16-bit range. Up to this point everything is fine and confirmed by saving 16bit PNGs. The images are encoded to ProRes 4444 using the AVFoundation framework. The pixel buffers are created and managed using dictionary method with ‘kCVPixelFormatType_64RGBALE’. I need help figuring this out, I’m a real novice when it comes to AVfoundation/encoding to ProRes. See relevant parts of my 'encodeToProRes' function: void encodeToProRes(const std::string &outputPath, const std::vector<std::string> &rawPaths, const std::string &proResFlavor) { NSError *error = nil; NSURL *url = [NSURL fileURLWithPath:[NSString stringWithUTF8String:outputPath.c_str()]]; AVAssetWriter *assetWriter = [AVAssetWriter assetWriterWithURL:url fileType:AVFileTypeQuickTimeMovie error:&error]; if (error) { std::cerr << "Error creating AVAssetWriter: " << error.localizedDescription.UTF8String << std::endl; return; } // Load the first image to get the dimensions std::cout << "Debayering the first image to get dimensions..." << std::endl; Mat firstImage; int width = 5320; int height = 3900; if (!debayer_image(rawPaths[0], firstImage, width, height)) { std::cerr << "Error debayering the first image" << std::endl; return; } width = firstImage.cols; height = firstImage.rows; // Save the first frame as a PNG 16-bit image for validation std::string pngFilePath = outputPath + "_frame1.png"; if (!imwrite(pngFilePath, firstImage)) { std::cerr << "Error: Failed to save the first frame as a PNG image" << std::endl; } else { std::cout << "First frame saved as PNG: " << pngFilePath << std::endl; } NSString *codecKey = nil; if (proResFlavor == "4444") { codecKey = AVVideoCodecTypeAppleProRes4444; } else if (proResFlavor == "422HQ") { codecKey = AVVideoCodecTypeAppleProRes422HQ; } else if (proResFlavor == "422") { codecKey = AVVideoCodecTypeAppleProRes422; } else if (proResFlavor == "LT") { codecKey = AVVideoCodecTypeAppleProRes422LT; } else { std::cerr << "Error: Invalid ProRes flavor specified: " << proResFlavor << std::endl; return; } NSDictionary *outputSettings = @{ AVVideoCodecKey: codecKey, AVVideoWidthKey: @(width), AVVideoHeightKey: @(height) }; AVAssetWriterInput *videoInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo outputSettings:outputSettings]; videoInput.expectsMediaDataInRealTime = YES; NSDictionary *pixelBufferAttributes = @{ (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_64RGBALE), (id)kCVPixelBufferWidthKey: @(width), (id)kCVPixelBufferHeightKey: @(height) }; AVAssetWriterInputPixelBufferAdaptor *adaptor = [AVAssetWriterInputPixelBufferAdaptor assetWriterInputPixelBufferAdaptorWithAssetWriterInput:videoInput sourcePixelBufferAttributes:pixelBufferAttributes]; ... [assetWriter startSessionAtSourceTime:kCMTimeZero]; CMTime frameDuration = CMTimeMake(1, 24); // Frame rate of 24 fps int numFrames = static_cast<int>(rawPaths.size()); ... // Encoding thread std::thread encoderThread([&]() { int frameIndex = 0; std::vector<CVPixelBufferRef> pixelBufferBuffer; while (frameIndex < numFrames) { std::unique_lock<std::mutex> lock(queueMutex); queueCondVar.wait(lock, [&]() { return !frameQueue.empty() || debayeringFinished; }); if (!frameQueue.empty()) { auto [index, debayeredImage] = frameQueue.front(); frameQueue.pop(); lock.unlock(); if (index == frameIndex) { cv::Mat rgbaImage; cv::cvtColor(debayeredImage, rgbaImage, cv::COLOR_BGR2RGBA); CVPixelBufferRef pixelBuffer = NULL; CVReturn result = CVPixelBufferPoolCreatePixelBuffer(NULL, adaptor.pixelBufferPool, &pixelBuffer); if (result != kCVReturnSuccess) { std::cerr << "Error: Could not create pixel buffer" << std::endl; dispatch_group_leave(dispatchGroup); return; } CVPixelBufferLockBaseAddress(pixelBuffer, 0); void *pxdata = CVPixelBufferGetBaseAddress(pixelBuffer); for (int row = 0; row < height; ++row) { memcpy(static_cast<uint8_t*>(pxdata) + row * CVPixelBufferGetBytesPerRow(pixelBuffer), rgbaImage.ptr(row), width * 8); } CVPixelBufferUnlockBaseAddress(pixelBuffer, 0); pixelBufferBuffer.push_back(pixelBuffer); ... Thanks very much!
Replies
1
Boosts
0
Views
1.3k
Activity
Jun ’24
VideoToolBox on OSX15.0
Hello, Notice that when using h.265 VideoToolBox encoding with Handbrake (latest snapshots) the resulting output is larger than the 264 source. Obviously this was the case for OSX 14.x and below. Is this a known "regression" ? Thanks.
Replies
1
Boosts
0
Views
835
Activity
Jun ’24
AVQT for HDR content
Is AVQT capable of being used to measure encoding quality of PQ or HLG based content beyond SDR? If so, how am I able to leverage it. If not, is there a roadmap for timing to enable this type of tool?
Replies
1
Boosts
0
Views
1.1k
Activity
Jun ’24
iOS to Android H264 encoding issue.
I'm trying to cast the screen from an iOS device to an Android device. I'm leveraging ReplayKit on iOS to capture the screen and VideoToolbox for compressing the captured video data into H.264 format using CMSampleBuffers. Both iOS and Android are configured for H.264 compression and decompression. While screen casting works flawlessly within the same platform (iOS to iOS or Android to Android), I'm encountering an error ("not in avi mode") on the Android receiver when casting from iOS. My research suggests that the underlying container formats for H.264 might differ between iOS and Android. Data transmission over the TCP socket seems to be functioning correctly. My question is: Is there a way to ensure a common container format for H.264 compression and decompression across iOS and Android platforms? Here's a breakdown of the iOS sender details: Device: iPhone 13 mini running iOS 17 Development Environment: Xcode 15 with a minimum deployment target of iOS 16 Screen Capture: ReplayKit for capturing the screen and obtaining CMSampleBuffers Video Compression: VideoToolbox for H.264 compression Compression Properties: kVTCompressionPropertyKey_ConstantBitRate: 6144000 (bitrate) kVTCompressionPropertyKey_ProfileLevel: kVTProfileLevel_H264_Main_AutoLevel (profile and level) kVTCompressionPropertyKey_MaxKeyFrameInterval: 60 (maximum keyframe interval) kVTCompressionPropertyKey_RealTime: true (real-time encoding) kVTCompressionPropertyKey_Quality: 1 (lowest quality) NAL Unit Handling: Custom header is added to NAL units Android Receiver Details: Device: RedMi 7A running Android 10 Video Decoding: MediaCodec API for receiving and decoding the H.264 stream
Replies
0
Boosts
0
Views
1.1k
Activity
May ’24
Question regarding the kVTVideoEncoderList_IsHardwareAccelerated flag
I am a bit confused on whether certain Video Toolbox (VT) encoders support hardware acceleration or not. When I query the list of VT encoders (VTCopyVideoEncoderList(nil,&encoderList)) on an iPhone 14 Pro device, for avc1 (AVC / H.264) and hevc1 (HEVC / H.265) encoders, the kVTVideoEncoderList_IsHardwareAccelerated flag is not there, which -based on the documentation found on the VTVideoEncoderList.h- means that the encoders do not support hardware acceleration: optional. CFBoolean. If present and set to kCFBooleanTrue, indicates that the encoder is hardware accelerated. In fact, no encoders from this list return this flag as true and most of them do not include the flag at all on their dictionaries. On the other hand, when I create a compression session using the VTCompressionSessionCreate() and pass the kVTVideoEncoderSpecification_EnableHardwareAcceleratedVideoEncoder as true in the encoder specifications, after querying the kVTCompressionPropertyKey_UsingHardwareAcceleratedVideoEncoder using the following code, I get a CFBoolean value of true for both H.264 and H.265 encoder. In fact, I get a true value (for both of the aforementioned encoders) even if I don't specify the kVTVideoEncoderSpecification_EnableHardwareAcceleratedVideoEncoder during the creation of the compression session (note here that this flag was introduced in iOS 17.4 ^1). So the question is: Are those encoders actually hardware accelerated on my device, and if so, why isn't that reflected on the VTCopyVideoEncoderList() call?
Replies
3
Boosts
1
Views
1.2k
Activity
May ’24
iOS App crashes while converting CVPixelbuffer to JPG
I have been seeing some crash reports for my app on some devices (not all of them). The crash occurs while converting a CVPixelBuffer captured from Video to a JPG using VTCreateCGImageFromCVPixelBuffer from VideoToolBox. I have not been able to reproduce the crash on local devices, even under adverse memory conditions (many apps running in the background). The field crash reports show that VTCreateCGImageFromCVPixelBuffer does the conversion in another thread and that thread crashed at call to vConvert_420Yp8_CbCr8ToARGB8888_vec. Any suggestions on how to debug this further would be helpful.
Replies
2
Boosts
0
Views
1.3k
Activity
May ’24
Does CVE-2024-1580 affect my app?
I have an image viewing app with support for avif (and avis) images. I'm trying to figure out if the recent bug in CoreMedia (dav1d) affects my app. The apple security update: https://support.apple.com/en-gb/HT214097 The vulnerable code path in dav1d is only reached when c->n_fc > 1 (https://code.videolan.org/videolan/dav1d/-/blob/2b475307dc11be9a1c3cc4358102c76a7f386a51/src/decode.c#L2845), where c is the dav1d context. With some reverse engineering, the way I see CMPhoto calling into VideoToolBox (which internally calls into AV1SW.videodecoder, which is a wrapper around dav1d), the max frame delay is hardcoded to 1 in the dav1d settings which intern means that c->n_fc in dav1d is always 1. The vulnerable code path in dav1d is only reached when c->n_fc > 1 (https://code.videolan.org/videolan/dav1d/-/blob/2b475307dc11be9a1c3cc4358102c76a7f386a51/src/decode.c#L2845). From my understand, this should mean that my app isn't affected. The apple security update however clearly mentions that "Processing an image may lead to arbitrary code execution". Surely I'm missing something?
Replies
0
Boosts
0
Views
927
Activity
Apr ’24
Sometimes when video is encoded using the Video Toolbox Encoder for web live streaming, the decoder output always has a 4 frame latency
I am Using VideoToolbox VTCompressionSession For Encoding The Frame in H264 Format Which I will send through web socket to a browser. The Received Frames Will be Decoded and Output Will be rendered in the Website. Now, when using Some encoders the video is rendered always with four frame latency. How Frame is sent to server : start>------------ f1 ------------ f2 ------------ f3 ------------ f4 ------------- f5 ... How rendering is happening : start>-------------------------------------------------------------------------- f1 ------------ f2 ------------ f3 ------------ f4 ----------- ... This Sometime becomes two frame latency and Sometime it becomes sixteen frame latency so the usability is getting affected. Im using this configuration in videotoolbox's VTCompressionSession: kVTCompressionPropertyKey_AverageBitRate=3MB kVTCompressionPropertyKey_ExpectedFrameRate=24 kVTCompressionPropertyKey_RealTime=true kVTCompressionPropertyKey_ProfileLevel=kVTProfileLevel_H264_High_AutoLevel kVTCompressionPropertyKey_AllowFrameReordering = false kVTCompressionPropertyKey_MaxKeyFrameInterval=1000 With Same Configuration i am able to achieve 1 in - 1 out with com.apple.videotoolbox.videoencoder.h264.gva. This Issue Is replication with Encoder com.apple.videotoolbox.videoencoder.ave.avc Not Sure if its Encoder Specific. I have also seen that there are difference in VUI Parameters between encoded output of both encoders. I want to know if there is something i could do to solve this issue from the Encoder Configuration or another API which is provided by the VideoToolBox to ensure that frames are decoded and rendered at the same time by Decoder. Thanks in Advance....
Replies
0
Boosts
0
Views
855
Activity
Mar ’24
How on earth do you compress an MV-HEVC video file in-app without stripping the video of its 3D properties?
So I've been trying for weeks now to implement a compression mechanism into my app project that compresses MV-HEVC video files in-app without stripping videos of their 3D properties, but every single implementation I have tried has either stripped the encoded MV-HEVC video file of its 3D properties (making the video monoscopic), or has crashed with a fatal error. I've read the Reading multiview 3D video files and Converting side-by-side 3D video to multiview HEVC documentation files, but was unable to myself come out with anything useful. My question therefore is: How do you go about compressing/encoding an MV-HEVC video file in-app whilst preserving the stereoscopic/3D properties of that MV-HEVC video file? Below is the best implementation I was able to come up with (which simply compresses uploaded MV-HEVC videos with an arbitrary bit rate). With this implementation (my compressVideo function), the MV-HEVC files that go through it are compressed fine, but the final result is the loss of that MV-HEVC video file's stereoscopic/3D properties. If anyone could point me in the right direction with anything it would be greatly, greatly appreciated. My current implementation (that strips MV-HEVC videos of their stereoscopic/3D properties): static func compressVideo(sourceUrl: URL, bitrate: Int, completion: @escaping (Result<URL, Error>) -> Void) { let asset = AVAsset(url: sourceUrl) asset.loadTracks(withMediaType: .video) { videoTracks, videoError in guard let videoTrack = videoTracks?.first, videoError == nil else { completion(.failure(videoError ?? NSError(domain: "VideoUploader", code: -1, userInfo: [NSLocalizedDescriptionKey: "Failed to load video track"]))) return } asset.loadTracks(withMediaType: .audio) { audioTracks, audioError in guard let audioTrack = audioTracks?.first, audioError == nil else { completion(.failure(audioError ?? NSError(domain: "VideoUploader", code: -2, userInfo: [NSLocalizedDescriptionKey: "Failed to load audio track"]))) return } let outputUrl = sourceUrl.deletingLastPathComponent().appendingPathComponent(UUID().uuidString).appendingPathExtension("mov") guard let assetReader = try? AVAssetReader(asset: asset), let assetWriter = try? AVAssetWriter(outputURL: outputUrl, fileType: .mov) else { completion(.failure(NSError(domain: "VideoUploader", code: -3, userInfo: [NSLocalizedDescriptionKey: "AssetReader/Writer initialization failed"]))) return } let videoReaderSettings: [String: Any] = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32ARGB] let videoSettings: [String: Any] = [ AVVideoCompressionPropertiesKey: [AVVideoAverageBitRateKey: bitrate], AVVideoCodecKey: AVVideoCodecType.hevc, AVVideoHeightKey: videoTrack.naturalSize.height, AVVideoWidthKey: videoTrack.naturalSize.width ] let assetReaderVideoOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: videoReaderSettings) let assetReaderAudioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: nil) if assetReader.canAdd(assetReaderVideoOutput) { assetReader.add(assetReaderVideoOutput) } else { completion(.failure(NSError(domain: "VideoUploader", code: -4, userInfo: [NSLocalizedDescriptionKey: "Couldn't add video output reader"]))) return } if assetReader.canAdd(assetReaderAudioOutput) { assetReader.add(assetReaderAudioOutput) } else { completion(.failure(NSError(domain: "VideoUploader", code: -5, userInfo: [NSLocalizedDescriptionKey: "Couldn't add audio output reader"]))) return } let audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: nil) let videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoSettings) videoInput.transform = videoTrack.preferredTransform assetWriter.shouldOptimizeForNetworkUse = true assetWriter.add(videoInput) assetWriter.add(audioInput) assetReader.startReading() assetWriter.startWriting() assetWriter.startSession(atSourceTime: CMTime.zero) let videoInputQueue = DispatchQueue(label: "videoQueue") let audioInputQueue = DispatchQueue(label: "audioQueue") videoInput.requestMediaDataWhenReady(on: videoInputQueue) { while videoInput.isReadyForMoreMediaData { if let sample = assetReaderVideoOutput.copyNextSampleBuffer() { videoInput.append(sample) } else { videoInput.markAsFinished() if assetReader.status == .completed { assetWriter.finishWriting { completion(.success(outputUrl)) } } break } } } audioInput.requestMediaDataWhenReady(on: audioInputQueue) { while audioInput.isReadyForMoreMediaData { if let sample = assetReaderAudioOutput.copyNextSampleBuffer() { audioInput.append(sample) } else { audioInput.markAsFinished() break } } } } } }
Replies
9
Boosts
1
Views
2.0k
Activity
Mar ’24
Objective C implementation of Spatial Video(MV-HEVC) Maker
Hi everyone, I need to add spatial video maker in my app which was wrote in objective-c. I found some reference code by swift, can you help me with converting the code to objective -c? let left = CMTaggedBuffer( tags: [.stereoView(.leftEye), .videoLayerID(leftEyeLayerIndex)], pixelBuffer: leftEyeBuffer) let right = CMTaggedBuffer( tags: [.stereoView(.rightEye), .videoLayerID(rightEyeLayerIndex)], pixelBuffer: rightEyeBuffer) let result = adaptor.appendTaggedBuffers( [left, right], withPresentationTime: leftPresentationTs)
Replies
2
Boosts
1
Views
1.1k
Activity
Mar ’24
Decompress Video toolbox video on non-Apple hardware?
Does Video Toolbox’s compression session yield data I can decompress on a different device that doesn’t have Apple’s decompression? i.e. so I can network data to other devices that aren’t necessarily Apple? or is the format proprietary rather than just regular h.264 (for example)? If I can decompress without video toolbox, may I have reference to some examples for how to do this using cross-platform APIs? Maybe FFMPEG has something?
Replies
1
Boosts
0
Views
1.1k
Activity
Feb ’24
Seeking Advice: Best Apple Laptop for Coding?
Hey Developers, I'm on the hunt for a new Apple laptop geared towards coding, and I'd love to tap into your collective wisdom. If you have recommendations or personal experiences with a specific model that excels in the coding realm, please share your insights. Looking for optimal performance and a seamless coding experience. Your input is gold – thanks a bunch!
Replies
2
Boosts
1
Views
6.2k
Activity
Feb ’24
How to decode the space video to the left and right format?
I am using the official website API to decode now, the callback did not trigger.
Replies
1
Boosts
0
Views
702
Activity
Dec ’23
Stereo video HLS
I am trying to set up HLS with MV HEVC. I have an MV HEVC MP4 converted with AVAssetWriter that plays as a "spatial video" in Photos in the simulator. I've used ffmpeg to fragment the video for HLS (sample m3u8 file below). The HLS of the mp4 plays on a VideoMaterial with an AVPlayer in the simulator, but it is hard to determine if the streamed video is stereo. Is there any guidance on confirming that the streamed mp4 video is properly being read as stereo? Additionally, I see that REQ-VIDEO-LAYOUT is required for multivariant HLS. However if there is ONLY stereo video in the playlist is it needed? Are there any other configurations need to make the device read as stereo? Sample m3u8 playlist #EXTM3U #EXT-X-VERSION:3 #EXT-X-TARGETDURATION:13 #EXT-X-MEDIA-SEQUENCE:0 #EXTINF:12.512500, sample_video0.ts #EXTINF:8.341667, sample_video1.ts #EXTINF:12.512500, sample_video2.ts #EXTINF:8.341667, sample_video3.ts #EXTINF:8.341667, sample_video4.ts #EXTINF:12.433222, sample_video5.ts #EXT-X-ENDLIST
Replies
5
Boosts
1
Views
2.4k
Activity
Dec ’23
使用VideoTool库进行h264/h265编码
我使用VideoToolBox库进行编码后再通过NDI发送到网络上,是可以成功再苹果电脑上接收到ndi源屏显示画面的,但是在windows上只能ndi源名称,并没有画面显示。 我想知道是不是使用VideoToolBox库无法在windows上进行正确编码,这个问题需要如何解决
Replies
0
Boosts
0
Views
802
Activity
Dec ’23
iOS17 the encoder sets an average bit rate of 5Mbps. After one hour of live broadcast, the bit rate will suddenly drop to 1Mbps and cannot be restored.
iOS17, the encoder sets an average bit rate of 5Mbps. in the first 25 minutes:the encoding rate is normal. 25 minutes-30minutes:The encoding bit rate will be reduced to 4M but can be restoredl. 30 minutes-70minutes:the encoding rate is normal. 70minutes-late:the bit rate will suddenly drop to 1Mbps and cannot be restored. As shown in the figure below, the yellow line is the frame rate and the green line is the code rate. The Code show as below - (void)_setBitrate:(NSUInteger)bitrate forSession:(VTCompressionSessionRef)session { NSParameterAssert(session && bitrate); OSStatus status = VTSessionSetProperty(session, kVTCompressionPropertyKey_AverageBitRate, (__bridge CFTypeRef)@(bitrate)); if (status != noErr) NSLog(@"set AverageBitRate error"); NSArray *limit = @[@(bitrate * 1.5/8), @(1)]; status = VTSessionSetProperty(session, kVTCompressionPropertyKey_DataRateLimits, (__bridge CFArrayRef)limit); if (status != noErr) NSLog(@"set DataRateLimits error"); } The problem only occurs in iOS17. Does anyone know what the reason is?
Replies
2
Boosts
0
Views
1k
Activity
Dec ’23
Request for Videotoolbox Encoder to Support External Control for Frame Dropping
We are currently working on a real-time, low-latency solution for video conferencing scenarios and have encountered some issues with the current implementation of the encoder. We need a feature enhancement for the Videotoolbox encoder. In our use case, we need to control the encoding quality, which requires setting the maximum encoding QP. However, the kVTCompressionPropertyKey_MaxAllowedFrameQP only takes effect in the kVTVideoEncoderSpecification_EnableLowLatencyRateControl mode. In this mode, when the maximum QP is limited and the bitrate is insufficient, the encoder will drop frames. Our desired scenario is for the encoder to not actively drop frames when the maximum QP is limited. Instead, when the bitrate is insufficient, the encoder should be able to encode the frame with the maximum QP, allowing the frame size to be larger. This would provide a more seamless experience for users in video conferencing situations, where maintaining consistent video quality is crucial. It is worth noting that Android has already implemented this feature in Android 12, which demonstrates the value and feasibility of this enhancement. We kindly request that you consider adding support for external control of frame dropping in the Videotoolbox encoder to accommodate our needs. This enhancement would greatly benefit our project and others that require real-time, low-latency video encoding solutions.
Replies
0
Boosts
0
Views
721
Activity
Nov ’23
Unable to manually parse the HEVC video with alpha format
I wish to parse the bitstream of HEVC video with alpha (specific video format reference WWDC2019: https://developer.apple.com/videos/play/wwdc2019/506). Taking the 'puppets_with_alpha_hevc.mov' file from 'Using HEVC Video with Alpha' as an example, I would first extract the HEVC bitstream, then parse its fields. When it comes to the VPS field, as I reach the vps_extension, I find that the bitstream in 'puppets_with_alpha_hevc.mov' does not conform to the HEVC standard document, preventing further parsing. Besides the 'HEVC Video with Alpha Interoperability Profile.pdf', are there any more detailed documents describing the HEVC video with alpha format? Also, is there anyone who can encode or decode HEVC with alpha videos on systems other than macOS?
Replies
0
Boosts
0
Views
782
Activity
Nov ’23
Linker Error when using VTCompressionSessionEncodeMultiImageFrame in Swift
I'm working on a MV-HEVC transcoder, based on the VTEncoderForTranscoding sample code. In swift the following code snippet generates a linker error on macOS 14.0 and 14.1. let err = VTCompressionSessionEncodeMultiImageFrame(compressionSession, taggedBuffers: taggedBuffers, presentationTimeStamp: pts, duration: .invalid, frameProperties: nil, infoFlagsOut: nil) { (status: OSStatus, infoFlags: VTEncodeInfoFlags, sbuf: CMSampleBuffer?) -> Void in outputHandler(status, infoFlags, sbuf, thisFrameNumber) } Error: ld: Undefined symbols: VideoToolbox.VTCompressionSessionEncodeMultiImageFrame(_: __C.VTCompressionSessionRef, taggedBuffers: [CoreMedia.CMTaggedBuffer], presentationTimeStamp: __C.CMTime, duration: __C.CMTime, frameProperties: __C.CFDictionaryRef?, infoFlagsOut: Swift.UnsafeMutablePointer<__C.VTEncodeInfoFlags>?, outputHandler: (Swift.Int32, __C.VTEncodeInfoFlags, __C.CMSampleBufferRef?) -> ()) -> Swift.Int32, referenced from: (3) suspend resume partial function for VTEncoderForTranscoding_Swift.(compressFrames in _FE7277D5F28D8DABDFC10EA0164D825D)(from: VTEncoderForTranscoding_Swift.VideoSource, options: VTEncoderForTranscoding_Swift.Options, expectedFrameRate: Swift.Float, outputHandler: @Sendable (Swift.Int32, __C.VTEncodeInfoFlags, __C.CMSampleBufferRef?, Swift.Int) -> ()) async throws -> () in VTEncoderForTranscoding.o Using VTCompressionSessionEncodeMultiImageFrameWithOutputHandler in ObjC doesn't trigger a linker error. Anybody knows how to get it to work in Swift?
Replies
0
Boosts
0
Views
849
Activity
Nov ’23