Core Video

RSS for tag

Process digital video using a pipeline-based API and support for both Metal and OpenGL using Core Video.

Posts under Core Video tag

3 Posts

Post

Replies

Boosts

Views

Activity

Working with kCVPixelFormatType_96VersatileBayerPacked12
Whilst AVCaptureSession is setup to capture ProRes RAW video, is it possible to get video pixel data which can read and processed, such as using CIImage(cvPixelBuffer: ) AVCaptureVideoDataOutput outputs ProRes RAW in kCVPixelFormatType_96VersatileBayerPacked12 pixel format. Is there a provided way to debayer this pixel format into something more usable?
0
0
47
4d
Core Image for depth maps & segmentation masks: numeric fidelity issues when rendering CIImage to CVPixelBuffer (looking for Architecture suggestions)
Hello All, I’m working on a computer-vision–heavy iOS application that uses the camera, LiDAR depth maps, and semantic segmentation to reason about the environment (object identification, localization and measurement - not just visualization). Current architecture I initially built the image pipeline around CIImage as a unifying abstraction. It seemed like a good idea because: CIImage integrates cleanly with Vision, ARKit, AVFoundation, Metal, Core Graphics, etc. It provides a rich set of out-of-the-box transforms and filters. It is immutable and thread-safe, which significantly simplified concurrency in a multi-queue pipeline. The LiDAR depth maps, semantic segmentation masks, etc. were treated as CIImages, with conversion to CVPixelBuffer or MTLTexture only at the edges when required. Problem I’ve run into cases where Core Image transformations do not preserve numeric fidelity for non-visual data. Example: Rendering a CIImage-backed segmentation mask into a larger CVPixelBuffer can cause label values to change in predictable but incorrect ways. This occurs even when: using nearest-neighbor sampling disabling color management (workingColorSpace / outputColorSpace = NSNull) applying identity or simple affine transforms I’ve confirmed via controlled tests that: Metal → CVPixelBuffer paths preserve values correctly CIImage → CVPixelBuffer paths can introduce value changes when resampling or expanding the render target This makes CIImage unsafe as a source of numeric truth for segmentation masks and depth-based logic, even though it works well for visualization, and I should have realized this much sooner. Direction I’m considering I’m now considering refactoring toward more intent-based abstractions instead of a single image type, for example: Visual images: CIImage (camera frames, overlays, debugging, UI) Scalar fields: depth / confidence maps backed by CVPixelBuffer + Metal Label maps: segmentation masks backed by integer-preserving buffers (no interpolation, no transforms) In this model, CIImage would still be used extensively — but primarily for visualization and perceptual processing, not as the container for numerically sensitive data. Thread safety concern One of the original advantages of CIImage was that it is thread-safe by design, and that was my biggest incentive. For CVPixelBuffer / MTLTexture–backed data, I’m considering enforcing thread safety explicitly via: Swift Concurrency (actor-owned data, explicit ownership) Questions For those may have experience with CV / AR / imaging-heavy iOS apps, I was hoping to know the following: Is this separation of image intent (visual vs numeric vs categorical) a reasonable architectural direction? Do you generally keep CIImage at the heart of your pipeline, or push it to the edges (visualization only)? How do you manage thread safety and ownership when working heavily with CVPixelBuffer and Metal? Using actor-based abstractions, GCD, or adhoc? Are there any best practices or gotchas around using Core Image with depth maps or segmentation masks that I should be aware of? I’d really appreciate any guidance or experience-based advice. I suspect I’ve hit a boundary of Core Image’s design, and I’m trying to refactor in a way that doesn't involve too much immediate tech debt, remains robust and maintainable long-term. Thank you in advance!
2
0
490
Feb ’26
Video in "Made for iPad" apps on macOS
I'm relatively new to Swift development (and native iOS development for that matter) I've got an iOS app that uses the iPhone / iPad built in cameras, and am looking to make this more compatible with macOS. Using the normal AVCaptureDevice.DiscoverySession I seem to get the iPhone Continuity Camera and the in-built MacBook Pro camera but I don't see other input devices that I see in QuickTime Player (for example) such as connected external cameras or Virtual Inputs provided by NDI Virtual Input and OBS. Is there a way to see access these without a specific Mac build (as the rest of the functionality works great, and I'd rather not diverge the codebase too much as it's easier to update one app than two!
0
0
322
Feb ’26
Working with kCVPixelFormatType_96VersatileBayerPacked12
Whilst AVCaptureSession is setup to capture ProRes RAW video, is it possible to get video pixel data which can read and processed, such as using CIImage(cvPixelBuffer: ) AVCaptureVideoDataOutput outputs ProRes RAW in kCVPixelFormatType_96VersatileBayerPacked12 pixel format. Is there a provided way to debayer this pixel format into something more usable?
Replies
0
Boosts
0
Views
47
Activity
4d
Core Image for depth maps & segmentation masks: numeric fidelity issues when rendering CIImage to CVPixelBuffer (looking for Architecture suggestions)
Hello All, I’m working on a computer-vision–heavy iOS application that uses the camera, LiDAR depth maps, and semantic segmentation to reason about the environment (object identification, localization and measurement - not just visualization). Current architecture I initially built the image pipeline around CIImage as a unifying abstraction. It seemed like a good idea because: CIImage integrates cleanly with Vision, ARKit, AVFoundation, Metal, Core Graphics, etc. It provides a rich set of out-of-the-box transforms and filters. It is immutable and thread-safe, which significantly simplified concurrency in a multi-queue pipeline. The LiDAR depth maps, semantic segmentation masks, etc. were treated as CIImages, with conversion to CVPixelBuffer or MTLTexture only at the edges when required. Problem I’ve run into cases where Core Image transformations do not preserve numeric fidelity for non-visual data. Example: Rendering a CIImage-backed segmentation mask into a larger CVPixelBuffer can cause label values to change in predictable but incorrect ways. This occurs even when: using nearest-neighbor sampling disabling color management (workingColorSpace / outputColorSpace = NSNull) applying identity or simple affine transforms I’ve confirmed via controlled tests that: Metal → CVPixelBuffer paths preserve values correctly CIImage → CVPixelBuffer paths can introduce value changes when resampling or expanding the render target This makes CIImage unsafe as a source of numeric truth for segmentation masks and depth-based logic, even though it works well for visualization, and I should have realized this much sooner. Direction I’m considering I’m now considering refactoring toward more intent-based abstractions instead of a single image type, for example: Visual images: CIImage (camera frames, overlays, debugging, UI) Scalar fields: depth / confidence maps backed by CVPixelBuffer + Metal Label maps: segmentation masks backed by integer-preserving buffers (no interpolation, no transforms) In this model, CIImage would still be used extensively — but primarily for visualization and perceptual processing, not as the container for numerically sensitive data. Thread safety concern One of the original advantages of CIImage was that it is thread-safe by design, and that was my biggest incentive. For CVPixelBuffer / MTLTexture–backed data, I’m considering enforcing thread safety explicitly via: Swift Concurrency (actor-owned data, explicit ownership) Questions For those may have experience with CV / AR / imaging-heavy iOS apps, I was hoping to know the following: Is this separation of image intent (visual vs numeric vs categorical) a reasonable architectural direction? Do you generally keep CIImage at the heart of your pipeline, or push it to the edges (visualization only)? How do you manage thread safety and ownership when working heavily with CVPixelBuffer and Metal? Using actor-based abstractions, GCD, or adhoc? Are there any best practices or gotchas around using Core Image with depth maps or segmentation masks that I should be aware of? I’d really appreciate any guidance or experience-based advice. I suspect I’ve hit a boundary of Core Image’s design, and I’m trying to refactor in a way that doesn't involve too much immediate tech debt, remains robust and maintainable long-term. Thank you in advance!
Replies
2
Boosts
0
Views
490
Activity
Feb ’26
Video in "Made for iPad" apps on macOS
I'm relatively new to Swift development (and native iOS development for that matter) I've got an iOS app that uses the iPhone / iPad built in cameras, and am looking to make this more compatible with macOS. Using the normal AVCaptureDevice.DiscoverySession I seem to get the iPhone Continuity Camera and the in-built MacBook Pro camera but I don't see other input devices that I see in QuickTime Player (for example) such as connected external cameras or Virtual Inputs provided by NDI Virtual Input and OBS. Is there a way to see access these without a specific Mac build (as the rest of the functionality works great, and I'd rather not diverge the codebase too much as it's easier to update one app than two!
Replies
0
Boosts
0
Views
322
Activity
Feb ’26