I have the following setup:
An AVCaptureSession with an AVCaptureVideoDataOutput delivers video frames from the camera.
OpenGL textures are created from the CVPixelBuffers using a CVOpenGLESTextureCache.
Some OpenGL-based image processing is performed on the frames (with many intermediate steps) in a separate queue.
The final texture of the processing pipeline is rendered into a CAEAGLLayer on the main thread (with proper context and share group handling).
This worked very well up to iOS 13. Now in iOS 14 the AVCaptureVideoDataOutput suddenly stops delivering new frames (to the delegate) after ~4 sec. of capture—without any warning or log message.
Some observations:
The AVCaptureSession is still running (isRunning is true, isInterrupted is false).
All connections between the camera device and output are still there and active.
The capture indicator (green circle in the status bar, new in iOS 14) is still there.
The output's delegate does not report any frame drops.
When I perform an action that causes the session to be re-configured (like switching to the front camera), the output will start delivering frames again for ~4 sec. and then stop again.
When I don't process and display the frames, the output continues to deliver frames without interruption.
I've been debugging this for a while now and I'm pretty clueless. Any hints or ideas on what might cause this behavior now in iOS 14 are much appreciated! 🙂
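One thing I'm currently trying to rule out (this is an assumption, not a confirmed cause): that the processing pipeline holds on to too many CVPixelBuffers at once, exhausting the output's buffer pool so that delivery silently stalls. A minimal sketch of the pattern I'm testing, which drops frames instead of retaining them while the pipeline is busy (`FrameProcessor` and the queue name are placeholders from my project):

```swift
import AVFoundation

final class FrameProcessor: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {

    private let processingQueue = DispatchQueue(label: "frame-processing")
    // Allow at most one frame in flight; further frames are dropped instead
    // of being retained while the GPU pipeline is still busy.
    private let inFlight = DispatchSemaphore(value: 1)

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard inFlight.wait(timeout: .now()) == .success,
              let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else {
            return // pipeline busy: let this buffer return to the pool immediately
        }
        processingQueue.async {
            defer { self.inFlight.signal() }
            self.process(pixelBuffer) // must not keep a reference after returning
        }
    }

    private func process(_ buffer: CVPixelBuffer) {
        // OpenGL texture upload and processing would happen here.
    }
}
```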
Post not yet marked as solved
Extending the PencilKit APIs to expose the inner data structures was a great move! And for apps like the "Handwriting Tutor" sample app this is enough.
However, I'd love to implement a custom drawing engine based on PKDrawing since it provides a lot of functionality out of the box (user interaction through PKCanvasView, de-/serialization, spline interpolation). But for a custom renderer, two key parts are missing:
- Custom inks (FB8261616), so we can define custom brushes that render differently than the three system styles.
- Detecting changes while the user is drawing (FB8261554); otherwise we can't draw their current stroke on screen.
I know this was mentioned here before, but I wanted to emphasize that those two features would enable us to implement a full custom render engine based on PencilKit.
Thanks for considering!
Post not yet marked as solved
Some of the filters that can be created using the CIFilterBuiltins extensions cause runtime exceptions when assigning an inputImage to them:
NSInvalidArgumentException: "-[CIComicEffect setInputImage:]: unrecognized selector sent to instance ..."
I tried a few and found this to be the case for CIFilter.comicEffect(), CIFilter.cmykHalftone(), and CIFilter.pointillize() (probably more). Other filters like CIFilter.gaussianBlur() work fine.
This happens in Xcode 11.5 and Xcode 12 beta 2 on iOS and macOS.
I already filed feedback for this (FB8013603).
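Until this is fixed, a possible workaround (a sketch; the crash only affects the generated property setter, so setting the value via key-value coding seems to sidestep it in my tests) looks like this:

```swift
import CoreImage
import CoreImage.CIFilterBuiltins

let sourceImage = CIImage(color: .red)
    .cropped(to: CGRect(x: 0, y: 0, width: 100, height: 100))

let filter = CIFilter.comicEffect()
// Instead of `filter.inputImage = sourceImage` (which calls the missing
// setInputImage: selector and crashes), set the image via KVC, which these
// filters still support.
filter.setValue(sourceImage, forKey: kCIInputImageKey)
let output = filter.outputImage
```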
Post marked as Apple Recommended
In his talk "Build Metal-based Core Image kernels with Xcode", David presents the build phases necessary to compile Core Image Metal files into Metal libraries that can be used to instantiate CIKernels.
There is a 1-to-1 mapping between a .ci.metal file and a .ci.metallib file. I also found that the Metal linker doesn't allow linking more than one .air file into one library when building for Core Image.
This works fine until I want to have some common code (such as math functions) extracted into another file to be used by multiple kernels. As soon as I have two color kernels (that get concatenated during filter execution) that use the same shared functions, the runtime Metal compiler crashes (I assume because of duplicate symbols in the merged libraries).
Is there a good way to extract common functionality to be usable by multiple kernels in a pipeline?
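One workaround I'm experimenting with (a sketch under the assumption that the crash is indeed caused by duplicate external symbols, not an officially documented solution): putting the shared code into a header and giving the functions internal linkage with `static inline`, so each .air file carries its own private copy. The file and function names here are hypothetical:

```metal
// Common.h (hypothetical shared header, included by multiple .ci.metal files)
#ifndef COMMON_H
#define COMMON_H
// `static inline` gives the function internal linkage: every .air file that
// includes this header gets its own private copy, so the runtime linker
// should not encounter duplicate external symbols when libraries are merged.
static inline float luma(float3 c) {
    return dot(c, float3(0.2126, 0.7152, 0.0722));
}
#endif

// Kernel.ci.metal
#include <CoreImage/CoreImage.h>
#include "Common.h"
using namespace metal;

extern "C" float4 desaturate(coreimage::sample_t s) {
    float y = luma(s.rgb);
    return float4(y, y, y, s.a);
}
```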
Post not yet marked as solved
From the iOS 13 release notes:
"Metal CIKernel instances support arguments with arbitrarily structured data."
How does this work? Is there any example code for this?
So far I was only able to pass float-typed literals, CIVectors, NSNumbers, CIImages, and CISamplers into kernels as arguments when calling apply.
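For reference, this is the argument passing that does work for me (a sketch; "MyKernels" and "myKernel" are placeholders for an actual compiled Core Image Metal library and function):

```swift
import CoreImage

func applyKernel(to image: CIImage) throws -> CIImage? {
    let url = Bundle.main.url(forResource: "MyKernels", withExtension: "ci.metallib")!
    let kernel = try CIKernel(functionName: "myKernel",
                              fromMetalLibraryData: try Data(contentsOf: url))
    // Scalars, NSNumbers, CIVectors, and CIImages are accepted as arguments;
    // I haven't found the shape that "arbitrarily structured data" expects.
    return kernel.apply(extent: image.extent,
                        roiCallback: { _, rect in rect },
                        arguments: [image, CIVector(x: 1, y: 0), NSNumber(value: 0.5)])
}
```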
I set up my AVCaptureSession for photo capture with depth data. In my AVCapturePhotoCaptureDelegate I get the AVCapturePhoto of the capture that contains the depth data.
I call fileDataRepresentation() on it and later use a PHAssetCreationRequest to save the image (including the depth data) to a new asset in Photos.
When loading the image and its depth data again later, the depth data seems compressed: I observe heavy quantization of the values.
Is there a way to avoid this compression? Do I need to use specific settings or even a different API for exporting the image?
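For completeness, this is how I inspect the data on the read side, converting to 32-bit float first to rule out precision loss in my own processing (a sketch; the quantization persists, so it seems to happen when the asset is written):

```swift
import AVFoundation

// `photo` is the AVCapturePhoto received in the delegate callback.
func inspectDepth(of photo: AVCapturePhoto) {
    guard let depth = photo.depthData else { return }
    // Convert to 32-bit float depth before inspecting the values, so any
    // quantization seen here cannot come from my own read path.
    let depth32 = depth.converting(toDepthDataType: kCVPixelFormatType_DepthFloat32)
    let map = depth32.depthDataMap // CVPixelBuffer containing Float32 values
    _ = map
}
```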
Post not yet marked as solved
In his talk, David mentioned the updated documentation for built-in Core Image filters, which got me very excited. However, I was not able to find it online or in Xcode.
I just found out that it's only visible when switching the language to Objective-C (for instance here - https://developer.apple.com/documentation/coreimage/cifilter/3228331-gaussianblurfilter?language=objc). From the screenshot in the presentation it looks like it should be available for Swift as well, though.
It also seems that very few filters are documented yet.
Is support for Swift and more documentation coming before release? That would be very helpful!
Is it possible to set up an AVCaptureSession in a way that it will deliver 32-bit depth data (instead of 16-bit) during a photo capture?
I configured the AVCapturePhotoOutput and the AVCapturePhotoSettings to deliver depth data. And it works: my delegate receives an AVDepthData block… containing 16-bit depth data.
I tried setting the AVCaptureDevice's activeDepthDataFormat to a 32-bit format, but the format of the delivered AVDepthData is still only 16-bit—regardless of which format I set on the device.
For video capture using an AVCaptureDepthDataOutput this seems to work, just not for an AVCapturePhotoOutput.
Any hints are appreciated. 🙂
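For reference, this is a sketch of the configuration I described above (selecting a 32-bit depth format on the device), which works for the depth data output but not for photo capture in my tests:

```swift
import AVFoundation

// Pick a 32-bit float depth format from the formats supported by the
// device's active format and make it the active depth data format.
func selectFloat32Depth(on device: AVCaptureDevice) throws {
    let float32Formats = device.activeFormat.supportedDepthDataFormats.filter {
        CMFormatDescriptionGetMediaSubType($0.formatDescription)
            == kCVPixelFormatType_DepthFloat32
    }
    guard let format = float32Formats.first else { return }
    try device.lockForConfiguration()
    device.activeDepthDataFormat = format
    device.unlockForConfiguration()
}
```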
Post not yet marked as solved
What is the correct way to set up Core Image for processing (and preserving) wide gamut images?
I understand that there are four options for the workingColorSpace:
displayP3, extendedLinearDisplayP3, extendedSRGB, and extendedLinearSRGB.
While I understand the implications of all of them (linear vs. sRGB gamma curve and Display P3 vs. sRGB primaries), I don't know which is recommended or best practice to use. Also, might there be compatibility issues with the built-in CI filters? Do they make assumptions about the working color space?
Further, what's the recommended color space to use for the rendering destination? I assume Display P3 since it's also the color space of photos taken with the iPhone camera...
Considering the workingFormat: While I understand that it makes sense to use a 16-bit float type format (RGBAh) for extended range, it also seems very costly.
Would it be somehow possible (and advisable) to set up the CIContext to use an 8-bit format while still preserving wide gamut?
Are there differences or special considerations for the different platforms for both, color space and format?
(Sorry for the many questions, but they seem all related…)
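To make the question concrete, this is one candidate configuration (a sketch; whether this is actually best practice is exactly what I'm asking): extended linear sRGB as working space, half-float working format, and Display P3 for the render destination.

```swift
import CoreImage

let context = CIContext(options: [
    .workingColorSpace: CGColorSpace(name: CGColorSpace.extendedLinearSRGB)!,
    .workingFormat: NSNumber(value: CIFormat.RGBAh.rawValue)
])

// Rendering into a Display P3 destination, e.g.:
// context.createCGImage(image, from: image.extent, format: .RGBAh,
//                       colorSpace: CGColorSpace(name: CGColorSpace.displayP3)!)
```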
Post not yet marked as solved
When setting up a CIContext, one can specify the workingColorSpace. The color space also specifies which gamma curve is used (usually sRGB or linear).
When not explicitly setting a color space, Core Image uses a linear curve. It also says this in the (pretty outdated) Core Image Programming Guide - https://developer.apple.com/library/archive/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_advanced_concepts/ci.advanced_concepts.html#//apple_ref/doc/uid/TP30001185-CH9-SW14:
"By default, Core Image assumes that processing nodes are 128 bits-per-pixel, linear light, premultiplied RGBA floating-point values that use the GenericRGB color space."
Now I'm wondering if this makes sense in most scenarios.
For instance, if I blur a checkerboard pattern with a CIGaussianBlur filter using a default CIContext, I get a different result than when using a non-linear sRGB color space. See here - https://www.icloud.com/keynote/0FLvnwEPx-dkn95dMorENGa0w#Presentation.
White gets clearly more weight than black with linear gamma. Which makes sense, I suppose. But I find that the non-linear (sRGB) result looks "more correct".
What are best practices here? When should the gamma curve be a consideration?
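The comparison I described can be reproduced with a few lines (a sketch; the radius and image size are arbitrary):

```swift
import CoreImage

// The same blur, rendered once with the default (linear) working space and
// once with an sRGB working space, yields visibly different results.
let checkerboard = CIFilter(name: "CICheckerboardGenerator")!
    .outputImage!
    .cropped(to: CGRect(x: 0, y: 0, width: 200, height: 200))
let blurred = checkerboard.applyingFilter("CIGaussianBlur",
                                          parameters: [kCIInputRadiusKey: 10])

let linearContext = CIContext() // default: linear working color space
let srgbContext = CIContext(options: [
    .workingColorSpace: CGColorSpace(name: CGColorSpace.sRGB)!
])
// Render `blurred` with each context and compare the outputs.
```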
Post not yet marked as solved
Sorry for cross-posting, but I'm curious to hear if one of you has input on this: https://stackoverflow.com/questions/57266811/understanding-output-extend-of-convolution-kernels-when-blending
Thanks!
Post not yet marked as solved
I'm writing custom Core Image filters and I'm having a hard time really understanding the extent parameter of CIKernel's apply method. In all documentation and WWDC talks I found so far it's described as the "domain of definition of the kernel", i.e. the area for which the kernel produces meaningful, non-zero results.

From that definition I would assume that the extent of the output of a convolution kernel is the same as the extent of the input image, because a convolution always combines multiple input values into one output value. But in the examples I found, and from observing the behavior of built-in kernels such as CIGaussianBlur, the output extent is always larger than the input (depending on the size of the convolution kernel).

I don't understand why. Why should the kernel produce results for pixels that lie outside of the original input domain?
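The behavior in question can be observed directly (a sketch; the radius and image size are arbitrary):

```swift
import CoreImage

// The built-in blur reports an output extent larger than the input extent.
let input = CIImage(color: .white)
    .cropped(to: CGRect(x: 0, y: 0, width: 100, height: 100))
let blurred = input.applyingFilter("CIGaussianBlur",
                                   parameters: [kCIInputRadiusKey: 10])
print(input.extent)   // 100×100 at the origin
print(blurred.extent) // grows beyond the input extent on every side
```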
Post not yet marked as solved
I’m currently writing some custom Core Image filters using Metal. For the sake of structure I want to put the different kernels into different .metal files with some common includes, like you would do with “normal” source files.

However, when the metallib tool bundles the different .air files created by the Metal compiler into one .metallib file, only the kernel functions defined in the first input .air file given to metallib are visible. Functions from the other .air files don’t seem to be included. What’s the reason for this? I thought (as is the default compilation behavior for Metal files) all Metal sources get compiled into one library that is then used by every custom CIFilter class to instantiate its internal CIKernel with the function it needs.

I now ended up compiling a .metallib file for each custom filter with custom build rules and copying all of them into my framework in a custom build phase. This doesn’t seem to be the intended way…
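The loading side that goes with this per-filter setup looks roughly like this (a sketch; the function is a helper from my project, not an Apple-prescribed pattern):

```swift
import CoreImage

// Each custom filter loads its kernel function from its own .ci.metallib
// resource, since the merged-library approach doesn't work as expected.
func makeKernel(named name: String, library: String) throws -> CIKernel {
    let url = Bundle.main.url(forResource: library, withExtension: "ci.metallib")!
    let data = try Data(contentsOf: url)
    return try CIKernel(functionName: name, fromMetalLibraryData: data)
}
```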
Post not yet marked as solved
I'm trying to integrate my neural style transfer models into Core Image using the new CICoreMLModelFilter. It is working; however, I noticed a few problems with the filter:
- Aside from Session 719 of this year's WWDC, it's not mentioned at all in any piece of documentation.
- It leaks a lot of memory with every call to outputImage. I dug a little deeper into the memory graph and found that a new CIPredictionModel, along with a heavy IOSurface and various other objects, is created with each call and never released.
- It just uses scale-to-fit on the input image to match the MLModel's input size—not the smart scaling, cropping, and resizing that Vision does when working with MLModels.
- It can't handle flexible model input sizes. Even when specifying allowed ranges for model input dimensions, the filter will always scale the input to the model's designated input size.

As an alternative I wrote my own CIImageProcessorKernel with just a few lines of code that wraps a Core ML model and does the same without the issues mentioned above. I wish Apple would provide any documentation on how to use CICoreMLModelFilter properly.
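The alternative I mentioned looks roughly like this (heavily simplified; the model invocation and pixel copy are omitted, and the class name is a placeholder from my project):

```swift
import CoreImage
import CoreML

// A CIImageProcessorKernel that wraps a Core ML style transfer model:
// Core Image hands us input and output pixel buffers, and we run the model
// in between.
final class StyleTransferProcessor: CIImageProcessorKernel {

    override class var outputFormat: CIFormat { .BGRA8 }

    override class func process(with inputs: [CIImageProcessorInput]?,
                                arguments: [String: Any]?,
                                output: CIImageProcessorOutput) throws {
        guard let inputBuffer = inputs?.first?.pixelBuffer,
              let outputBuffer = output.pixelBuffer else { return }
        // Run the Core ML model on `inputBuffer` and write the stylized
        // result into `outputBuffer` (model call and copy omitted here).
        _ = (inputBuffer, outputBuffer)
    }
}
```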
Post not yet marked as solved
I'm using Core ML models for image style transfer. An initialized model takes ~60 MB of memory on an iPhone X in iOS 12. However, the same model loaded on an iPhone Xs (Max) consumes more than 700 MB of RAM.

In Instruments I can see that the runtime allocates 38 IOSurfaces with a memory footprint of up to 54 MB each, alongside numerous other Core ML (Espresso) related objects. Those are not there on the iPhone X. My guess is that the Core ML runtime does something different in order to utilize the power of the A12. However, my app crashes due to the memory pressure.

I already tried to convert my models again with the newest version of coremltools, but the results are identical. Did I miss something? Thanks in advance!
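One diagnostic I'm considering (an assumption on my part, not a confirmed explanation): forcing CPU-only execution to see whether the A12-specific GPU/Neural Engine path is responsible for the extra IOSurface allocations. `compiledModelURL` is a placeholder for the URL of my compiled model.

```swift
import CoreML

// Restrict Core ML to the CPU; if the memory footprint drops back to
// iPhone X levels, the device-specific runtime path is the likely cause.
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: compiledModelURL, configuration: config)
```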