Use Metal to render advanced 3D graphics and perform data-parallel computations on graphics processors.

Post

Replies

Boosts

Views

Activity

CAMetalLayer vs AVSampleBufferDisplayLayer (GPU usage, performance, ...)
I am a VoIP app developer. I am planning to develop a VoIP app on iOS using WebRTC that operates in PiP (Picture-in-Picture) mode. Since MTKView (CAMetalLayer) cannot be used in PiP mode, I am considering using AVSampleBufferDisplayLayer, and I am curious about the performance differences between CAMetalLayer and AVSampleBufferDisplayLayer. As far as I know, CAMetalLayer utilizes the GPU. Does AVSampleBufferDisplayLayer also render using the GPU? If so, will the rendering performance be similar? Based on my tests, there seems to be no difference in CPU usage between the two, which leads me to speculate that AVSampleBufferDisplayLayer also uses the GPU. If both use the GPU and there are no performance differences, is there a significant advantage to using CAMetalLayer? Thank you in advance.
1
0
329
May ’24
SwiftUI full screen animation uses less energy than Metal Game template
I've got a full-screen animation of a bunch of circles filled with gradients, with plenty of (careless) overdraw, plus real-time audio processing driving the animation, plus the overhead of SwiftUI's dependency analysis, and that app uses less energy (on iPhone 13) than the Xcode "Metal Game" template which is a rotating textured cube (a trivial GPU workload). Why is that? How can I investigate further? Does CoreAnimation have access to a compositor fast-path that a Metal app cannot access? Maybe another data point: when I do the same circles animation using SwiftUI's Canvas, the energy use is "Very High" and GPU utilization is also quite high. Eventually the phone's thermal state goes "Serious" and I get a message on the device that "Charging will resume when iPhone returns to normal temperature".
0
4
469
May ’24
How do I properly set tagged color data in MTKView and CIContext?
I have provided a test UIKit app which displays three different images, side by side, each inside a separate MTKView. Each image is tagged with a different color profile:
- Display P3
- uRGB
- Test RGB (from an image supplied in Apple's ImageApp sample)
I set up default values for all color spaces and formats. I then check if the image is tagged and, if so, I override those values with state from the tagged color space.
The variables I am setting:
- “workingColorSpace” in the Metal CIContext, default = sRGB
- “workingFormat” in the Metal CIContext, default = RGBAf
- “outputColorSpace” in the Metal CIContext, default = displayP3
- “colorPixelFormat” in the MTKView, default = bgra8Unorm
- “colorSpace” in a CIRenderDestination that I use in the MTKView delegate draw method. The “colorSpace” default value = CGColorSpaceCreateDeviceRGB()
I also set “pixelFormat” in CIRenderDestination with the MTKView.colorPixelFormat.
If the image is tagged, I override the following values with the tagged colorSpace:
- CIContext.workingColorSpace
- CIContext.outputColorSpace
- CIRenderDestination.colorSpace
If the tagged colorSpace.isWideGamutRGB = true, then I set the CIRenderDestination.colorSpace to extendedSRGB, ignoring the color space in the tagged wide gamut color space, as well as set the colorPixelFormat = bgr10_xr
Results: The above scenario will properly render the DisplayP3 image, and the uRGB image. The “Test RGB” image fails: If I do not override the CIRenderDestination.colorSpace with a value from the tagged image, then the “Test RGB” image succeeds, but the “uRGB” image fails to render properly:
Question: Do I have everything hooked up correctly and, if so, why does one image fail, and the other succeed?
Link to sample project: https://www.dropbox.com/scl/fi/57u2fcrgdvys7jtzykzxt/ColorSpaceTest.zip?rlkey=unjeeiu7mi0wx9wfpylt78nwd&dl=0
2
0
490
Mar ’24
FxPlug outputTexture has wrong usage
After updating to build 4.2.9, I have a weird bug. It keeps crashing, and the message reads: validateRenderPassDescriptor:782: failed assertion `RenderPass Descriptor Validation Texture at colorAttachment[0] has usage (0x01) which doesn't specify MTLTextureUsageRenderTarget (0x04)' This happens when I run in debug mode and try to hook up the Motion template. I found that the output texture is created with usage "MTLTextureUsageShaderRead" only, without "MTLTextureUsageRenderTarget". Is anyone else seeing this problem? I am using FxPlug 4.2.9, Motion 5.7 and Final Cut 10.7.1, running on Sonoma 14.2.1.
0
0
286
Mar ’24
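On the FxPlug assertion above: the usage values in the message are bit flags, so 0x01 means the texture carries only the shader-read bit. Below is a minimal sketch of decoding such a mask. It uses the two flag values quoted in the message plus 0x02 for MTLTextureUsageShaderWrite; the helper names `decode_usage` and `can_be_render_target` are hypothetical and not part of any Apple API.

```python
from enum import IntFlag


class TextureUsage(IntFlag):
    """Bit values mirroring Metal's MTLTextureUsage option set."""
    SHADER_READ = 0x01
    SHADER_WRITE = 0x02
    RENDER_TARGET = 0x04


def decode_usage(mask: int) -> list[str]:
    """Return the names of the flags set in a raw usage mask."""
    return [flag.name for flag in TextureUsage if mask & flag]


def can_be_render_target(mask: int) -> bool:
    """True when the mask includes the render-target bit the validator asks for."""
    return bool(mask & TextureUsage.RENDER_TARGET)


if __name__ == "__main__":
    # 0x01 is the usage from the assertion; 0x05 adds the render-target bit.
    print(decode_usage(0x01), can_be_render_target(0x01))
    print(decode_usage(0x05), can_be_render_target(0x05))
```

This only illustrates how to read the mask; it does not show where FxPlug creates the texture or how to change its descriptor.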
Debug symbols in metallib
Hello, I’ve started testing the Metal Shader Converter to convert my HLSL shaders to metallib directly, and I was wondering if the option ’-frecord-sources’ was supported in any way? Usually I’m compiling my shaders as follows (from Metal):
    xcrun -sdk macosx metal -c -frecord-sources shaders/shaders.metal -o shaders/shaders.air
    xcrun -sdk macosx metallib shaders/shaders.air -o shaders/shaders.metallib
The -frecord-sources option allows me to see the source when debugging and profiling a Metal frame. Now with DXC we have a similar option; I can compile a typical HLSL shader with embedded debug symbols with:
    dxc -T vs_6_0 -E VSMain shaders/triangle.hlsl -Fo shaders/triangle.dxil -Zi -O0 -Qembed_debug
The important options here are ’-Zi’ and ’-Qembed_debug’, as they make sure debug symbols are embedded in the DXIL. It seems that right now Metal Shader Converter doesn’t pass through the DXIL debug information, and I was wondering if it was possible. I’ve looked at all the options in the utility and haven’t seen anything that looked like it. Right now debug symbols in my shaders are a must-have, so I’ll explore other routes to convert my HLSL shaders to Metal (I’ve been testing spir-v cross to do the conversion, I haven’t actually tested the debug symbols yet, I’ll report back later). Thank you for your time!
3
0
1.1k
Jun ’23
Question about per-line shader profile statistics in Xcode
Hi, I am using Xcode frame capture to profile my app's shader, and I have some questions about the per-line shader profile statistics. Please see the two screenshots of my compute shader first. Begin: End: The first image is the head of the shader. The profile shows that the shader entry function takes 72.44% of the time. At the end of the shader, the profile shows that the closing brace '}' takes 60.45%. Here are my questions: How should I interpret the profile data? What is the real performance data of this shader? Why doesn't the shader entry function take 100% of the time? Can someone help me answer these questions? Thanks! Boson
0
0
429
Dec ’23
AVSampleBufferDisplayLayer vs CAMetalLayer for displaying HDR
I have been using MTKView to display CVPixelBuffer from the camera. I use many options to configure the color space of the MTKView/CAMetalLayer that may be needed to tonemap content to the display (CAEDRMetadata, for instance). If, however, I use AVSampleBufferDisplayLayer, there are not many configuration options for color matching. I believe AVSampleBufferDisplayLayer uses pixel buffer attachments to determine the native color space of the input image and does the tone mapping automatically. Does AVSampleBufferDisplayLayer have any limitations compared to MTKView, or can both be used without any compromise on functionality?
0
0
512
Nov ’23
Frames out of order using AVAssetWriter
We are using AVAssetWriter to write videos using both HEVC and H.264 encoding. Occasionally, we get reports of choppy footage in which frames appear out of order when played back on a Mac (QuickTime) or iOS device (stock Photos app). This occurs extremely unpredictably, often not starting until 20+ minutes of filming, but occasionally happening as soon as filming starts. Interestingly, users have reported the issue goes away while editing or viewing on a different platform (e.g. Linux) or in the built-in Google Drive player, but comes back as soon as the video is exported or downloaded again. When this occurs in an HEVC file, converting to H.264 seems to resolve it. I haven't found a similar fix for H.264 files. I suspect an AVAssetWriter encoding issue but haven't been able to uncover the source. Running a stream analyzer on HEVC files with this issue reveals the following error: Short-term reference picture with POC = [some number] seems to have been removed or not correctly decoded. However, running a stream analyzer on H.264 files with the same playback issue seems to show nothing wrong.
At a high level, our video pipeline looks something like this:
1. Grab a sample buffer in captureOutput(_ captureOutput: AVCaptureOutput!, didOutputVideoSampleBuffer sampleBuffer: CMSampleBuffer!)
2. Perform some Metal rendering on that buffer
3. Pass the resulting CVPixelBuffer to the AVAssetWriterInputPixelBufferAdaptor associated with our AVAssetWriter
Example files can be found here: https://drive.google.com/drive/folders/1OjDZ3XaC-ubD5hyDiNvMQGl2NVqZbWnR?usp=sharing This includes a video file suffering this issue, the same file fixed after converting to mp4, and a screen recording of the distorted playback in QuickTime. Can anyone help point me in the right direction to solving this issue? I can provide more details as necessary.
4
2
1.2k
May ’23
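For the AVAssetWriter pipeline above, one inexpensive diagnostic is to log the presentation timestamp of every buffer handed to the pixel buffer adaptor and check offline that the sequence only moves forward. The sketch below assumes the timestamps have already been exported as seconds; `find_out_of_order` is a hypothetical helper, and a clean result would not rule out a problem inside the encoder itself.

```python
def find_out_of_order(timestamps: list[float]) -> list[int]:
    """Return indices whose timestamp is not strictly greater than the largest seen so far."""
    bad = []
    latest = float("-inf")
    for index, pts in enumerate(timestamps):
        if pts <= latest:
            bad.append(index)  # duplicate timestamp or a step backwards
        else:
            latest = pts
    return bad


if __name__ == "__main__":
    # Hypothetical log: the third buffer arrives with an earlier timestamp.
    print(find_out_of_order([0.000, 0.066, 0.033, 0.100]))
```

If the list comes back empty for a recording that still plays back choppy, the input ordering is less likely to be the cause and the investigation can move to the encoder or the container.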
Correctly process HDR in Metal Core Image Kernels (& Metal)
I am trying to carefully process HDR pixel buffers (10-bit YCbCr buffers) from the camera. I have watched all the WWDC videos on this topic but still have some doubts, expressed below. Q. What assumptions are safe to make about sample values in Metal Core Image kernels? Are the sample values received in a Metal Core Image kernel linear or gamma corrected? Or does that depend on the workingColorSpace property, or on the input image that is supplied (through the imageByMatchingToColorSpace() API, etc.)? And what could be the max and min values of these samples in either case? I see that setting workingColorSpace to NSNull() in the context creation options will guarantee receiving the samples as is and normalised to [0-1]. But then it's possible the values are non-linear (gamma corrected), and extracting linear values would involve writing conversion functions in the shader. In short, how do you safely process HDR pixel buffers received from the camera (which are in YCbCr420_10bit, which I believe have gamma correction applied, so Y in YCbCr is actually Y')? Can the AVFoundation team clarify this?
0
0
618
Nov ’23
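To make the "conversion functions in the shader" from the HDR question concrete, here is a minimal sketch of the arithmetic involved, written in Python rather than Metal. It rests on two assumptions the post does not confirm: that the buffers are 10-bit video range (luma codes 64 to 940) and that the transfer function is HLG as defined in ITU-R BT.2100. The function names are hypothetical.

```python
import math

# HLG constants as defined in ITU-R BT.2100.
A = 0.17883277
B = 1.0 - 4.0 * A
C = 0.5 - A * math.log(4.0 * A)


def normalize_luma_10bit_video_range(code: int) -> float:
    """Map a 10-bit video-range luma code (64..940) to a 0..1 nonlinear signal."""
    return (code - 64) / 876.0


def hlg_inverse_oetf(signal: float) -> float:
    """Convert a nonlinear HLG signal in [0, 1] to scene-linear light in [0, 1]."""
    if signal <= 0.5:
        return (signal * signal) / 3.0
    return (math.exp((signal - C) / A) + B) / 12.0


if __name__ == "__main__":
    for code in (64, 502, 940):
        signal = normalize_luma_10bit_video_range(code)
        print(code, round(signal, 4), round(hlg_inverse_oetf(signal), 4))
```

In a real pipeline the inverse transfer function would be applied to each R'G'B' component after the YCbCr matrix, not to Y' directly; luma is used here only to keep the sketch short. If the camera delivers PQ or full-range data instead, both functions would need different constants.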
MPSMatrix wastes time calling getenv() over and over
In Instruments' CPU Profiling tool I've noticed that a significant portion (22.5%) of the CPU-side overhead related to MPS matrix multiplication (GEMM) is in a call to getenv(). Please see attached screenshot. It seems unnecessary to perform this same check over and over, as whatever hack needs this should be able to perform the getenv() only once and cache the result for future use.
2
0
774
Nov ’23
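The "look up once and cache the result" idea from the getenv() post can be sketched in a few lines. This is an illustration in Python of the general pattern only; it says nothing about how MPS is implemented, and the variable name `DEMO_FLAG` is hypothetical.

```python
import os
from functools import lru_cache
from typing import Optional


@lru_cache(maxsize=None)
def cached_env(name: str) -> Optional[str]:
    """Look up an environment variable once and reuse the answer on later calls."""
    return os.environ.get(name)


if __name__ == "__main__":
    os.environ["DEMO_FLAG"] = "1"
    print(cached_env("DEMO_FLAG"))   # first call reads the environment
    print(cached_env("DEMO_FLAG"))   # second call is served from the cache
    print(cached_env.cache_info())
```

The trade-off is that changes made to the environment after the first lookup are no longer seen, which is usually acceptable for debug or feature switches read at startup.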
Is transparency supported in MetalFX?
Hi, Is transparency supported in MetalFX? I have a project that sets a texture to a particular alpha value. It works fine. However, as soon as I enable MetalFX, the transparency stops working. The alpha value is set to 1.0. If transparency is supported in MetalFX, how do I enable it? Thank you
0
1
405
Oct ’23
visionOS and coloring entities using Metal Shaders
I'm experimenting with visionOS and Apple Vision Pro using the Xcode Beta. I'm using Xcode 15.1 Beta and visionOS 1.0 beta 4. I'm currently working on a project where I draw a polygon using a mesh generated from MeshDescriptor/MeshResource and present it in an ImmersiveView. I want to change the color of parts of my 3D rendered polygon, not all of it, and I want to do it dynamically, for example when the user presses a button. I looked into shaders and the CustomMaterial from RealityKit, only to find out that CustomMaterial is not supported on visionOS! Does anyone know how I can color portions of a mesh that is generated from MeshDescriptor and MeshResource?
2
0
976
Oct ’23
Metal Core Image passing sampler arguments
I am trying to use a CIColorKernel or CIBlendKernel with sampler arguments but the program crashes. Here is my shader code which compiles successfully.
    extern "C" float4 wipeLinear(coreimage::sampler t1, coreimage::sampler t2, float time) {
        float2 coord1 = t1.coord();
        float2 coord2 = t2.coord();
        float4 innerRect = t2.extent();
        float minX = innerRect.x + time*innerRect.z;
        float minY = innerRect.y + time*innerRect.w;
        float cropWidth = (1 - time) * innerRect.w;
        float cropHeight = (1 - time) * innerRect.z;
        float4 s1 = t1.sample(coord1);
        float4 s2 = t2.sample(coord2);
        if ( coord1.x > minX && coord1.x < minX + cropWidth && coord1.y > minY && coord1.y <= minY + cropHeight) {
            return s1;
        } else {
            return s2;
        }
    }
And it crashes on initialization.
    class CIWipeRenderer: CIFilter {
        var backgroundImage:CIImage?
        var foregroundImage:CIImage?
        var inputTime: Float = 0.0
        static var kernel:CIColorKernel = { () -> CIColorKernel in
            let url = Bundle.main.url(forResource: "AppCIKernels", withExtension: "ci.metallib")!
            let data = try! Data(contentsOf: url)
            return try! CIColorKernel(functionName: "wipeLinear", fromMetalLibraryData: data) //Crashes here!!!!
        }()
        override var outputImage: CIImage? {
            guard let backgroundImage = backgroundImage else { return nil }
            guard let foregroundImage = foregroundImage else { return nil }
            return CIWipeRenderer.kernel.apply(extent: backgroundImage.extent, arguments: [backgroundImage, foregroundImage, inputTime])
        }
    }
It crashes in the try line with the following error: Fatal error: 'try!' expression unexpectedly raised an error: Foundation._GenericObjCError.nilError
If I replace the kernel code with the following, it works like a charm:
    extern "C" float4 wipeLinear(coreimage::sample_t s1, coreimage::sample_t s2, float time) {
        return mix(s1, s2, time);
    }
1
0
1.3k
Nov ’21
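Separately from the crash, the rectangle test in the wipeLinear shader above can be checked on the CPU. The sketch below mirrors that test in Python under the assumption that the sampler extent is laid out as (origin x, origin y, width, height). Under that assumption the posted shader scales the crop width by the fourth component and the crop height by the third, which looks like a transposition; the sketch uses width for x and height for y. `in_wipe_rect` is a hypothetical name, and this does not address the kernel initialization error.

```python
def in_wipe_rect(x: float, y: float,
                 extent: tuple[float, float, float, float],
                 time: float) -> bool:
    """Mirror the posted rectangle test, with extent = (origin_x, origin_y, width, height)."""
    ex, ey, width, height = extent
    min_x = ex + time * width
    min_y = ey + time * height
    crop_width = (1.0 - time) * width    # posted code uses the 4th component here
    crop_height = (1.0 - time) * height  # posted code uses the 3rd component here
    return min_x < x < min_x + crop_width and min_y < y <= min_y + crop_height


if __name__ == "__main__":
    extent = (0.0, 0.0, 200.0, 100.0)  # hypothetical 200x100 image
    print(in_wipe_rect(150.0, 75.0, extent, 0.5))
    print(in_wipe_rect(50.0, 75.0, extent, 0.5))
```

For square images the two versions behave identically, so the difference would only show up with non-square extents.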
MPS Graph Neural Network Training Produces NaN Loss on Xcode 15.0 beta 8 + iOS 17.0
Hello, I've been working on an app that involves training a neural network model on the iPhone. I've been using the Metal Performance Shaders Graph (MPS Graph) for this purpose. During training, the loss becomes NaN on iOS 17 (21A329). I noticed that the official sample code for Training a Neural Network using MPS Graph (link) works perfectly fine on Xcode 14.3.1 with iOS 16.6.1. However, when I run the same code on Xcode 15.0 beta 8 with iOS 17.0 (21A329), the training process produces a NaN loss in the function updateProgressCubeAndLoss. The official sample code and my own app exhibit the same issue. Has anyone else experienced this issue? Is this a known bug, or is there something specific that needs to be adjusted for iOS 17? Any guidance would be greatly appreciated. Thank you!
1
0
702
Sep ’23
Can I run CatBoost/XGBoost on my GPU(s) on my Mac?
I'm interested in using CatBoost and XGBoost for some machine learning projects on my Mac, and I was wondering if it's possible to run these algorithms on my GPU(s) to speed up training times. I have a Mac with an AMD Radeon Pro 5600M and an Intel UHD Graphics 630 GPUs, and I'm running macOS Ventura 13.2.1. I've read that both CatBoost and XGBoost support GPU acceleration, but I'm not sure if this is possible on my system. Can anyone point me in the right direction for getting started with GPU-accelerated CatBoost/XGBoost on macOS? Are there any specific drivers or tools I need to install, or any other considerations I should be aware of? Thank you.
1
0
2k
Apr ’23
MPSMatrixDecompositionCholesky Status code
Hi, I am trying to extend the PyTorch library. I would like to add an MPS-native Cholesky decomposition. I finally got it working (mostly), but I am struggling to implement the status codes. What I did:
    // init status
    id<MTLBuffer> status = [device newBufferWithLength:sizeof(int) options:MTLResourceStorageModeShared];
    if (status) {
        int* statusPtr = (int*)[status contents];
        *statusPtr = 42; // Set the initial content to 42
        NSLog(@"Status Value: %d", *statusPtr);
    } else {
        NSLog(@"Failed to allocate status buffer");
    }
    ...
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> commandBuffer) {
        // Your completion code here
        int* statusPtr = (int*)[status contents];
        int statusVal = *statusPtr;
        NSLog(@"Status Value: %d", statusVal);
        // Update the 'info' tensor here based on statusVal
        // ...
    }];
    for (const auto i : c10::irange(batchSize)) {
        ...
        [filter encodeToCommandBuffer:commandBuffer sourceMatrix:sourceMatrix resultMatrix:solutionMatrix status:status];
    }
(full code here: https://github.com/pytorch/pytorch/blob/ab6a550f35be0fdbb58b06ff8bfda1ab0cc236d0/aten/src/ATen/native/mps/operations/LinearAlgebra.mm)
But this code prints the following when given a tensor that is not positive definite:
    2023-09-02 19:06:24.167 python[11777:2982717] Status Value: 42
    2023-09-02 19:06:24.182 python[11777:2982778] Status Value: 0
    initial tensor: tensor([[-0.0516, 0.7090, 0.9474], [ 0.8520, 0.3647, -1.5575], [ 0.5346, -0.3149, 1.9950]], device='mps:0')
    L: tensor([[-0.0516, 0.0000, 0.0000], [ 0.8520, -0.3612, 0.0000], [ 0.5346, -0.3149, 1.2689]], device='mps:0')
What am I doing wrong? Why do I get a 0 (success) status even though the matrix is not positive definite? Thank you in advance!
0
0
548
Sep ’23
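As a CPU-side reference for the Cholesky question above, a textbook factorization makes it easy to see which status a given input should produce. The sketch below is plain Python and reads only the lower triangle of the input; the status constants are hypothetical values chosen for this example and are not the MPS enum. With the tensor from the post, the first diagonal entry is negative, so the factorization stops at the first pivot (the posted tensor is also not symmetric).

```python
import math

SUCCESS = 0                  # hypothetical status values for this sketch only
NOT_POSITIVE_DEFINITE = -1


def cholesky_with_status(a: list[list[float]]) -> tuple[list[list[float]], int]:
    """Lower-triangular Cholesky factor of a, reading only its lower triangle."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal pivot: a[j][j] minus the squares already placed in row j.
        pivot = a[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            return lower, NOT_POSITIVE_DEFINITE
        lower[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            dot = sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = (a[i][j] - dot) / lower[j][j]
    return lower, SUCCESS


# The tensor printed in the post.
posted = [[-0.0516, 0.7090, 0.9474],
          [0.8520, 0.3647, -1.5575],
          [0.5346, -0.3149, 1.9950]]

if __name__ == "__main__":
    print(cholesky_with_status(posted)[1])
    print(cholesky_with_status([[4.0, 2.0], [2.0, 3.0]]))
```

Comparing the MPS status buffer against a reference like this for a few known inputs can help separate a kernel issue from a buffer readback or synchronization issue, but it says nothing about how MPSMatrixDecompositionCholesky reports status internally.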
Meshlet
Hello. I'm working with Metal in Apple Vision Pro, and I've assumed that I can use Mesh shaders to work with Meshlets. But when creating the RenderPipeline, I get the following error message: "device does not support mesh shaders". The test is on the simulator, and my question is: Will Apple Vision Pro support Mesh shaders on physical devices? Thanks.
1
0
609
Aug ’23