Use Metal to render advanced 3D graphics and perform data-parallel computations on graphics processors.

Metal Documentation

Posts under Metal subtopic

Post

Replies

Boosts

Views

Activity

Can't profile Metal on Apple TV
Hi, I can capture a frame on the Apple TV, but when I try to profile the capture for GPU timing information, I get an "Abort Trap 6" error, with the following in the report:
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Triggered by Thread: 7
Application Specific Information:
abort() called
Last Exception Backtrace:
0 CoreFoundation 0x18c0a99d0 __exceptionPreprocess + 160
1 libobjc.A.dylib 0x18b596d24 objc_exception_throw + 71
2 CoreFoundation 0x18bfa7308 -[__NSArrayM insertObject:atIndex:] + 1239
3 MTLReplayController 0x101f5d148 DYMTLReplayFrameProfiler_loadAnalysis + 1140
4 MTLReplayController 0x101e97f90 GTMTLReplayClient_collectGPUShaderTimelineData + 224
5 MTLReplayController 0x101e81794 __30-[GTMTLReplayService profile:]_block_invoke_4 + 288
6 Foundation 0x18eb6072c __NSOPERATION_IS_INVOKING_MAIN__ + 11
7 Foundation 0x18eb5cc1c -[NSOperation start] + 623
8 Foundation 0x18eb60edc __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 11
9 Foundation 0x18eb60bc4 __NSOQSchedule_f + 167
10 libdispatch.dylib 0x18b8d6a84 _dispatch_block_async_invoke2 + 103
11 libdispatch.dylib 0x18b8c9420 _dispatch_client_callout + 15
12 libdispatch.dylib 0x18b8cc5d0 _dispatch_continuation_pop + 531
13 libdispatch.dylib 0x18b8cbcd4 _dispatch_async_redirect_invoke + 635
14 libdispatch.dylib 0x18b8d9224 _dispatch_root_queue_drain + 335
15 libdispatch.dylib 0x18b8d9a08 _dispatch_worker_thread2 + 163
16 libsystem_pthread.dylib 0x18b6e652c _pthread_wqthread + 223
17 libsystem_pthread.dylib 0x18b6ed8d0 start_wqthread + 7
This is with Xcode 16.0 and an Apple TV 4K (4th Gen) running tvOS 18. Does anyone know the cause of this error, and is there a solution for it? Thank you very much, Kai
0
0
335
Oct ’24
Getting stuck in first frame of renderLoop.
First I get this:
ar_world_tracking_provider_query_device_anchor_at_timestamp <0x302b9c0a0>: The device_anchor can only be queried when the world tracking provider is running.
This all seemed to break with the auto-update to 2.0.1. The Simulator runs the code fine. I seem to see an infinite stall here:
    frameLayer.endUpdate()
    // Pace frames by waiting for the optimal prediction time.
    try await LayerRenderer.Clock().sleep(until: timing.optimalInputTime, tolerance: nil)
    // Start submitting the updated frame.
    frameLayer.startSubmission() <-
0
0
504
Oct ’24
Strange Metal related shader issue
Hi everyone, I encountered a very strange shader bug that seems related to Metal only (not OpenGL). You can find the full description of the issue on the Babylon.js forums here: https://forum.babylonjs.com/t/strange-shader-related-issue-on-macos-with-safari-and-chrome-not-firefox/54289 (sorry, I couldn't post a clickable link here as this seems to be blocked here). I have a workaround to fix the issue (as described in the link above), but this really looks like an issue in Metal itself. Let me know if you need more details or explanations.
0
0
397
Oct ’24
Metal rendering shows incorrect color, gputrace shows fragment function returning correct color
I'm experiencing a strange issue where I'm seeing black in a Metal drawable where it should be a different color. When I capture the frame and inspect the value returned from the fragment function, it's correct, but the drawable isn't. This screenshot hopefully illustrates the issue. I've not found any references to similar issues. I saw something about out-of-bounds or NaN values being dropped to 0 (which would be black), but the debugger doesn't indicate this is happening.
1
0
474
Oct ’24
Metal Inline Functions
Hi! How do I define and call an inline function in Metal, or a simple function that returns a value? Case:
    inline uint index4D(constant _4D& shape, constant uint& n, constant uint& c, constant uint& h, constant uint& w) {
        return n * shape.C * shape.H * shape.W + c * shape.H * shape.W + h * shape.W + w;
    }
When I call it in my kernel function I get a "No matching function for call" error. Thx in advance.
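One likely cause, assuming the kernel passes thread-local values (loop counters, computed indices) as arguments: in Metal Shading Language a `constant uint&` parameter can only bind to data in the constant address space, so a call with ordinary local variables has no matching overload. Taking the indices by value avoids the mismatch. The sketch below is plain C++ so it can be compiled and checked on the host; the struct name `Shape4D` and its field types are hypothetical stand-ins for the poster's `_4D`, and in a Metal shader the shape parameter would keep its `constant _4D&` qualifier while the indices become plain `uint`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the poster's `_4D` struct: extents of an NCHW tensor.
struct Shape4D {
    uint32_t N, C, H, W;
};

// Row-major NCHW flattening. The four indices are passed by value, so the
// caller may supply locals, literals, or expressions without any address
// space (or reference-binding) restrictions.
inline uint32_t index4D(const Shape4D& shape,
                        uint32_t n, uint32_t c, uint32_t h, uint32_t w) {
    return n * shape.C * shape.H * shape.W
         + c * shape.H * shape.W
         + h * shape.W
         + w;
}
```

For a shape of 2 x 3 x 4 x 5, the last element (1, 2, 3, 4) maps to the last flat index, which is a quick sanity check for the formula.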
2
0
714
Nov ’24
Metal Compute Overhead
Hello, We are experimenting with Metal to accelerate some peculiar numerical computation. Our workloads are relatively small, so the ability to avoid moving data to and from the GPU's memory is very appealing. However, we are observing higher overhead compared to CUDA, which negates the benefits of avoiding data transfer. In our tests using an empty kernel, CUDA completes in 0.001 ms (Intel i7 10700K, RTX 3080), while Metal's waitUntilCompleted takes 0.12 ms (M2 Max). As we do not have prior experience with Metal, we are wondering whether we are using the APIs correctly and this timing is expected, or whether there is a way to reduce it. Thank you in advance for any comment! test-metal.cpp
0
0
629
Nov ’24
Cannot use Metal graphics overview HUD with multiple CAMetalLayers
I have multiple CAMetalLayers that I render content to and noticed that the graphics overview HUD does not function properly when I have more than one CAMetalLayer. The values reported will be very strange. For example, FPS may report 999 or some large negative value. Is the HUD simply not designed to work with multiple CAMetalLayers or MTKViews? When I disable all but one of my CAMetalLayers, the HUD works as expected.
1
0
688
Nov ’24
What does CAMetalLayerWantsCompositingDependencies in Info.plist do?
I've noticed a major third-party app has the following flag set to 1/true in its Info.plist: CAMetalLayerWantsCompositingDependencies Does anyone know if it’s recognized by Core Animation / Metal, and what it’s supposed to do? It might obviously have zero relationship to the OS, defined by that app and for that app... but since it looks very much like an unofficial/undocumented environment setting, it might be great to know what problem it solves. I happen to have issues related to compositing other CALayers over a CAMetalLayer in my app... so this definitely stood out as interesting. Thank you!
0
0
425
Nov ’24
Rendering YCbCr input using Metal
I would like to take YCbCr CVPixelBuffers from AVCaptureVideoDataOutput, apply some processing in RGB space, render to an MTKView, and pass to AVAssetWriter for recording. Right now, I'm doing this all manually – deswing the incoming data if necessary, choose the right matrix to convert to RGB, apply processing, etc. I also have to convert back to YCbCr before feeding the frames to AVAssetWriter because encoding performs much better if I do. Is there any efficient, built-in way to achieve the same? I can't use AVCaptureVideoPreviewLayer, since I need to do some further processing before display. I can't use AVCaptureVideoDataOutput's videoSettings to get automatic BGRA conversion because that would lose bit depth for 10 bit video formats (and isn't available on all formats anyway). I see these Accelerate functions, but they seemingly don't use the GPU, nor do they support all the formats and bit depths I'd need. I found reference to some undocumented MTLPixelFormats that seem to do exactly what I want, but I don't want to rely on something like this unless it's explicitly endorsed. This would also incur an RGB/YCbCr conversion on every texture read and write, right? Is there anything I'm missing here?
0
0
542
Nov ’24
D3DMetal unsupported CheckFeatureSupport query 53 while running simple vulkaninfo using Mesa 24.3 Dozen (Vulkan on D3D12) driver
Hi, I wanted to test whether it is possible to use the Mesa3D Dozen driver (Vulkan on D3D12) together with D3DMetal 2b3 to get a possibly better Vulkan driver on Wine than the default MoltenVK. This would support Vulkan Windows apps via D3D12Metal. I am using vulkan_dzn.dll, dzn_icd.x86_64.json, and dxil.dll from the x64 folder of:
https://github.com/pal1000/mesa-dist-win/releases/download/24.3.0-rc1/mesa3d-24.3.0-rc1-release-msvc.7z
Running the simple vulkaninfo app like this:
    wine64 vulkaninfo
I get the error:
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
It also seems that the D3DMetal Wine integration in Whisky doesn't expose d3d12core.dll and d3d12.dll the way the new Agility D3D12 DLLs or VKD3D do, so I get:
    MESA: error: Failed to retrieve D3D12GetInterface
    MESA: error: Failed to load DXCore
It still seems to try to load the driver, though:
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
Full log:
    MESA: error: Failed to retrieve D3D12GetInterface
    MESA: error: Failed to load DXCore
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
    00bc:fixme:dcomp:DCompositionCreateDevice 0000000000000000, {c37ea93a-e7aa-450d-b16f-9746cb0407f3}, 000000000011E328.
    MESA: error: Failed to load DXCore
    WARNING: dzn is not a conformant Vulkan implementation, testing use only.
    [D3DMetal:LOG:2A825] Unsupported API: CheckFeatureSupport, unhandled support query 53
    00bc:fixme:dcomp:DCompositionCreateDevice 0000000000000000, {c37ea93a-e7aa-450d-b16f-9746cb0407f3}, 000000000011E578.
    ERROR: [Loader Message] Code 0 : setup_loader_term_phys_devs: Call to 'vkEnumeratePhysicalDevices' in ICD c:\windows\system32\.\vulkan_dzn.dll failed with error code -3
    ERROR: [Loader Message] Code 0 : setup_loader_term_phys_devs: Failed to detect any valid GPUs in the current config
    ERROR at C:\j\msdk0\build\Khronos-Tools\repo\vulkaninfo\vulkaninfo.h:241:vkEnumeratePhysicalDevices failed with ERROR_INITIALIZATION_FAILED
0
0
776
Nov ’24
MTKView draw method causes EXC_BAD_ACCESS crash
Hello, I am using MTKView to display camera preview and video playback. I am testing on iPhone 16. The app crashes at a random moment whenever MTKView is rendering a CIImage. MetalView:

    public enum MetalActionType {
        case image(CIImage)
        case buffer(CVPixelBuffer)
    }

    public struct MetalView: UIViewRepresentable {
        let mtkView = MTKView()
        public let actionPublisher: any Publisher<MetalActionType, Never>

        public func makeCoordinator() -> Coordinator {
            Coordinator(self)
        }

        public func makeUIView(context: UIViewRepresentableContext<MetalView>) -> MTKView {
            guard let metalDevice = MTLCreateSystemDefaultDevice() else {
                return mtkView
            }
            mtkView.device = metalDevice
            mtkView.framebufferOnly = false
            mtkView.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 0)
            mtkView.drawableSize = mtkView.frame.size
            mtkView.delegate = context.coordinator
            mtkView.isPaused = true
            mtkView.enableSetNeedsDisplay = true
            mtkView.preferredFramesPerSecond = 60
            context.coordinator.ciContext = CIContext(
                mtlDevice: metalDevice,
                options: [.priorityRequestLow: true, .highQualityDownsample: false])
            context.coordinator.metalCommandQueue = metalDevice.makeCommandQueue()
            context.coordinator.actionSubscriber = actionPublisher.sink { type in
                switch type {
                case .buffer(let pixelBuffer):
                    context.coordinator.updateCIImage(pixelBuffer)
                    break
                case .image(let image):
                    context.coordinator.updateCIImage(image)
                    break
                }
            }
            return mtkView
        }

        public func updateUIView(_ nsView: MTKView, context: UIViewRepresentableContext<MetalView>) {
        }

        public class Coordinator: NSObject, MTKViewDelegate {
            var parent: MetalView
            var metalCommandQueue: MTLCommandQueue!
            var ciContext: CIContext!
            private var image: CIImage? {
                didSet {
                    Task { @MainActor in
                        self.parent.mtkView.setNeedsDisplay() //<--- call Draw method
                    }
                }
            }
            var actionSubscriber: (any Combine.Cancellable)?
            private let operationQueue = OperationQueue()

            init(_ parent: MetalView) {
                self.parent = parent
                operationQueue.qualityOfService = .background
                super.init()
            }

            public func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
            }

            public func draw(in view: MTKView) {
                guard let drawable = view.currentDrawable,
                      let ciImage = image,
                      let commandBuffer = metalCommandQueue.makeCommandBuffer(),
                      let ci = ciContext
                else {
                    return
                }
                //making sure nothing is nil, now we can add the current frame to the operationQueue for processing
                operationQueue.addOperation(
                    MetalOperation(
                        drawable: drawable,
                        drawableSize: view.drawableSize,
                        ciImage: ciImage,
                        commandBuffer: commandBuffer,
                        pixelFormat: view.colorPixelFormat,
                        ciContext: ci))
            }

            //consumed by Subscriber
            func updateCIImage(_ img: CIImage) {
                image = img
            }

            //consumed by Subscriber
            func updateCIImage(_ buffer: CVPixelBuffer) {
                image = CIImage(cvPixelBuffer: buffer)
            }
        }
    }

Now the MetalOperation class:

    private class MetalOperation: Operation, @unchecked Sendable {
        let drawable: CAMetalDrawable
        let drawableSize: CGSize
        let ciImage: CIImage
        let commandBuffer: MTLCommandBuffer
        let pixelFormat: MTLPixelFormat
        let ciContext: CIContext

        init(
            drawable: CAMetalDrawable,
            drawableSize: CGSize,
            ciImage: CIImage,
            commandBuffer: MTLCommandBuffer,
            pixelFormat: MTLPixelFormat,
            ciContext: CIContext
        ) {
            self.drawable = drawable
            self.drawableSize = drawableSize
            self.ciImage = ciImage
            self.commandBuffer = commandBuffer
            self.pixelFormat = pixelFormat
            self.ciContext = ciContext
        }

        override func main() {
            let width = Int(drawableSize.width)
            let height = Int(drawableSize.height)
            let ciWidth = Int(ciImage.extent.width) //<-- Thread 22: EXC_BAD_ACCESS (code=1, address=0x5e71f5490) A bad access to memory terminated the process.
            let ciHeight = Int(ciImage.extent.height)
            let destination = CIRenderDestination(
                width: width,
                height: height,
                pixelFormat: pixelFormat,
                commandBuffer: commandBuffer,
                mtlTextureProvider: { [self] () -> MTLTexture in
                    return drawable.texture
                })
            let transform = CGAffineTransform(
                scaleX: CGFloat(width) / CGFloat(ciWidth),
                y: CGFloat(height) / CGFloat(ciHeight))
            do {
                try ciContext.startTask(toClear: destination)
                try ciContext.startTask(toRender: ciImage.transformed(by: transform), to: destination)
            } catch {
            }
            commandBuffer.present(drawable)
            commandBuffer.commit()
            commandBuffer.waitUntilCompleted()
        }
    }

Now I am no Metal expert, but I believe it's a very simple execution that shouldn't cause a memory leak, especially since we have already checked whether the CIImage is nil. I have also tried running this code without OperationQueue and with @autoreleasepool, but neither has solved the problem. Am I missing something?
1
0
737
Dec ’24
Jurassic World Evolution 2 Likely Fails Due to Missing Tiled Resources Support
I’ve been trying to run Jurassic World Evolution 2 using the Game Porting Toolkit on macOS, but the game doesn’t launch and crashes immediately. Based on the error and research, it seems the issue is related to missing support for D3D12_TILED_RESOURCES_TIER_2 in the Metal API. If this is the case, does anyone know if support for tiled resources is planned for future updates of the toolkit? Or are there any potential workarounds for bypassing this limitation?
1
0
695
Dec ’24
How to use MTKTextureLoader to load png data
I am trying to load some PNG data with MTKTextureLoader newTextureWithData, but the result is wrong in the alpha area. Here is the code. I have an image URL; after it downloads successfully, I try to use the data or UIImagePNGRepresentation(image), and all of them come out wrong.
    UIImage *tempImg = [UIImage imageWithData:data];
    CGImageRef cgRef = tempImg.CGImage;
    MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:device];
    id<MTLTexture> temp1 = [loader newTextureWithData:data
        options:@{MTKTextureLoaderOptionSRGB: @(NO),
                  MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                  MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
        error:nil];
    NSData *tempData = UIImagePNGRepresentation(tempImg);
    id<MTLTexture> temp2 = [loader newTextureWithData:tempData
        options:@{MTKTextureLoaderOptionSRGB: @(NO),
                  MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                  MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
        error:nil];
    id<MTLTexture> temp3 = [loader newTextureWithCGImage:cgRef
        options:@{MTKTextureLoaderOptionSRGB: @(NO),
                  MTKTextureLoaderOptionTextureUsage: @(MTLTextureUsageShaderRead),
                  MTKTextureLoaderOptionTextureCPUCacheMode: @(MTLCPUCacheModeWriteCombined)}
        error:nil];
    }] resume];
5
0
591
May ’25
What are the CAMetalLayer.nextDrawable threading rules?
What evidence exists that it's safe to call nextDrawable() on CAMetalLayer off the main thread? I have seen developers claiming that it's OK, but the official docs are silent on the topic. Attempting to do so with Strict Concurrency Checking set to Complete complains that CAMetalLayer is not @Sendable. I want to call it off the main thread since there doesn't seem to be any way to prevent it from blocking the UI for up to a second. I have read hints and allegations that this won't happen if you avoid asking for too many drawables, but that doesn't seem to be true 100% of the time in my experience. Supposing it is allowed, I wonder how races are handled such as when the layer's size is changed on the main thread, or if the layer is removed from the layer hierarchy.
0
0
498
Dec ’24
Texture Definitions for MPSSVGF Denoise
I am trying to use the SVGF denoiser to denoise my ray traced shadows (and also other textures later). I do get a smoothed image, but with wonky denoising. I need the depth-normal textures and motion textures for the SVGF and assume that these are badly filled in my case. However, neither in the above linked documentation nor in the WWDC19 video do I find how they should be defined. I am looking for answers to:
Is depth in the red or alpha channel of the depth-normal texture?
Are the normals in screen space?
Is depth linear? Is it distance or z coordinate in view space? Or even logarithmically scaled or something else?
Are the motion vectors supposed to be in pixels per frame?
What is the orientation of the axes? Is y up or down?
Are there other restrictions on the formats?
Also, the linked code did not help me (I have not found any SVGF so far; also, all the code is in Objective-C++, not Swift, but that's a different topic). So how should I fill these textures? Can someone point me to the documentation where these kinds of questions are answered?
0
0
528
Dec ’24
Concurrent conflicting texture writes
Hello! I need to "draw" a set of particles into a texture. It would be trivial in a render encoder, of course. However, I would like to implement the task in a compute kernel. Every particle draw operation is expected to set 5 texels: the "center" one and its left/right/upper/lower neighbors. Particles can and will overlap, so concurrent draws are to be expected. I tried using texture atomics, atomic_store() to be more precise. This worked, albeit pretty slowly, too slow for my purpose. Just to test what would happen, I tried using a normal texture write(). I was expecting to see some kind of visual artefacts, but to my surprise, it worked very well (and much faster). My question: is it safe? I understand that calling write() doesn't guarantee any ordering of the operations, so if multiple threads write to the same texel, the final value may come from any of those threads. But suppose all the threads were to write the very same color? Can I assume that the texel in question will have said color after the compute kernel finishes? I am using an M2 Pro MacBook, but ideally I would love to get the answer for all Apple Silicon devices. My texture format is R32Int (so as to be able to use atomics), but I could do with any single-channel format; the purpose of the texture is to be a binary mask of sorts. Thanks!
0
0
387
Feb ’25