Render advanced 3D graphics and perform data-parallel computations using graphics processors using Metal.

Metal Documentation

Posts under Metal subtopic

Post

Replies

Boosts

Views

Activity

MTLCaptureManager.sharedCaptureManager generates corrupted .gputrace files (0KB, invalid internal structure)
Hello, I am experiencing an issue with programmatically capturing a GPU trace using MTLCaptureManager. The .gputrace file that is generated appears to be corrupted, and I'm looking for guidance or a solution. Description of the Problem: I am using MTLCaptureManager.sharedCaptureManager to capture a Metal frame and save it to disk. The generated .gputrace file is consistently reported as 0 bytes in size by the file system. Crucially, when I compress this 0-byte .gputrace file into a .zip archive, the resulting archive contains the full, expected data. After unzipping, the file can be opened and viewed correctly in Xcode. However,When inspecting the file's contents using NSFileManager in Objective-C (treating it as a directory), the internal structure is different from a .gputrace file captured directly from Xcode's Metal Debugger. capture in xcode capture in file Finally,When capturing multiple frames programmatically, the first captured frame contains valid buffer data. However, for subsequent frames (starting from the second frame), the corresponding buffer contents are all zero-filled. Frame 1: All MTLBuffer data is correctly captured and populated. Frame 2 and onward: The same MTLBuffer objects are present in the trace, but their contents are entirely 0 (i.e., the data is not captured or is corrupted). In this case, the on-screen display is normal, but the captured frame is incorrect. The frame captured directly in Xcode is also correct. Only the frame captured to a file is abnormal.
1
0
457
Aug ’25
Float64 (Double Precision) Support on MPS with PyTorch on Apple Silicon?
Hi everyone, This project uses PyTorch on an Apple Silicon Mac (M1/M2/etc.), and the goal is to use the MPS backend for GPU acceleration, notes Apple Developer. However, the workflow depends on Float64 (double-precision) floating-point numbers for certain computations, notes PyTorch Forums. The error "Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead" has been encountered, notes GitHub. It seems that the MPS backend doesn't currently support Float64 for direct GPU computation. Questions for the community: Are there any known workarounds or best practices for handling Float64-dependent operations when using the MPS backend with PyTorch? For those working with high-precision tasks on Apple Silicon, what strategies are being used to balance performance with the need for Float64? Offloading to the CPU is an option, and it's of interest to know if there are any specific techniques or libraries within the Apple ecosystem that could streamline this process while aiming for optimal performance. Any insights, tips, or experiences would be appreciated. Thanks in advance, Jonaid MacBook Pro M3 Max
1
0
131
Sep ’25
How to use MetalPeformancePrimitives
I am trying to learn the new Metal Peformance Primitives APIs. I have added the MetalPeformancePrimitives framework and included the header in my shader code as per documentation #include <MetalPeformancePrimitives/MetalPeformancePrimitives.h> Unfortunately, Xcode complains that the header cannot be found. How do I include it properly? I am using Xcode 26 on Tahoe. The MetalPeformancePrimitives framework is present on my machine and I can inspect the headers in the filesystem.
1
1
325
22h
Metal fails to create PSO on AMD based GPUs
Hello, Shaders in our application is written using HLSL and we rely on Metal Shader Converter to convert DXIL to Metal IR. We ran into an issue that causes metal pipeline state creation to fail when vertex stage-in function is used on AMD GPUs. Here's the error reported by Metal in Xcode output: Compiler failed with XPC_ERROR_CONNECTION_INTERRUPTED XPC_ERROR_CONNECTION_INTERRUPTED MTLCompiler: Compilation failed with XPC_ERROR_CONNECTION_INTERRUPTED on 4 try. This error suggests an unexpected interruption in the connection. Possible reasons: a crash in the compiler service, termination by the OS due to resource constraints (e.g., jetsam), a timeout in the service, or an issue with IPC. Verify system stability and check the logs for more details. Compiler failed with XPC_ERROR_CONNECTION_INVALID XPC_ERROR_CONNECTION_INVALID MTLCompiler: Compiler encountered XPC_ERROR_CONNECTION_INVALID: failed to check-in, peer may have been unloaded: mach_error=10000003 (is the OS shutting down or process jetsammed?) Compilation failed due to an interrupted connection: XPC_ERROR_CONNECTION_INTERRUPTED. This error occurred after multiple retries. which seems to indicate a internal compiler error. I have a minimal repro here: https://github.com/kcloudy0717/metal_pso_fail/tree/main, simply follow the instructions in README.
1
0
66
3d
10-bit support in iPad Pro
Hi, I’m using the latest iPad Pro (13-inch) and I can see that Metal offers an rgb10a2unorm texture for rendering, but when I render a grey ramp and measure the actual luminance, I get a pattern that I would expect from an 8-bit texture (see below). Before I start ripping apart all my code, is there anything else I need to do to convince iOS to render my texture in 10-bit? I already tried setting the PixelFormat in my CMetalLayer to rgb10a2unorm, but that didn’t change anything.
1
0
326
3d
Metal UIView to transform what's behind it
I'm trying to create a custom Metal-based visual effect as a UIView to be used inside an existing UIKit-based interface. (An example might be a view that applies a blur effect to what's behind it.) I need to capture the MTLTexture of what's behind the view so that I can feed it to MTLRenderCommandEncoder.setFragmentTexture(_:index:). Can someone show me how or point me to an example? Thanks!
2
0
819
Oct ’24
Cannot Display MTKView on a sheeted view on macOS15
I use xcode16 and swiftUI for programming on a macos15 system. There is a problem. When I render a picture through mtkview, it is normal when displayed on a regular view. However, when the view is displayed through the .sheet method, the image cannot be displayed. There is no error message from xcode. import Foundation import MetalKit import SwiftUI struct CIImageDisplayView: NSViewRepresentable { typealias NSViewType = MTKView var ciImage: CIImage init(ciImage: CIImage) { self.ciImage = ciImage } func makeNSView(context: Context) -&gt; MTKView { let view = MTKView() view.delegate = context.coordinator view.preferredFramesPerSecond = 60 view.enableSetNeedsDisplay = true view.isPaused = true view.framebufferOnly = false if let defaultDevice = MTLCreateSystemDefaultDevice() { view.device = defaultDevice } view.delegate = context.coordinator return view } func updateNSView(_ nsView: MTKView, context: Context) { } func makeCoordinator() -&gt; RawDisplayRender { RawDisplayRender(ciImage: self.ciImage) } class RawDisplayRender: NSObject, MTKViewDelegate { // MARK: Metal resources var device: MTLDevice! var commandQueue: MTLCommandQueue! // MARK: Core Image resources var context: CIContext! var ciImage: CIImage init(ciImage: CIImage) { self.ciImage = ciImage self.device = MTLCreateSystemDefaultDevice() self.commandQueue = self.device.makeCommandQueue() self.context = CIContext(mtlDevice: self.device) } func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {} func draw(in view: MTKView) { guard let currentDrawable = view.currentDrawable, let commandBuffer = commandQueue.makeCommandBuffer() else { return } let dSize = view.drawableSize let drawImage = self.ciImage let destination = CIRenderDestination(width: Int(dSize.width), height: Int(dSize.height), pixelFormat: view.colorPixelFormat, commandBuffer: commandBuffer, mtlTextureProvider: { () -&gt; MTLTexture in return currentDrawable.texture }) _ = try? self.context.startTask(toClear: destination) _ = try? self.context.startTask(toRender: drawImage, from: drawImage.extent, to: destination, at: CGPoint(x: (dSize.width - drawImage.extent.width) / 2, y: 0)) commandBuffer.present(currentDrawable) commandBuffer.commit() } } } struct ShowCIImageView: View { let cii = CIImage.init(contentsOf: Bundle.main.url(forResource: "9-10", withExtension: "jpg")!)! var body: some View { CIImageDisplayView.init(ciImage: cii).frame(width: 500, height: 500).background(.red) } } struct ContentView: View { @State var showImage = false var body: some View { VStack { Image(systemName: "globe") .imageScale(.large) .foregroundStyle(.tint) Text("Hello, world!") ShowCIImageView() Button { showImage = true } label: { Text("showImage") } } .frame(width: 800, height: 800) .padding() .sheet(isPresented: $showImage) { ShowCIImageView() } } }
2
1
637
Oct ’24
Can't link metal-cpp to Modern Rendering With Metal sample
There is a sample project from Apple here. It has a scene of a city at night and you can move in it. It basically has 2 parts: application code written in what looks like Objective-C (I am more familiar with C++), which inherits from things like NSObject, MTKView, NSViewController and so on - it processes input and all app-related and window-related stuff. rendering code that also looks like Objective-C. Btw both parts are mostly in .mm files (Obj-C++ AFAIK). The application part directly uses only one class from the rendering part - AAPLRenderer. I want to move the rendering part to C++ using metal-cpp. For that I need to link metal-cpp to the project. I did it successfully with blank projects several times before using this tutorial. But with this sample project Xcode can't find Foundation/Foundation.hpp (and other metal-cpp headers). The error says this: Did not find header 'Foundation.hpp' in framework 'Foundation' (loaded from '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.0.sdk/System/Library/Frameworks') Pls help
2
0
813
Feb ’25
Metal Inline Functions
Hi! How to define and call an inline function in Metal? Or simple function that will return some value. Case: inline uint index4D(constant _4D& shape, constant uint& n, constant uint& c, constant uint& h, constant uint& w) { return n * shape.C * shape.H * shape.W + c * shape.H * shape.W + h * shape.W + w; } When I call it in my kernel function I get No matching function for call error. Thx in advance.
2
0
714
Nov ’24
Dismissing a Window that contains MTKView no longer updates
I'm writing a swift app that uses metal to render textures to the main view. I currently use a NSViewRepresentable to place a MTKView into the window and a MTKViewDelegate to perform the metal operations. It's running well and I see my metal view being updated. However, when I close the window (either through the user clicking the close button or by programatically using the appropriate @Environment(\.dismissWindow) private var dismissWindow) and then reopen the window, I no longer receive calls to MTKViewDelegate draw(in mtkView: MTView). If I manually call the MTKView::draw() function my view updates it's content as expected, so it seems to be still be correctly setup / alive. As best as I can tell the CVDisplayLink created by MTKView is no longer active (or at least that's my understanding of how the MTKView::draw() function is called). I've setup the MTKView like this let mtkView = MTKView() mtkView.delegate = context.coordinator // My custom delegate mtkView.device = context.coordinator.device // The default metal device mtkView.preferredFramesPerSecond = 60 mtkView.enableSetNeedsDisplay = false mtkView.isPaused = false which I was hoping would call the draw function at 60fps while the view is visible. I've also verified the values don't change while running. Does anyone have any ideas on how I could restart the CVDisplayLink or anyother methods to avoid this problem?? Cheers Jack
2
0
590
Nov ’24
Metal-cpp-extensions isn't working inside frameworks
I am making a framework in C++ using metal-cpp, basically a small game engine. I am also consequently using metal-cpp-extensions provided in LearnMetalCPP to make applications work. For one of my classes, I needed to add AppKit.hpp inside a public header file, so I moved it and its associate headers(NSApplication.hpp, NSMenu.hpp, etc.) from Project headers to Public in Build Phases' Headers, however, it started giving me the error "cast of C pointer type 'void *' to Objective-C pointer type 'Class' requires a bridged cast" at several points in the AppKit headers. They don't appear when AppKit and its associates are in the Project headers, or when they are in the Private headers and no headers import it. I imagined that disabling Objective-C ARC and Using __bridge casts outside of ARC in Build Settings would solve it, but it didn't budge. I imagined it wouldn't involve actively changing the headers would be the answer, but even if I try to put __bridge before the problematic casts, it didn't recognize __bridge. How do I solve this? And why is it only happening in Public and not Project headers?
2
0
763
Jan ’25
M1 GPU violates atomic_thread_fence across threadgroups
I have an M1 Pro with a 16-core GPU. When I run a shader with 8193 threads, atomic_thread_fence is violated across the boundary between thread 8191 (the last thread in the 7th threadgroup) and 8192 (the first thread in the 9th threadgroup). I've attached the Metal and Swift files, but I'll repost the relevant kernel here. It's a function that launches N threads to iterate through a binary tree from the leaves, where the first thread to reach the parent terminates and the second one populates it with the sum of the nodes two children. // clang-format off void sum(device const int& size, device const int* __restrict__ in, device int* __restrict__ out, device atomic_int* visited, uint i [[thread_position_in_grid]]) { // clang-format on int val = in[i]; uint cur = (size + i - 1); out[cur] = val; atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst); cur = (cur - 1) / 2; int proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed); while (proceed == 1) { uint left = 2 * cur + 1; uint right = 2 * cur + 2; uint val_left = out[left]; uint val_right = out[right]; uint val_cur = val_left + val_right; out[cur] = val_cur; if (cur == 0) { break; } cur = (cur - 1) / 2; atomic_thread_fence(mem_flags::mem_device, memory_order_seq_cst); proceed = atomic_fetch_add_explicit(&visited[cur], 1, memory_order_relaxed); } } What I'm observing is that thread 8192 hits the atomic_fetch_add first and terminates, while thread 8191 hits it second (observes that thread 8192 had incremented it by 1) and proceeds into the loop. Thread 8191 reads out[16383] (which it populated with 8191) and out[16384] (which thread 8192 populated with 8192 prior to the atomic_thread_fence). Instead of reading 8192 from out[16384] though, it reads 0. Maybe I'm missing something but this seems like a pretty clear violation of the atomic_thread_fence which (I thought) was supposed to guarantee that the write from thread 8192 to out[16384] would be visible to any thread observing the effects of the following atomic_fetch_add. Is atomic_fetch_add not a store operation? Modifying it to an atomic_store or atomic_exchange still results in the bug. Adding another atomic_thread_fence between the atomic_fetch_add and reading of out also doesn't change anything. I only begin to observe this on grid sizes of 8193 and upwards. That's 9 threadgroups per grid, which I assume could be related to my M1 Pro GPU having 16 cores. Running the same example on an A17 Pro GPU doesn't show any of this behavior up through a tested grid size of 4194303 (2^22-1), at which point testing larger grid sizes starts to run into other issues so I can't test anything larger. Removing the atomic_thread_fences on both the M1 and A17 cause the test to fail at much smaller grid sizes, as expected. sum.metal main.swift
2
0
484
Dec ’24
metal-cpp syntax for MTL::Buffer float2 parameter
I'm trying to pass a buffer of float2 items from CPU to GPU. In the kernel, I can provide a parameter for the buffer: device const float2* values for example. How do I specify float2 as the type for the MTL::Buffer? I managed to get the code to work by "cheating" by defining a simple class that has the same data members as a float2, but there is probably a better way. class Coord_f { public: float x{0.0f}; float y{0.0f}; }; then using code to allocate like this: NS::TransferPtr(device->newBuffer(n_elements * sizeof(Coord_f), MTL::ResourceStorageModeManaged)) The headers for metal-cpp do not appear to define vector objects like float2, but I'm doubtless missing something. Thanks.
2
0
617
Jan ’25
Shader compiler crash on MacOS Sequoia+Radeon
Hello, Apple! This post is a bug report for Metal driver in MacOS Sequoia. I'm working on opensource game engine and one of my users reported a bug, on "MacOS Sequoia" + "AMD Radeon RX 6900 XT".  Engine crashes, when liking a compute pipeline, with following NSError: "Compiler encountered an internal error: I" Offended shader (depth aware blur): https://shader-playground.timjones.io/27565de40391f62f078c891077ba758c On my end, compiling same shader on M1 (Sonoma) or with offline compiler doesn't reproduce the issue. Post with bug-report on github: https://github.com/Try/OpenGothic/issues/712 Looking forward for your help and driver fix ;)
2
0
657
Jan ’25
Learn Metal
I am interested in learning the Metal framework for rendering development. However, most of Apple’s official documentation uses Objective-C code. Therefore, I am seeking guidance on whether it is more advantageous for me to focus solely on learning Swift to gain proficiency in Metal.
2
0
787
Jan ’25
Black Screen in GPTK – DX 12.1 / Shader Model 6.5 Issue?
Hey everyone, I’m trying to run Kingdom Come: Deliverance 2 using the Game Porting Toolkit, but I’m encountering a black screen when launching the game. From what I know about the game’s requirements, it might be using Shader Model 6.5, which supports advanced features like DirectX Raytracing (DXR) Tier 1.1. This leads me to suspect that the issue could be related to missing support for DirectX 12.1 Features or Shader Model 6.5 in GPTK. Does anyone know if these features are currently supported by GPTK? If not, are there any plans to implement them in future updates? Alternatively, is there any workaround for games that rely on Shader Model 6.5 and ray tracing? Thanks a lot for your help!
2
0
617
Feb ’25
Metal calls hanging/stuck if app is started quickly after login
Our app uses Metal for image processing. We have found that if our app (and its possible intensive image processing) is started quickly after user is logged in, then calls to Metal may be hanging/stuck for a good while. Example: it can take 1-2 minutes for something that usually takes 3-5 seconds! Metal threads are just hanging in a memmove... In Activity Monitor we see a lot of things are happening right after log-in. But why Metal calls are blocking for so long is unknown to us... The workaround is to wait a minute before we start our app and start intensive image processing using Metal. But hard to explain this workaround to end-users... It doesn't happen on all computers but fairly easy to reproduce on some computers. We are using macOS 15.3.1. M1/M3 Max. Any good ideas for how to proceed with this problem and possible reach out to Apple engineers? Thanks! :)
2
0
411
Feb ’25
Metal: Non-uniform thread groups unsupported in Simulator? Is it?
My app is running Compute Shaders that use non-uniform thread groups. When I run the app in the debugger with a simulator target the app crashes on encoder.dispatchThreads and the error message is: Dispatch Threads with Non-Uniform Threadgroup Size is not supported on this device. Previously the log output states that: Metal Shader Validation is unsupported for Simulator. However: When I stop the debugger and just run the app in the simulator without the debugger attached, the app just runs fine and does not crash. The SwiftUI Preview that also triggers the Compute Shader when preparing data also just runs fine without a crash. I can run and debug on a real device no problem - I just don't have all sizes available. Is there anything I need to check in my lldb/simulator configuration? It obviously does work, just the debugger cannot really deal with it? Any input would be nice as this really slows my down as I have to be extremely careful when debugging on the simulator.
2
0
544
Mar ’25