Render advanced 3D graphics and perform data-parallel computations on graphics processors using Metal.

Metal Documentation

Posts under Metal subtopic

Post

Replies

Boosts

Views

Activity

MTLBinaryArchive Size
I'm trying to use MTLBinaryArchive. I collected a binary archive on one device and used metal-tt to translate it for all supported iPhone devices, from iPhone 7 Plus to iPhone 16. However, the resulting archive is quite large: around 1.5GB uncompressed and about 500MB compressed in the IPA. I'm wondering how to address the size issue. I watched the WWDC 2022 video, which mentioned that the operating system or app installation process would handle compatibility. Does this compatibility extend across different GPU chips? I tried installing an IPA with a binary archive collected only from an iPhone 12 on an iPhone 13, but the archive didn't take effect. I also saw that Apple supports App Thinning. However, it seems that resources in the Asset Catalog cannot be accessed via URL, and creating an MTLBinaryArchive requires a URL. Is it possible to distribute an MTLBinaryArchive through App Thinning? The WWDC 2022 video also mentioned using the -Os optimization flag to reduce size. Is there an estimate of how much size reduction it achieves? Are there any methods to solve the binary archive size issue without impacting performance?
0
0
16
2d
How can I get pixel coordinates in the fragment tile function?
In this video, tile fragment shading is recommended for image processing. In this example, the unpack function takes two arguments, one of which is RasterizerData. As I understand it, this is the data passed to us from the previous stage (Vertex) of the graphics pipeline. However, the properties of MTLTileRenderPipelineDescriptor do not include an option for specifying a Vertex function. Therefore, in this render pass, a mix of commands is used: first, a draw command is executed to obtain UV coordinates, and then threads are dispatched. My question is: without using a draw command, only dispatch, how can I get pixel coordinates in the fragment tile function? For the kernel tile function, everything is clear.
    typedef struct {
        float4 OPTexture [[ color(0) ]];
        float4 IntermediateTex [[ color(1) ]];
    } FragmentIO;

    fragment FragmentIO Unpack(RasterizerData in [[ stage_in ]],
                               texture2d<float, access::sample> srcImageTexture [[texture(0)]])
    {
        FragmentIO out;
        //...
        // Run necessary per-pixel operations
        out.OPTexture = // assign computed value;
        out.IntermediateTex = // assign computed value;
        return out;
    }
1
0
67
4d
Why does GLDContextRec::flushContextInternal() lead to abort?
The flushContextInternal function in glr_sync.mm:262 called abort internally. What caused this? Was it due to high device temperature or some other reason?
    Date/Time: 2024-08-29 09:20:09.3102 +0800
    Launch Time: 2024-08-29 08:53:11.3878 +0800
    OS Version: iPhone OS 16.7.10 (20H350)
    Release Type: User
    Baseband Version: 8.50.04
    Report Version: 104
    Exception Type: EXC_CRASH (SIGABRT)
    Exception Codes: 0x0000000000000000, 0x0000000000000000
    Triggered by Thread: 0
    Thread 0 name:
    Thread 0 Crashed:
    0 libsystem_kernel.dylib 0x00000001ed053198 __pthread_kill + 8 (:-1)
    1 libsystem_pthread.dylib 0x00000001fc5e25f8 pthread_kill + 208 (pthread.c:1670)
    2 libsystem_c.dylib 0x00000001b869c4b8 abort + 124 (abort.c:118)
    3 AppleMetalGLRenderer 0x00000002349f574c GLDContextRec::flushContextInternal() + 700 (glr_sync.mm:262)
    4 DiSpecialDriver 0x000000010824b07c Di::RHI::onRenderFrameEnd() + 184 (RHIDevice.cpp:118)
    5 DiSpecialDriver 0x00000001081b85f8 Di::Client::drawFrame() + 120 (Client.cpp:155)
2024-08-27_14-44-10.8104_+0800-07d9de9207ce4c73289507e608e5de4320d02ccf.crash
1
0
46
5d
MetalFX
I recently adopted MetalFX for an upscaling feature. However, builds for the iOS Simulator consistently fail with the error message 'MetalFX is not available when building for iOS Simulator.' To address this, I set MetalFX.framework to 'Optional' under Build Phases > Link Binary With Libraries, which adds the linker option -weak_framework. Despite this change, the build still fails. I also observed that Apple's MetalFX sample application at https://developer.apple.com/documentation/metalfx/applying-temporal-antialiasing-and-upscaling-using-metalfx fails to build for the iOS Simulator target. Has anyone encountered this issue?
3
0
462
2w
VRAM not freeing in Elite Dangerous
I've been trying out GPTK with the game Elite Dangerous Horizons. From what I can tell, VRAM usage keeps going up until it exceeds the limit, at which point the frame rate drops to 1-3 FPS and the game crashes. In the Performance HUD, VRAM usage under GPTK just keeps climbing and I never saw it drop. I did some limited testing, and I think it is probably not a VRAM leak but rather caching, because returning to an area I had already visited does not increase VRAM usage. So either something is wrong with how VRAM is freed, or GPTK might not be reporting the right amount of available VRAM, which would explain why the game keeps allocating until it runs out of memory and crashes. As a test, I ran the game with the DXVK+MoltenVK combination, and it works fine: VRAM is freed when it is no longer used. Is this a known issue in some games?
7
1
625
2w
How to properly pass a Metal layer from SwiftUI MTKView to C++ for use with metal-cpp?
Hello! I'm currently porting a videogame console emulator to iOS and I'm trying to make the renderer (tested on macOS) work on iOS as well. The emulator core is written in C++ and uses metal-cpp for rendering, whereas the iOS frontend is written in Swift with SwiftUI. I have an Objective-C++ bridging header for bridging the Swift and C++ sides. On the Swift side, I create an MTKView. Inside the MTKView delegate, I run the emulator for 1 video frame and pass it the view's backing layer for it to render the final output image with. The emulator runs and returns, but when it returns I get a crash in Swift land (callstack attached below), inside objc_release, which indicates I'm doing something wrong with memory management.
My bridging interface (ios_driver.h):
    #pragma once
    #include <Foundation/Foundation.h>
    #include <QuartzCore/QuartzCore.h>

    void iosCreateEmulator();
    void iosRunFrame(CAMetalLayer* layer);
Bridge implementation (ios_driver.mm):
    #import <Foundation/Foundation.h>

    extern "C" {
    #include "ios_driver.h"
    }

    <...>

    #define IOS_EXPORT extern "C" __attribute__((visibility("default")))

    std::unique_ptr<Emulator> emulator = nullptr;

    IOS_EXPORT void iosCreateEmulator() { ... }

    // Runs 1 video frame of the emulator and
    IOS_EXPORT void iosRunFrame(CAMetalLayer* layer) {
        void* layerBridged = (__bridge void*)layer;

        // Pass the CAMetalLayer to the emulator
        emulator->getRenderer()->setMTKLayer(layerBridged);
        // Runs the emulator for 1 frame and renders the output image using our layer
        emulator->runFrame();
    }
My MTKView delegate:
    class Renderer: NSObject, MTKViewDelegate {
        var parent: ContentView
        var device: MTLDevice!

        init(_ parent: ContentView) {
            self.parent = parent
            if let device = MTLCreateSystemDefaultDevice() {
                self.device = device
            }
            super.init()
        }

        func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

        func draw(in view: MTKView) {
            var metalLayer = view.layer as! CAMetalLayer
            // Run the emulator for 1 frame & display the output image
            iosRunFrame(metalLayer)
        }
    }
Finally, the emulator's render function that interacts with the layer:
    void RendererMTL::setMTKLayer(void* layer) {
        metalLayer = (CA::MetalLayer*)layer;
    }

    void RendererMTL::display() {
        CA::MetalDrawable* drawable = metalLayer->nextDrawable();
        if (!drawable) {
            return;
        }
        MTL::Texture* texture = drawable->texture();
        <rest of rendering follows here using the drawable & its texture>
    }
This is the Swift callstack at the time of the crash:
To my understanding, I shouldn't be violating ARC rules, as my bridging header uses CAMetalLayer* instead of void* and Swift will automatically account for ARC when passing CoreFoundation objects to Objective-C. However, I don't have any other idea as to what might be causing this. I've been trying to debug this code for a couple of days without much success. If you need more info, the emulator code is also on GitHub.
Metal renderer: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/core/renderer_mtl/renderer_mtl.cpp#L58-L68
Bridge implementation: https://github.com/wheremyfoodat/Panda3DS/blob/ios/src/ios_driver.mm
Bridging header: https://github.com/wheremyfoodat/Panda3DS/blob/ios/include/ios_driver.h
Any help is more than appreciated. Thank you for your time in advance.
0
0
331
2w
Question about metal-cpp resource allocation
I noticed that some metal-cpp classes have static functions such as
    static URL* fileURLWithPath(const class String* pPath);
    static class ComputePassDescriptor* computePassDescriptor();
    static class AccelerationStructurePassDescriptor* accelerationStructurePassDescriptor();
which return a new object. These classes also provide 'alloc' and 'init' functions for creating objects. For objects created with 'alloc' and 'init', I use something like NS::SharedPtr or call release directly to free the memory. Since 'alloc' and 'init' are not called explicitly with these static functions, how do I correctly free the objects they create? Are they managed by an autorelease pool?
2
0
371
3w
Metal-CPP Errors
After following the instructions here: https://developer.apple.com/metal/cpp/ I attempted building my project and Xcode presented several errors. In essence it's complaining about some redeclarations in the Metal-CPP headers. NSBundle.hpp and NSError.hpp are included in the metal-cpp/foundation directory from the metal-cpp download. Any help in getting these issues resolved is appreciated. Thanks!
2
0
426
3w
Xcode Playground - The LLDB RPC server has crashed.
I am trying to learn Metal development on my MacBook Pro M1 Pro (Sequoia 15.3.1) in an Xcode Playground, but when I write these two lines of code:
    import Metal
    let device = MTLCreateSystemDefaultDevice()!
I get the error "The LLDB RPC server has crashed." Any ideas as to what I can do to solve this? I have rebooted the machine and reinstalled Xcode...
2
0
294
3w
3D Skeletal animation in metal-cpp?
Hey all! I've got my hands on a refurbished Mac mini M1 and am already diving into Metal. At the moment, I'm studying graphics programming with OpenGL and have reached the point where I can almost create a 3D cube. However, I noticed there aren't many tutorials for metal-cpp, only demos. One thing I love about graphics programming is skinning/skeletal animation, but I can't find any sources or tutorials on how to load skeletal animations into metal-cpp. So, if I create my character in Blender with all of its animations exported to an .FBX or maybe .DAE file, how would I go about loading it and animating it in Metal with metal-cpp?
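The skinning math itself does not depend on the graphics API, so it may help to see it in plain C++ first. The following is a minimal sketch of linear blend skinning, assuming the joint hierarchy, inverse bind matrices, and per-vertex weights have already been read from the .FBX or .DAE file by some importer (file parsing is not shown). `Joint`, `Influence`, `skinningPalette`, and `skinVertex` are hypothetical names for illustration and are not part of metal-cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
using Mat4 = std::array<float, 16>;  // row-major 4x4 matrix

inline Mat4 identity() {
    return {1.0f, 0.0f, 0.0f, 0.0f,  0.0f, 1.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 1.0f, 0.0f,  0.0f, 0.0f, 0.0f, 1.0f};
}

inline Mat4 translation(float x, float y, float z) {
    return {1.0f, 0.0f, 0.0f, x,  0.0f, 1.0f, 0.0f, y,
            0.0f, 0.0f, 1.0f, z,  0.0f, 0.0f, 0.0f, 1.0f};
}

inline Mat4 mul(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

inline Vec3 transformPoint(const Mat4& m, Vec3 p) {
    return {m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
            m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
            m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11]};
}

// Hypothetical joint record: parent == -1 marks the root, and parents
// are assumed to be stored before their children.
struct Joint { int parent; Mat4 inverseBind; };

// Hypothetical per-vertex influence: which joint, and how strongly.
struct Influence { int joint; float weight; };

// Builds one skinning matrix per joint for the current animation pose:
// palette[i] = globalPose[i] * inverseBind[i].
inline std::vector<Mat4> skinningPalette(const std::vector<Joint>& joints,
                                         const std::vector<Mat4>& localPose) {
    assert(joints.size() == localPose.size());
    std::vector<Mat4> global(joints.size()), palette(joints.size());
    for (std::size_t i = 0; i < joints.size(); ++i) {
        const int parent = joints[i].parent;
        assert(parent < static_cast<int>(i));
        global[i] = parent < 0
            ? localPose[i]
            : mul(global[static_cast<std::size_t>(parent)], localPose[i]);
        palette[i] = mul(global[i], joints[i].inverseBind);
    }
    return palette;
}

// Blends the bind-pose position through each influencing joint's matrix.
inline Vec3 skinVertex(Vec3 bindPosition,
                       const std::vector<Influence>& influences,
                       const std::vector<Mat4>& palette) {
    Vec3 out{0.0f, 0.0f, 0.0f};
    for (const Influence& inf : influences) {
        const Vec3 p = transformPoint(
            palette[static_cast<std::size_t>(inf.joint)], bindPosition);
        out.x += inf.weight * p.x;
        out.y += inf.weight * p.y;
        out.z += inf.weight * p.z;
    }
    return out;
}
```

In a renderer, a common arrangement is to compute the palette on the CPU each frame, copy it into a buffer, and perform the weighted blend from `skinVertex` in the vertex shader instead; the CPU version above is only meant to make the data flow visible.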
1
0
301
3w
Metal: Are non-uniform threadgroups unsupported in Simulator?
My app runs compute shaders that use non-uniform threadgroups. When I run the app in the debugger with a simulator target, the app crashes on encoder.dispatchThreads with the error message: "Dispatch Threads with Non-Uniform Threadgroup Size is not supported on this device." Before that, the log output states: "Metal Shader Validation is unsupported for Simulator." However, when I stop the debugger and run the app in the simulator without the debugger attached, the app runs fine and does not crash. The SwiftUI Preview, which also triggers the compute shader when preparing data, runs fine without a crash as well. I can run and debug on a real device with no problem; I just don't have all sizes available. Is there anything I need to check in my lldb/simulator configuration? It obviously does work, so is it just the debugger that cannot deal with it? Any input would be welcome, as this really slows me down: I have to be extremely careful when debugging on the simulator.
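A commonly documented fallback for devices that reject non-uniform threadgroup sizes is to dispatch whole threadgroups (rounding the count up) and have the kernel return early for threads that fall outside the grid. The sketch below shows only the arithmetic in plain C++; `Size3` is a hypothetical stand-in for `MTLSize`, and `threadgroupsToCover` and `insideGrid` are illustrative names, not Metal API. Whether the Simulator needs this fallback in the poster's configuration is not established here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MTLSize, to keep the sketch free of Metal headers.
struct Size3 { std::uint32_t width, height, depth; };

// Ceiling division written so that it cannot overflow.
constexpr std::uint32_t ceilDiv(std::uint32_t n, std::uint32_t d) {
    return n / d + (n % d != 0 ? 1u : 0u);
}

// Number of uniform threadgroups needed to cover the whole grid.
inline Size3 threadgroupsToCover(Size3 grid, Size3 threadsPerGroup) {
    assert(threadsPerGroup.width > 0 && threadsPerGroup.height > 0 &&
           threadsPerGroup.depth > 0);
    return {ceilDiv(grid.width,  threadsPerGroup.width),
            ceilDiv(grid.height, threadsPerGroup.height),
            ceilDiv(grid.depth,  threadsPerGroup.depth)};
}

// The guard the kernel would apply: with rounded-up groups, some threads
// land past the grid edge and must do nothing.
inline bool insideGrid(Size3 grid, std::uint32_t x, std::uint32_t y,
                       std::uint32_t z) {
    return x < grid.width && y < grid.height && z < grid.depth;
}
```

For a 1000 x 600 grid with 16 x 16 threadgroups this yields 63 x 38 groups, so the last column and row of groups are partly out of bounds and rely on the guard.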
2
0
450
Feb ’25
Is Metal usable from Swift 6?
Hello ladies and gentlemen, I'm writing a simple renderer on the main actor using Metal and Swift 6. I am at the stage now where I want to create a render pipeline state using the asynchronous API:
    @MainActor
    class Renderer {
        let opaqueMeshRPS: MTLRenderPipelineState

        init(/*...*/) async throws {
            let descriptor = MTLRenderPipelineDescriptor()
            // ...
            opaqueMeshRPS = try await device.makeRenderPipelineState(descriptor: descriptor)
        }
    }
I get a compilation error if I try to use the asynchronous version of the makeRenderPipelineState method: "Non-sendable type 'any MTLRenderPipelineState' returned by implicitly asynchronous call to nonisolated function cannot cross actor boundary". This is understandable, since MTLRenderPipelineState is not Sendable. But it looks like no matter where or how I try to access this method, I just can't do it: the API exists, but you can't use it, and only the synchronous versions work. Am I missing something, or is Metal just not usable with Swift 6 right now?
1
0
511
Feb ’25
Implementing Scalable Order-Independent Transparency (OIT) in Metal
Hi, Apple’s documentation on Order-Independent Transparency (OIT) describes an approach using image blocks, where an array of size 4 is allocated per fragment to store depth and color in a tile shading compute pass. However, when increasing the scene’s depth complexity by adding more overlapping quads, the OIT implementation fails due to the fixed array size. Is there a way to dynamically allocate storage for fragments based on actual depth complexity encountered during rasterization, rather than using a fixed-size array? Specifically, can an adaptive array of fragments be maintained and sorted by depth, where the size grows as needed instead of being limited to 4 entries? Any insights or alternative approaches would be greatly appreciated. Thank you!
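One family of alternatives keeps the fixed-size array but merges fragments on overflow instead of failing. The sketch below models that idea on the CPU in plain C++. It is an approximation technique, not the method from Apple's sample, and `FragmentList`, `kCapacity`, and `over` are hypothetical names chosen for illustration. Colors are assumed to be premultiplied by alpha.

```cpp
#include <array>
#include <cassert>
#include <utility>

// Premultiplied-alpha color.
struct Color { float r, g, b, a; };

// Porter-Duff "over": composites `front` on top of `back`.
inline Color over(Color front, Color back) {
    const float k = 1.0f - front.a;
    return {front.r + k * back.r, front.g + k * back.g,
            front.b + k * back.b, front.a + k * back.a};
}

struct Fragment { float depth; Color color; };

// Mirrors the array size of 4 mentioned in the post (assumption).
constexpr int kCapacity = 4;

struct FragmentList {
    std::array<Fragment, kCapacity> items{};
    int count = 0;

    // Keeps items sorted nearest-first. When the list is full, the farthest
    // fragment is composited behind the last stored one instead of dropped.
    void insert(Fragment incoming) {
        for (int i = 0; i < count; ++i) {
            if (incoming.depth < items[i].depth) std::swap(incoming, items[i]);
        }
        // `incoming` now holds the farthest of all fragments seen so far.
        if (count < kCapacity) {
            items[count++] = incoming;
        } else {
            items[kCapacity - 1].color =
                over(items[kCapacity - 1].color, incoming.color);
        }
    }

    // Composites the stored fragments front to back over the background.
    Color resolve(Color background) const {
        assert(count >= 0 && count <= kCapacity);
        Color result{0.0f, 0.0f, 0.0f, 0.0f};
        for (int i = 0; i < count; ++i) result = over(result, items[i].color);
        return over(result, background);
    }
};
```

Because "over" is associative, merging the two farthest fragments loses nothing by itself. The error appears only when a later fragment arrives with a depth between the two that were merged, since the merged entry keeps the nearer depth. Storage stays bounded regardless of depth complexity, which is the trade being made against exactness.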
1
0
474
Feb ’25
Why is the depth/stencil buffer loaded/stored twice in Xcode GPU capture?
I used Xcode GPU capture to profile the bandwidth of my game's render pipeline. I found that the depth buffer and stencil buffer share the same buffer, whose format is Depth32Float_Stencil8. But why was this buffer loaded twice in a single pass of the pipeline, so that the Load Attachment Size in Encoder Statistics was doubled? Is this a bug in Xcode GPU capture, or did the pass really load the buffer twice?
1
0
314
Feb ’25
Metal calls hanging/stuck if app is started quickly after login
Our app uses Metal for image processing. We have found that if our app (and its potentially intensive image processing) is started shortly after the user logs in, calls to Metal may hang for a good while. For example, something that usually takes 3-5 seconds can take 1-2 minutes! The Metal threads are just hanging in a memmove... In Activity Monitor we see that a lot is happening right after login, but we don't know why Metal calls block for so long. The workaround is to wait a minute before starting our app and its intensive Metal image processing, but that is hard to explain to end users. It doesn't happen on all computers, but it is fairly easy to reproduce on some. We are using macOS 15.3.1 on M1/M3 Max. Any good ideas for how to proceed with this problem and possibly reach out to Apple engineers? Thanks! :)
2
0
371
Feb ’25
Black Screen in GPTK – DX 12.1 / Shader Model 6.5 Issue?
Hey everyone, I’m trying to run Kingdom Come: Deliverance 2 using the Game Porting Toolkit, but I’m encountering a black screen when launching the game. From what I know about the game’s requirements, it might be using Shader Model 6.5, which supports advanced features like DirectX Raytracing (DXR) Tier 1.1. This leads me to suspect that the issue could be related to missing support for DirectX 12.1 Features or Shader Model 6.5 in GPTK. Does anyone know if these features are currently supported by GPTK? If not, are there any plans to implement them in future updates? Alternatively, is there any workaround for games that rely on Shader Model 6.5 and ray tracing? Thanks a lot for your help!
2
0
515
Feb ’25
Xcode Metal geometry inspector uses wrong NDC space?
When inspecting the geometry in Xcode's Metal debugger, I noticed that the displayed "frustum box" didn't make sense. Since Metal uses a depth range of 0 to 1 in NDC space, I would expect a vertex that is projected to z:0 to lie on the front clipping plane of the frustum shown in the geometry inspector. This is not the case, however. A vertex with NDC z:0 is shown halfway inside the frustum. Vertices with NDC z less than 0 are correctly culled during rendering, while the geometry inspector's frustum shows the vertex as still inside the frustum. The image shows vertices that are visually in the middle of the frustum on the z axis, but at the same time the out position shows that they are projected to z:0. How is this possible, unless there's a bug in the geometry inspector?
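The observation is at least numerically consistent with a visualization that lays out depth over [-1, 1] (the OpenGL convention) rather than [0, 1] (the Metal convention). The small C++ sketch below shows that arithmetic. It is not a statement about how the geometry inspector is actually implemented, and `glDepthToMetal` and `fractionAlongFrustum` are hypothetical helper names.

```cpp
#include <cassert>

// Remaps an OpenGL-style NDC depth in [-1, 1] to a Metal-style depth in [0, 1].
inline float glDepthToMetal(float zGL) {
    return 0.5f * (zGL + 1.0f);
}

// Where a depth value sits between the near plane (0.0) and the far plane
// (1.0) of a drawn frustum box, for a viewer that assumes NDC depth spans
// [zNear, zFar].
inline float fractionAlongFrustum(float z, float zNear, float zFar) {
    assert(zFar > zNear);
    return (z - zNear) / (zFar - zNear);
}
```

With the [0, 1] range, `fractionAlongFrustum(0.0f, 0.0f, 1.0f)` places z:0 on the near plane, while with the [-1, 1] range, `fractionAlongFrustum(0.0f, -1.0f, 1.0f)` places the same value at the midpoint, which matches what the post describes.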
1
0
451
Feb ’25
Concurrent conflicting texture writes
Hello! I need to "draw" a set of particles into a texture. It would be trivial in a render encoder, of course. However, I would like to implement the task in a compute kernel. Every particle draw operation is expected to set 5 texels: the "center" one and its left/right/upper/lower neighbors. Particles can and will overlap, so concurrent draws are to be expected. I tried using texture atomics, atomic_store() to be more precise. This worked, albeit pretty slowly, too slowly for my purpose. Just to test what would happen, I tried using a normal texture write(). I was expecting to see some kind of visual artifacts, but to my surprise, it worked very well (and much faster). My question: is it safe? I understand that calling write() doesn't guarantee any ordering of the operations, so if multiple threads write to the same texel, the final value may come from any of those threads. But suppose all the threads were to write the very same color? Can I assume that the texel in question will have said color after the compute kernel finishes? I am using an M2 Pro MacBook, but ideally I would love to get the answer for all Apple silicon devices. My texture format is R32Int (so as to be able to use atomics), but I could do with any single-channel format, since the texture serves as a binary mask of sorts. Thanks!
0
0
354
Feb ’25
Learn Metal
I am interested in learning the Metal framework for rendering development. However, most of Apple’s official documentation uses Objective-C code. Therefore, I am seeking guidance on whether it is more advantageous for me to focus solely on learning Swift to gain proficiency in Metal.
2
0
708
Jan ’25