Bring your game to Mac, Part 2: Compile your shaders
Discover how the Metal shader converter streamlines the process of bringing your HLSL shaders to Metal as we continue our three-part series on bringing your game to Mac. Find out how to build a fast, end-to-end shader pipeline from DXIL that supports all shader stages and allows you to leverage the advanced features of Apple GPUs. We'll also show you how to reduce app launch time and stutters by generating GPU binaries with the offline compiler.
To get the most out of this session, we recommend first watching “Bring your game to Mac, Part 1: Make a game plan." And once you're ready to level up, check out “Bring your game to Mac, Part 2: Render with Metal" from WWDC23.
♪ ♪ Hello, and welcome. I'm Varun Subramanian, an Engineer in GPU, Graphics, and Display Software group at Apple.
This session is the second part of a 3-part series on bringing your high-end games to Mac. The first session, covers how you can evaluate your game and "make a game plan". This session focuses on your shaders and how to improve their flexibility and speed with some of the new Metal compiler tools, including new ways to convert your shaders to Metal intermediate representation and how to avoid on-device compilation by finalizing GPU binaries during your games' build time. The Metal compiler toolchain helps you compile the shaders that power your game and Metal now makes that easier than ever.
Creating the Metal IR on device is suboptimal because it adds compilation overhead before the GPU can do the work you need it to do. Metal provides you with tools necessary to generate Metal IR from your Metal Shading Language ahead of time. That Metal IR is stored as a part of a Metal library. You should always aim to generate your Metal libraries using the Metal compiler toolchain ahead of time.
However, when coming from another API and shading language, you need a way to get them to Metal. If you are a new comer to the Mac, you now have Metal Shader Converter. This simplifies your shader pipelines and allows you to package the generated Metal libraries directly in your bundle, avoiding on-device generation of Metal IR. The Metal libraries generated are the same as those generated from the Metal compiler allowing your converted shaders to natively integrate with the Metal API. Use the new tool to convert your existing shaders to Metal libraries that you then ship with your game. Metal Shader Converter provides a robust feature set to improve the experience of converting your shaders to Metal. It consumes DXIL to produce Metal IR. You use it alongside the open source DXC compiler tool to build an end-to-end shader pipeline. Converting from DXIL to Metal IR is very fast because Metal shader converter performs the conversion at the binary level. As a result you will have reduced shader asset build times. It also enables you to leverage advanced features of Apple GPUs. You can do this because of the rich feature set of Metal shader converter, which supports all the traditional and modern shader stages of your existing DXIL shaders. Using Metal shader converter, you can convert your shaders for the traditional graphics pipeline including tessellation and geometry shaders to Metal libraries. It also supports compute shaders, as well as more recently introduced ray tracing stages and shaders and amplification and mesh shaders. Now, I'll walk you through how to use the Metal shader converter.
There are two scenarios where you may want to convert your shaders via command line. Using the command line tool via terminal is a good mechanism to convert one shader at a time. If you have multiple shaders, you can create a shell script that calls Metal shader converter to transform multiple shaders for you automatically. Converting your shader using the command line tool is very easy. After you set up DXC and shader converter, start by compiling your HLSL shader to DXIL. DXC requires you to specify the entry point to compile, the type of shader, and the output file. Next, call shader converter on the DXIL file just created and specify the output Metal library to create. By default, shader converter generates a Metal library for the latest version of macOS, as well as JSON file with useful reflection data. At runtime, you pass this Metal library to the Metal device to load it and build pipeline state objects. There are two other scenarios where the command line interface may not be the best choice for your workflow. Some game engines have custom asset build programs that compile and package shaders into game-specific formats. Also, in some situations, as you bootstrap your game for Metal, you may also want to see how well your shaders are working on the platform before you fully turn them into Metal libraries ahead of time. For these last two cases, you need a way to better integrate Metal shader converter into your workflow.
To accomplish this, use the Metal shader converter dynamic library. It exposes all the same functionality as the CLI tool to help you generate Metal libraries. The library offers a pure C interface and just like the CLI, it is available on both macOS and Windows, so it's easy to integrate into your existing workflows. After you've converted your shaders to Metal IR, to integrate them into your game you create pipeline states and bind resources to them.
In your shaders, you typically define resources as global variables and assign "register" declarations to them. From the API side, your game either binds resources directly to these slots, or defines an explicit memory layout via "root signatures". Shader converter can help you bring this model over, because Metal has a very flexible binding model. The tool lays out these resources into Argument buffers. In this model, you bind one argument buffer directly to your pipeline and reference your resources through it. There are two layout modes for this "top level" Argument buffer that you can choose from, to best suit your game.
The simplest layout you can create is an automatic one, where shader converter places your resources one after the other. Once you create a pipeline state containing your shader, you bind a single argument buffer, and through it you reference all your resources. Alternatively, shader converter supports explicitly defining a layout that matches your root signature. Use this mode when your game needs to specify separate textures and samplers into their own resource tables or if your game uses bindless resources. You may also embed raw buffers and 32-bit constants directly into the top-level argument buffer, shown as 0's and 1's in this diagram.
Now, the top-level Argument Buffer is a resource shared between the CPU and the GPU, so as you write into it, you need to coordinate access to its memory to avoid a race condition that may cause visual corruption. You don't need to serialize CPU and GPU work to avoid this race condition. One way to avoid this is to use a bump allocator. This can be a large Metal buffer from which you sub-allocate different resources each frame. You then shadow the backing buffer for each frame in flight that your game handles. For more details on bump allocator implementation, check out our sample code. For best Argument buffer management practices, check out the bindless session from last year and Metal documentation. The binding model is not the only place where Metal shader converter can help you bring your shaders to the Mac. Mapping certain shader stages can be challenging due to differences in graphics APIs. For example, you may have pipelines that leverage traditional geometry and tessellation stages. Metal is a modern API, and it offers features such as viewport ID and amplification that makes older, less efficient stages from other graphics APIs unnecessary.
However, when your game relies on these pipelines for traditional effects that enhance some surfaces, like the grass rendered in this image, converting them by hand is costly. Metal shader converter helps you bring these pipelines to Metal, by mapping them to Mesh Shaders, a modern and more efficient graphics API construct.
The tool does the heavy lifting to bring these complex pipelines over to Metal easily, by mapping each stage to a Metal IR representation. This includes the tessellator, which is traditionally a fixed function operation. To support this workflow, this year Metal adds the capability of linking visible functions to the "object and mesh" stages of Mesh shaders. After you've compiled your shaders, you use them to build a "Metal Mesh RenderPipeline Descriptor" and compile it into a "Metal Render Pipeline state". When Metal receives the request to build this pipeline state, it compiles and links all the Metal IR, baking all functions into a single pipeline, completely avoiding function call overhead and maximizing performance during runtime. Notice the power and flexibility of Metal visible functions that allows you to build this elaborate render pipeline containing these shading stages with their supplemental functions. While building these mesh pipelines is straightforward, every pipeline has to follow a series of steps in precise order.
The shader converter runtime helps you build these complex pipelines. It even emulates draw calls by dispatching mesh shading work. For more information, please consult the Metal Shader Converter documentation. Now that your shaders are on Metal and you are running your pipeline states, here are some tips to help you get great performance and visual correctness.
Shaders compiled with shader converter reference Metal resources indirectly. To flag resource residency to Metal, you would call "useResource". However, useResource is an expensive call when used in excess. Use the plural useResources to provide several resources at once, or consider using Metal heaps via useHeap to flag residency of several resources in a single call. The pipeline objects are cached when Metal compiles them for the first time, automatically reducing compilation-based hitching on subsequent runs of your game. Binary archives can also help you here. To get more out of the GPU and customize the Metal IR for higher performance, shader converter provides you options. There are customizations on compatibility, GPU family, vertex fetch behavior, entry point naming, reflection, and more. Here's one additional optimization opportunity.
I mentioned earlier that shader converter joins the Metal compiler as another mechanism to produce "Metal Libraries" from your existing shader IR. Metal uses these to feed various graphics pipeline stages. Since everything is Metal IR, you can mix and match "Metal Libraries" coming from Metal shader converter and from the Metal Compiler in a single app and even in a single pipeline. Metal shading language also enables you to access unique features like programmable blending. Use this approach to make the most out of Apple's GPUs. You can even take advantage of unique shading functionality such as tile shading. This grants you tremendous flexibility in how you bring your game to Metal. Performance is important but visual correctness of your game is paramount. HLSL allows seamlessly treating textures as arrays of one element.
To bring over shaders that rely on this behavior, create your textures as texture arrays, or create "texture array views" on your textures. If you are using "MetalKit Texture Loader", it can also help you load files as texture arrays. To set up your sampler objects and read from these textures, make sure to let Metal know in advance that you intend to reference samplers in argument buffers by using the property supportsArgumentBuffers in the MTL Sampler Descriptor. Now that you are familiar with Integrating shader converter in your workflows, here's how to get it. You can download Metal shader converter from developer.apple.com. If you are working on your Mac, get the Metal shader converter for Mac package. If you are working on Windows, it is part of the Metal Developer Tools for Windows package. The beta version of the tool is available now. Both packages contain Metal shader converter, in standalone and library form, as well as the runtime companion header. Full documentation as well as a Metal C++ code sample are available now. Use the sample code to explore geometry and tessellation emulation, instanced drawing, and compute shaders. Converting your shaders to Metal libraries that you ship with your game helps you avoid generating those libraries at game run-time. There is one additional optimization you may be able to perform, which is to compile your GPU binaries ahead of time as well. When you build your game, you compile your shaders into Metal libraries which still need to be finalized into GPU binaries. Usually your game does this at launch, resulting in longer loading screens. If you defer finalizing GPU binaries at runtime, it may result in frame drops as the game compiles new pipelines on demand. The Metal GPU Binary compiler can help you solve this by allowing you to generate your shader binaries at game build time. By removing the need to generate shader binaries during gameplay, your players benefit from reduced app load time without incurring in additional GPU hitches.
To take advantage of this, you can add another step to your workflow to finalize your Metal libraries into Metal binary archives at build time. On device GPU Binary compilation happens when you create a pipeline state from a descriptor. This descriptor not only references functions from a Metal library, but it also provides other critical information to Metal such as the color format of its render attachments and the vertex layout descriptor. The GPU binaries are generated just-in-time as part of the PSO creation. Binary archives allow you to take control of when that compilation occurs. In order to produce GPU binaries ahead of time, you provide both your existing Metal Library as well as a pipeline configuration script referencing those libraries. You then provide both of them to metal-tt, producing a binary archive with GPU binaries. To develop the pipeline script, you produce a JSON script with pipeline configurations similar to the Metal API. This Metal code generates a render pipeline descriptor and alongside is its JSON equivalent representation. For your pipeline script, add the Metal library path as well as its fragment and vertex function names. You also specify any other pipeline state configuration. That's it, you now have a Metal script that you can use. You can find additional info about the JSON schema in Metal's developer documentation. Your ahead-of-time shader compilation workflow may not be geared to generate pipeline script files. For these cases, there's an alternative way to produce them.
You can record Metal binary archives while running your game on device. These archives include the corresponding pipeline scripts. If you harvest these archives from device, you can then use "metal-source" to extract their embedded pipeline scripts. You then update the paths to your Metal libraries in the extracted scripts. For more information, please refer to our talks on how you can "Build GPU binaries" as well as "Discover compilation workflows". Because GPU binaries are tailored to each GPU, "metal-tt" produces different versions of the binaries for you to distribute to your players, based on their device. Metal-tt helps you manage this complexity by encapsulating all the different GPU binaries neatly into Metal binary archives. This way, when your app loads that binary archive, Metal automatically picks the appropriate binary for your players. You can also encapsulate multiple sets of binaries into a single binary archive.
Now that you are able to produce binary archives ahead of time, here are some best practices. When your players run a Metal app with pre-compiled GPU Binaries, Metal searches the packaged binary archive for the necessary GPU binary. If Metal finds no match in the archive, it automatically falls back to on-device compilation. Your app will still look correct, but this may delay your submissions to the GPU. You can test if your binary archives contain the pipelines you expect, using the option "FailOnBinaryArchiveMiss". You can easily specify the FailOnBinaryArchiveMiss option when creating a Metal pipeline state object. In case of a binary archive miss, when you have this option set, Metal skips the on-device compilation and returns a nil pipeline state. Once your binary archives are ready with support for all target devices, you're ready for deployment. Not all of your players may be on the latest OS. To ensure all users benefit from binary archives, generate an archive for each major OS version and store it into your app. To accomplish this, check the OS version of your player's device and select the appropriate binary archive to associate with your Pipeline Descriptors. When your players update their OS, their binary archives may require recompilation for forward compatibility, but Metal has you covered. Metal identifies the unpackaged binary archives in app bundle on your player's device and upgrades them automatically in the background after an OS update or game install. In summary, "metal" compiler and "metal-shaderconverter" are your go-to tools, to produce Metal libraries ahead of time that you can then ship with your game. Use the "metal compiler" when compiling MSL source, and "metal-shaderconverter" when your shaders are in HLSL. metal-tt enables you to finalize Metal libraries into GPU binaries, tailored to the various GPUs in the Metal ecosystem. Finally, metal-source helps you harvest pipeline scripts from your existing MacOS games. The vast majority of these tools, as well as the rest of the GPU binary compiler toolchain, now support Windows, in addition to macOS, making it easier than ever to integrate them into your existing workflows. To wrap up: Metal Shader Converter is a new tool to help you bring your shaders that were developed in another shading language to Metal. The GPU binary compiler and its toolchain, which is now available on Windows, can finalize your Metal libraries into GPU binaries. With these tools, you now have everything you need to bring your shaders to Metal. There's still more to share with you. Part 3 of the series focuses on optimizing a high-end Metal application. Be sure to check it out. Thanks for watching. ♪ ♪