Render efficiently to multiple layers or viewports.
Sometimes you need to generate multiple primitives from a single set of input data. For example, if you are implementing a graphics technique like cascaded shadow maps, you might render the same model objects multiple times, once for each cascade level. Although you could do so with multiple render passes, that approach requires you to encode the same drawing commands for each render pass, and may require the GPU to fetch the input data from memory multiple times.
Using vertex amplification, you create drawing commands that generate multiple vertex streams from your input data.
When the GPU executes a command with vertex amplification, it sends multiple primitives to the rasterizer. The GPU calls your vertex function multiple times, once per vertex for each output stream. However, if a calculation for a field can be shared because the calculation is the same on all of the vertex outputs, the GPU calculates the value only once and shares it, reducing the GPU's workload. When you write your vertex function, the compiler automatically detects when custom values must be calculated separately, but you can also explicitly mark calculations as shared.
Vertex amplification is usually used in conjunction with layered rendering or rendering to multiple viewports, so that the GPU renders each output primitive to a different texture layer or viewport. For more information, see Rendering to Multiple Texture Slices in a Draw Command or Rendering to Multiple Viewports in a Draw Command.
Check for Vertex Amplification Support
Not all GPUs support vertex amplification. Check for support by calling the supports method on a device object, passing in the number of output streams you want to create. If the device object can support that many streams, this method returns
Add Vertex Amplification to Your Vertex Shader
To implement vertex amplification:
Add the amplification attribute to a shader argument to get the number of requested output streams. You'll set this count when you encode a draw command, as shown below in Set Amplification Information Before Encoding a Draw Command.
Add the amplification attribute to a parameter to get the index of the output stream. The indices have values from
By default, the GPU calls your vertex function once, with an amplification count of 1 and an amplification index of
To customize the behavior for each output stream, pass in per-stream input data, and use the index of the output stream to select the data. For example, the following shader takes the count and index as inputs, as well as an array of projection matrices, one for each output stream. It uses the index to select which matrix to use for that output.
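The per-stream selection described above can be modeled on the CPU. The following is a minimal sketch in plain C++, not Metal Shading Language, and it is not the shader this article refers to. The names `Vec4`, `Mat4`, `transform`, `vertexForStream`, and `amplifyVertex` are hypothetical and exist only for illustration. The loop stands in for the GPU calling the vertex function once per output stream.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// CPU-side model only: plain C++, not Metal Shading Language.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // row-major: m[row][col]

// Multiply a 4x4 matrix by a column vector.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 out{};  // value-initialized to zeros
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            out[row] += m[row][col] * v[col];
    return out;
}

// Stand-in for a vertex function: `index` plays the role of the
// amplification index and selects that stream's projection matrix.
Vec4 vertexForStream(const Vec4& position, std::size_t index,
                     const std::vector<Mat4>& projections) {
    return transform(projections.at(index), position);
}

// Stand-in for the GPU: call the vertex function once per output stream.
// The number of matrices plays the role of the amplification count.
std::vector<Vec4> amplifyVertex(const Vec4& position,
                                const std::vector<Mat4>& projections) {
    std::vector<Vec4> outputs;
    for (std::size_t index = 0; index < projections.size(); ++index)
        outputs.push_back(vertexForStream(position, index, projections));
    return outputs;
}
```

Calling `amplifyVertex` with two matrices yields two output positions from one input vertex, one per stream.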
Determine Which Calculations Must Be Distinct
A major benefit of vertex amplification over instancing is how it optimizes the work you send to the GPU. The GPU reads the input stream only once, and performs calculations as needed to produce the output streams. If the compiler determines that a particular calculation is shared across the vertex output streams, it calculates that output once. In some cases, you need to explicitly mark values in your shader so that the compiler knows an output value is shared.
Your output vertex data always includes a field with the position attribute, and Metal always marks this data as nonshared. If you assign another built-in attribute to any field, Metal marks that field as shared.
The compiler marks other output values as shared only if it can prove that your shader computes the output values the same way for all output streams. For example, if the compiler encounters a calculated field that's dependent on the amplification ID, it marks that field as nonshared:
On the other hand, if the value is just copied from the input stream, the compiler marks the output as shared:
To explicitly tell the compiler to calculate an output field once, add the shared attribute to the field.
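To make the shared and nonshared distinction concrete, here is a rough CPU-side model in plain C++, not Metal Shading Language. It only counts how often each kind of field would be evaluated for one input vertex. The types `VertexOut` and `AmplifiedResult`, the function `runAmplified`, and the `perStreamOffsets` parameter are all hypothetical names chosen for this sketch.

```cpp
#include <cstddef>
#include <vector>

// CPU-side model only: plain C++, not Metal Shading Language.
struct VertexOut {
    float position;  // depends on per-stream data: nonshared
    float color;     // copied from the input: shared
};

struct AmplifiedResult {
    std::vector<VertexOut> outputs;
    int sharedEvaluations = 0;
    int perStreamEvaluations = 0;
};

// Process one input vertex for every output stream.
AmplifiedResult runAmplified(float inPosition, float inColor,
                             const std::vector<float>& perStreamOffsets) {
    AmplifiedResult result;
    // Shared field: evaluated once, then reused for every stream.
    const float color = inColor;
    result.sharedEvaluations = 1;
    for (float offset : perStreamOffsets) {
        // Nonshared field: depends on per-stream data, so it is
        // evaluated once per stream.
        result.outputs.push_back({inPosition + offset, color});
        ++result.perStreamEvaluations;
    }
    return result;
}
```

With two streams, this model evaluates the color once and the position twice, which mirrors the saving the section describes.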
Configure the Pipeline State Object for Vertex Amplification
When you create a render pipeline state object for your shaders, set the max property on the MTLRender to the maximum number of output streams that your pipeline can handle.
Set Amplification Information Before Encoding a Draw Command
To use vertex amplification in a draw command, call set before encoding the command, specifying the number of vertices to generate. The count must be less than or equal to the maximum value you set when you created the render pipeline.
In addition, because vertex amplification is almost always used to render to different layers or viewports, you typically must specify the index of the target for each output vertex. The render target and viewport array indices are always calculated once in the vertex shader (because they use a built-in attribute, as described above). However, you can modify the final indices for each output primitive by creating an array of offsets and passing it as the second parameter.
The following code creates two mappings and configures the draw call to use vertex amplification:
The following vertex shader sets the viewport array index to 1. After the GPU runs your shader, it adds the offsets provided above, so the primitive for the first output stream has a viewport array index of 2, and the second has a viewport array index of
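The offset arithmetic in this section can be sketched on the CPU as follows. This is plain C++, not the Metal API: `ViewMapping` is a hypothetical stand-in for one per-stream mapping entry, and `finalViewportIndices` is an illustrative helper. The offsets in the usage note are assumed values. The first one is chosen to reproduce the index of 2 mentioned above, and the second is picked only for illustration.

```cpp
#include <cstdint>
#include <vector>

// CPU-side model only: plain C++, not the Metal API.
// Hypothetical stand-in for one per-stream mapping entry.
struct ViewMapping {
    std::uint32_t viewportArrayIndexOffset;
    std::uint32_t renderTargetArrayIndexOffset;
};

// The shader writes a single viewport index. Each output stream's
// final index is that value plus the stream's offset.
std::vector<std::uint32_t> finalViewportIndices(
        std::uint32_t shaderViewportIndex,
        const std::vector<ViewMapping>& mappings) {
    std::vector<std::uint32_t> indices;
    for (const ViewMapping& mapping : mappings)
        indices.push_back(shaderViewportIndex + mapping.viewportArrayIndexOffset);
    return indices;
}
```

For a shader index of 1 and assumed viewport offsets of 1 and 2, the helper returns one final index per stream, each equal to the shader's value plus that stream's offset.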
Combine Vertex Amplification with Instancing
Primitive instancing is another way to generate multiple vertex output streams from a single stream of input data. You provide shared vertex data and data that specifies how you want to render each instance of the model. For example, you might use a single set of model data, but provide different pose data to animate each version of the model separately.
When you execute a draw call with an instance count of 10, the GPU generates ten output streams. Primitive instancing, unlike vertex amplification, recalculates all of the vertex outputs for each call to the vertex function.
You can combine vertex amplification and primitive instancing safely and easily. Use this combination to separate instancing concepts (such as the number of characters in a scene) from rendering concepts (such as the distinction between shadow map targets). Metal generates a number of output streams equal to the product of the vertex amplification count and the instance count. For example, if you execute a draw call with a vertex amplification count of 2 and an instance count of 10, the GPU calls your vertex function 20 times, twice for each instance. It calculates the shared output values from vertex amplification once per instance.
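The call counts in that example can be checked with a small CPU-side model. This is a sketch in plain C++ for a single input vertex, not Metal code, and `DrawStats` and `simulateDraw` are hypothetical names. It assumes, as the text states, that shared outputs are evaluated once per instance and the vertex function runs once per stream for each instance.

```cpp
#include <cstddef>

// CPU-side model only: plain C++, not Metal code.
struct DrawStats {
    std::size_t vertexFunctionCalls = 0;
    std::size_t sharedEvaluations = 0;
};

// Count the work for one input vertex across all instances and streams.
DrawStats simulateDraw(std::size_t instanceCount,
                       std::size_t amplificationCount) {
    DrawStats stats;
    for (std::size_t instance = 0; instance < instanceCount; ++instance) {
        ++stats.sharedEvaluations;  // shared outputs: once per instance
        for (std::size_t stream = 0; stream < amplificationCount; ++stream)
            ++stats.vertexFunctionCalls;  // once per stream, per instance
    }
    return stats;
}
```

With an instance count of 10 and an amplification count of 2, the model reports 20 vertex function calls and 10 shared evaluations.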