View the elapsed execution time of individual statements in your shader to understand where it spends the most time.
Using the shader profiler, you can prioritize your optimization efforts by focusing on your longest-running shader statements. On devices with a Family 4 or later GPU, a pie chart shows which GPU activity each statement spends the most time on, which provides additional hints about improving performance. With the Update Shaders feature, you can change your shader source code live and quickly see how the change affects performance.
Set Up Your Project to Enable the Shader Profiler
To use the shader profiler on your project, set up the .metallib file to allow for debugging:
In Xcode, navigate to your project's build settings.
For the Debug build configuration, set "Produce debugging information" to "Yes, include source code."
Use the shader profiler within a captured Metal frame. Most commonly, you capture a Metal frame by clicking the camera button on Xcode's debug bar, as covered in Performing a GPU Capture from the Debug Bar. For more ways to capture a Metal frame, see Metal GPU Capture.
From the captured frame, open the shader profiler using the steps in Figure 1:
In the Debug navigator, choose View Frame By Performance.
View your render pipelines populated in the list.
Observe the amount of time each one took during the frame.
Click the disclosure triangle to expand a shader and see the time taken by any inline functions it called. Figure 2 shows that the inline function sample took about 134 microseconds (about 42%) of the total time taken by fragment (about 318 microseconds).
Profile a Shader
Profile a shader using the following steps, and as annotated in Figure 3:
Expand the render pipeline.
Select the shader you want to profile.
View the shader source code in the center pane with the function entry point highlighted.
Examine the times and percentages column.
Because profiling is for performance tuning, most often you'll inspect the render pipeline and shader that took the longest to complete.
In the times and percentages column, the time marking the function entry point is the shader's total elapsed time. Inside the shader function, a percentage marks each statement and indicates the share of the shader's elapsed time that the statement took.
Interpret the GPU Activity Metrics
Next to the percentage of time taken, a pie chart details which activity the GPU is doing most during the statement.
Place your mouse pointer over the dot to bring up the pie chart, as shown in Figure 4.
A high percentage in one GPU activity can indicate a performance bottleneck and an opportunity for optimization. The following explanations describe each activity:
ALU
Time spent in the GPU's arithmetic logic unit. Changing floats to half floats where possible is one way to reduce time spent in the ALU. Another is to minimize complex instructions, like
Memory
Time spent waiting for access to your app's buffers or texture memory. You can shorten this time by down-sampling textures, or, if you're not spending much time in Memory, you could improve your texture resolution instead.
Control Flow
Time spent in conditional, increment, or jump instructions as a result of branches or loops in your shader. Use a constant iteration count to minimize Control Flow time for loops because the
Synchronization
Time spent waiting for a required resource or event before execution could begin. Synchronization types are described below.
Synchronization (wait memory)
Waiting for dependent memory accesses issued in prior instructions, such as texture sampling or buffer read/write.
Synchronization (wait pixel)
Waiting for underlying pixels to release resources. In addition to color attachments, pixels could be from depth or stencil buffers or user-defined resources. Blending is a common cause of pixel waiting. Use raster order groups to reduce time spent waiting for pixels.
Synchronization (barrier)
The thread reached a barrier and waits for the remaining threads in the same group to arrive at the barrier before proceeding.
Synchronization (atomics)
Time spent on atomic instructions.
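To illustrate the guidance above about using a constant iteration count, the following sketch contrasts a loop whose trip count is a compile-time constant with one whose count is known only at run time. It is written as plain host C++ rather than Metal Shading Language (which is based on C++), and the names kTapCount, blurFixed, and blurDynamic are hypothetical. Whether a particular compiler unrolls the first loop is not guaranteed; the point is only that a constant count makes unrolling possible.

```cpp
#include <array>
#include <cstddef>

// Assumed tap count, chosen for illustration only.
constexpr std::size_t kTapCount = 4;

// Trip count known at compile time: the compiler is free to unroll the
// loop and remove the per-iteration compare, increment, and jump.
float blurFixed(const std::array<float, kTapCount>& taps,
                const std::array<float, kTapCount>& weights) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < kTapCount; ++i) {
        sum += taps[i] * weights[i];
    }
    return sum;
}

// Trip count known only at run time: every iteration needs a compare,
// an increment, and a jump, which is the work Control Flow time reflects.
float blurDynamic(const float* taps, const float* weights, std::size_t count) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        sum += taps[i] * weights[i];
    }
    return sum;
}
```

Both functions compute the same weighted sum; only the form of the loop bound differs. After making a comparable change in a shader, re-profile to confirm that the Control Flow share actually drops.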
Update Shaders Live
After making a change to a shader, you can apply the update live using the Update Shaders button highlighted in Figure 5.
The Update Shaders button applies the source code changes you make to the same captured Metal frame. The update is reflected as follows:
The application window is redrawn.
Elapsed time and percentage metrics are recalculated.
Attachments in the Assistant Editor are redrawn.
Because Update Shaders maintains your view in the captured Metal frame, you can easily make successive changes to your shader source code for iterative optimization.