A Brief Tour of Metal

Learn the basics of GPU programming in Metal.


Metal provides low-level and low-overhead access to the graphics processing unit (GPU). The key to developing a great Metal app that uses the GPU efficiently is to understand the underlying software and hardware interactions holistically.

GPU Programming Basics

The GPU is a powerful hardware unit that contains many processing cores—far more than the CPU contains. Individually, each GPU core has fewer processing capabilities than a CPU core. However, the GPU can collectively process vast amounts of data by distributing workloads among its multiple cores.

These fundamental hardware differences make each processor more suitable for different tasks. The GPU is particularly well suited for massive parallel-processing tasks, such as rendering thousands of pixels in a frame or performing the same computation on thousands of elements in an array.

The following example demonstrates a simple GPU function (compute_function). This function adds the elements of two input buffers (inputA, inputB) and stores the result in a single output buffer (outputC).

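#include <metal_stdlib>
using namespace metal;

// Each GPU thread adds one pair of elements and writes one result.
kernel void
compute_function(constant float *inputA  [[buffer(0)]],
                 constant float *inputB  [[buffer(1)]],
                 device   float *outputC [[buffer(2)]],
                 uint           index    [[thread_position_in_grid]])
{
    // index is this thread's position in the grid, so no loop is needed.
    outputC[index] = inputA[index] + inputB[index];
}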

On the CPU, this operation is typically written inside a loop and executed sequentially, one element at a time. On the GPU, each thread handles a single index, so the operation is distributed among the GPU's many cores and executed in parallel.
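For comparison, the sequential CPU version might look like the following minimal C++ sketch. The names compute_at and compute_on_cpu are hypothetical helpers introduced here for illustration; they are not part of Metal. The loop variable plays the role that thread_position_in_grid plays in the GPU function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Metal API): mirrors the body of compute_function
// by handling exactly one index.
inline void compute_at(const float *inputA, const float *inputB,
                       float *outputC, std::size_t index)
{
    outputC[index] = inputA[index] + inputB[index];
}

// Hypothetical CPU counterpart of compute_function: a single thread visits
// every index in order, where the GPU would assign each index to its own thread.
std::vector<float> compute_on_cpu(const std::vector<float> &inputA,
                                  const std::vector<float> &inputB)
{
    assert(inputA.size() == inputB.size());
    std::vector<float> outputC(inputA.size());
    for (std::size_t index = 0; index < outputC.size(); ++index) {
        compute_at(inputA.data(), inputB.data(), outputC.data(), index);
    }
    return outputC;
}
```

Calling compute_on_cpu with two equally sized vectors returns their element-wise sum. The per-index work is identical to the GPU function; only the way the indices are visited differs.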

Fundamental Metal Components

The previous example code is a complete Metal GPU function. You only need the Metal shading language to write GPU-executable functions, but you need the Metal framework to specify GPU-accessible resources and GPU-centric commands.

The following simple Metal system diagram illustrates the flow of data into and out of the GPU. Functions and resources are encoded into coalesced commands, which are then submitted to and executed by the GPU. The results from the GPU are rendered or written to another set of resources, which are optionally sent to the display.

Metal system diagram illustrating the flow of data into and out of the GPU. The fundamental Metal components illustrated are functions, resources, commands, the GPU, and the display.

For more information, see Fundamental Components.

See Also

First Steps

Devices and Commands

Demonstrates how to access and interact with the GPU.

Hello Triangle

Demonstrates how to render a simple 2D triangle.

About GPU Family 4

Learn about A11 features, including raster order groups, tile shaders, and imageblocks.