What's New in iOS 9 and OS X 10.11

This chapter summarizes the new features introduced in iOS 9 and OS X 10.11.

Feature Sets

Every device that supports Metal supports at least one of the feature set values defined by the MTLFeatureSet enum, shown in Listing 10-1.

Listing 10-1  Metal Feature Sets

typedef NS_ENUM(NSUInteger, MTLFeatureSet)
{
    MTLFeatureSet_iOS_GPUFamily1_v1 = 0,
    MTLFeatureSet_iOS_GPUFamily2_v1 = 1,
    MTLFeatureSet_iOS_GPUFamily1_v2 = 2,
    MTLFeatureSet_iOS_GPUFamily2_v2 = 3,
    MTLFeatureSet_iOS_GPUFamily3_v1 = 4,
    MTLFeatureSet_OSX_GPUFamily1_v1 = 10000
};

All OS X devices that support Metal support the OSX_GPUFamily1_v1 feature set.

iOS devices that support Metal support a feature set determined by their GPU and OS versions. See the MTLFeatureSet reference and iOS Device Compatibility Reference for more information.

To find out which feature set is supported by a device, query the supportsFeatureSet: method of a MTLDevice object.
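
For example, a minimal sketch of such a query (the chosen feature set and the fallback path are illustrative):

id <MTLDevice> device = MTLCreateSystemDefaultDevice();

// Check whether the device supports the iOS GPU family 3, version 1 feature set.
if ([device supportsFeatureSet:MTLFeatureSet_iOS_GPUFamily3_v1]) {
    // Enable code paths that rely on family 3 capabilities.
} else {
    // Fall back to a baseline code path.
}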

Device Selection

Use the MTLCreateSystemDefaultDevice function to obtain the preferred GPU for your app. Systems that support the OSX_GPUFamily1_v1 feature set may contain multiple GPUs; call the MTLCopyAllDevices function to obtain all of them. To determine the characteristics of an individual GPU in a multi-GPU system, query its headless property to find out whether it is attached to a display, and query its lowPower property to identify the lower-power GPU in an automatic graphics switching configuration.

You can query specific render and compute characteristics with the new supportsTextureSampleCount: method and maxThreadsPerThreadgroup property added to the MTLDevice protocol.
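
The following sketch combines these queries; the preference for a low-power, display-attached GPU is only an example policy, and MTLCopyAllDevices is available only on OS X:

// Start with the system's preferred device.
id <MTLDevice> device = MTLCreateSystemDefaultDevice();

// On OS X, a system may expose several GPUs; prefer a low-power one attached to a display.
NSArray<id<MTLDevice>> *allDevices = MTLCopyAllDevices();
for (id <MTLDevice> candidate in allDevices) {
    if (candidate.lowPower && !candidate.headless) {
        device = candidate;
        break;
    }
}

// Query per-device render and compute characteristics.
BOOL supportsSampleCount8 = [device supportsTextureSampleCount:8];   // e.g. gate an 8x MSAA path
MTLSize maxThreads = device.maxThreadsPerThreadgroup;                // e.g. size compute threadgroups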

Resource Storage Modes and Device Memory Models

The OSX_GPUFamily1_v1 feature set includes support for managing resources on GPUs with discrete memory. You control resource memory allocation explicitly by selecting an appropriate storage mode: the values in Listing 10-2 apply to textures, and the values in Listing 10-3 apply to buffers.

Listing 10-2  Texture Storage Modes

typedef NS_ENUM(NSUInteger, MTLStorageMode)
{
    MTLStorageModeShared  = 0,
    MTLStorageModeManaged = 1,
    MTLStorageModePrivate = 2,
};

Listing 10-3  Buffer Storage Modes

#define MTLResourceStorageModeShift  4
 
typedef NS_ENUM(NSUInteger, MTLResourceOptions)
{
    MTLResourceStorageModeShared  = MTLStorageModeShared  << MTLResourceStorageModeShift,
    MTLResourceStorageModeManaged = MTLStorageModeManaged << MTLResourceStorageModeShift,
    MTLResourceStorageModePrivate = MTLStorageModePrivate << MTLResourceStorageModeShift,
};

iOS feature sets support only the shared and private storage modes. The three storage modes are described in the following subsections.

Shared

Resources allocated with the shared storage mode are stored in memory that is accessible to both the CPU and the GPU. On devices with discrete memory, these resources are accessed directly from CPU local memory rather than being copied to GPU memory.

In iOS feature sets, this is the default storage mode for textures. In the OSX_GPUFamily1_v1 feature set, textures cannot be allocated with the shared storage mode.

Private

Resources allocated with the private storage mode are stored in memory that is only accessible to the GPU.

Buffers and textures allocated with the private storage mode can be accessed only by the GPU. The contents property of a private buffer returns NULL, and the CPU-access methods of MTLTexture, getBytes:bytesPerRow:fromRegion:mipmapLevel: and replaceRegion:mipmapLevel:withBytes:bytesPerRow: (and their slice-based variants), must not be called on a private texture.

To access a private resource, your app may perform one or more of the following actions:

  • Blit to or from the private resource.

  • Read from the private resource from any shader function.

  • Render to the private resource from a fragment shader.

  • Write to the private resource from a compute function.
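
For example, a common pattern is to populate a private buffer by blitting from a temporary shared buffer. The sketch below assumes that a device, a command queue, and the source data (vertexData, vertexDataLength) already exist:

// Shared staging buffer that the CPU fills directly.
id <MTLBuffer> staging = [device newBufferWithBytes:vertexData
                                             length:vertexDataLength
                                            options:MTLResourceStorageModeShared];

// Private buffer that only the GPU can access.
id <MTLBuffer> privateBuffer = [device newBufferWithLength:vertexDataLength
                                                   options:MTLResourceStorageModePrivate];

// Copy the data across with a blit command encoder.
id <MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder copyFromBuffer:staging
               sourceOffset:0
                   toBuffer:privateBuffer
          destinationOffset:0
                       size:vertexDataLength];
[blitEncoder endEncoding];
[commandBuffer commit];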

Managed

On GPUs without discrete memory, managed resources have only a single memory allocation accessible to both the CPU and GPU. On GPUs with discrete memory, managed resources internally allocate both CPU-accessible and GPU-accessible memory.

Managed textures are not available in iOS feature sets; use MTLStorageModeShared instead.

Any time your app uses the CPU to directly modify the contents of a managed buffer, you must call the didModifyRange: method to notify Metal that the contents within the specified range have changed.

If you use the GPU to modify the contents of a managed resource and you wish to access the results with the CPU, then you must first synchronize the resource using the synchronizeResource: method or synchronizeTexture:slice:level: method.
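
The sketch below shows both directions of this synchronization for a managed buffer; the buffer, command queue, and MyConstants type are assumed to exist already:

// CPU writes: update the data, then tell Metal which range changed.
MyConstants *ptr = (MyConstants *)managedBuffer.contents;
ptr[0] = updatedConstants;
[managedBuffer didModifyRange:NSMakeRange(0, sizeof(MyConstants))];

// GPU writes: after GPU work modifies the buffer, synchronize it before reading on the CPU.
id <MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder synchronizeResource:managedBuffer];
[blitEncoder endEncoding];
[commandBuffer addCompletedHandler:^(id <MTLCommandBuffer> cb) {
    // It is now safe to read managedBuffer.contents on the CPU.
}];
[commandBuffer commit];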

Choosing a Resource Storage Mode

Generally, there are four main scenarios to consider when choosing a storage mode:

Setting and Querying a Resource Storage Mode

For textures, set the desired storage mode with the storageMode property of a MTLTextureDescriptor object. The default storage mode for textures is MTLStorageModeShared in iOS feature sets and MTLStorageModeManaged in the OSX_GPUFamily1_v1 feature set.

For buffers, set the desired storage mode by passing in the respective MTLResourceOptions value in any of the newBufferWithLength:options:, newBufferWithBytes:length:options:, or newBufferWithBytesNoCopy:length:options:deallocator: methods of MTLDevice. The default storage mode for buffers is MTLStorageModeShared, but OSX_GPUFamily1_v1 feature set apps may benefit from increased performance by explicitly managing their buffers with the managed or private storage modes.

The storage mode of a resource, for either a texture or a buffer, can be queried with the storageMode property of a MTLResource object.
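
For example, the following sketch creates a managed buffer and a private texture in an OSX_GPUFamily1_v1 app and then confirms their storage modes (the sizes and pixel format are illustrative):

// A managed buffer: a CPU-accessible allocation plus a GPU-accessible allocation on discrete GPUs.
id <MTLBuffer> managedBuffer = [device newBufferWithLength:1024
                                                   options:MTLResourceStorageModeManaged];

// A private texture: GPU-only, suitable for render targets the CPU never touches.
MTLTextureDescriptor *descriptor =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                        width:1024
                                                       height:1024
                                                    mipmapped:NO];
descriptor.storageMode = MTLStorageModePrivate;
id <MTLTexture> privateTexture = [device newTextureWithDescriptor:descriptor];

// Confirm the chosen storage modes.
NSAssert(managedBuffer.storageMode == MTLStorageModeManaged, @"unexpected buffer storage mode");
NSAssert(privateTexture.storageMode == MTLStorageModePrivate, @"unexpected texture storage mode");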

Textures

Hardware support for different texture types and pixel formats is a key capability difference between feature sets. This section lists the major texture additions to the framework; for a more detailed discussion, see the code listings and comparison tables in the MTLPixelFormat reference and Metal Feature Set Tables chapter.

Compressed Textures

The iOS_GPUFamily2_v1, iOS_GPUFamily2_v2, and iOS_GPUFamily3_v1 feature sets add support for ASTC textures. These pixel formats are listed in Listing 10-4.

Listing 10-4  ASTC Pixel Formats

typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // ASTC
    MTLPixelFormatASTC_4x4_sRGB      = 186,
    MTLPixelFormatASTC_5x4_sRGB      = 187,
    MTLPixelFormatASTC_5x5_sRGB      = 188,
    MTLPixelFormatASTC_6x5_sRGB      = 189,
    MTLPixelFormatASTC_6x6_sRGB      = 190,
    MTLPixelFormatASTC_8x5_sRGB      = 192,
    MTLPixelFormatASTC_8x6_sRGB      = 193,
    MTLPixelFormatASTC_8x8_sRGB      = 194,
    MTLPixelFormatASTC_10x5_sRGB     = 195,
    MTLPixelFormatASTC_10x6_sRGB     = 196,
    MTLPixelFormatASTC_10x8_sRGB     = 197,
    MTLPixelFormatASTC_10x10_sRGB    = 198,
    MTLPixelFormatASTC_12x10_sRGB    = 199,
    MTLPixelFormatASTC_12x12_sRGB    = 200,
    MTLPixelFormatASTC_4x4_LDR       = 204,
    MTLPixelFormatASTC_5x4_LDR       = 205,
    MTLPixelFormatASTC_5x5_LDR       = 206,
    MTLPixelFormatASTC_6x5_LDR       = 207,
    MTLPixelFormatASTC_6x6_LDR       = 208,
    MTLPixelFormatASTC_8x5_LDR       = 210,
    MTLPixelFormatASTC_8x6_LDR       = 211,
    MTLPixelFormatASTC_8x8_LDR       = 212,
    MTLPixelFormatASTC_10x5_LDR      = 213,
    MTLPixelFormatASTC_10x6_LDR      = 214,
    MTLPixelFormatASTC_10x8_LDR      = 215,
    MTLPixelFormatASTC_10x10_LDR     = 216,
    MTLPixelFormatASTC_12x10_LDR     = 217,
    MTLPixelFormatASTC_12x12_LDR     = 218,
};

The OSX_GPUFamily1_v1 feature set supports BC textures instead. The new pixel formats are listed in Listing 10-5.

Listing 10-5  BC Pixel Formats

typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // BC1, BC2, BC3 (aka S3TC/DXT)
    MTLPixelFormatBC1_RGBA           = 130,
    MTLPixelFormatBC1_RGBA_sRGB      = 131,
    MTLPixelFormatBC2_RGBA           = 132,
    MTLPixelFormatBC2_RGBA_sRGB      = 133,
    MTLPixelFormatBC3_RGBA           = 134,
    MTLPixelFormatBC3_RGBA_sRGB      = 135,
 
    // BC4, BC5 (aka RGTC)
    MTLPixelFormatBC4_RUnorm         = 140,
    MTLPixelFormatBC4_RSnorm         = 141,
    MTLPixelFormatBC5_RGUnorm        = 142,
    MTLPixelFormatBC5_RGSnorm        = 143,
 
    // BC6H, BC7 (aka BPTC)
    MTLPixelFormatBC6H_RGBFloat      = 150,
    MTLPixelFormatBC6H_RGBUfloat     = 151,
    MTLPixelFormatBC7_RGBAUnorm      = 152,
    MTLPixelFormatBC7_RGBAUnorm_sRGB = 153,
};

PVRTC Blit Operations

The MTLBlitCommandEncoder protocol contains a new MTLBlitOption value, MTLBlitOptionRowLinearPVRTC, that enables copying to or from a texture with a PVRTC pixel format. Two new copy methods, copyFromBuffer:sourceOffset:sourceBytesPerRow:sourceBytesPerImage:sourceSize:toTexture:destinationSlice:destinationLevel:destinationOrigin:options: and copyFromTexture:sourceSlice:sourceLevel:sourceOrigin:sourceSize:toBuffer:destinationOffset:destinationBytesPerRow:destinationBytesPerImage:options:, support these operations when you pass MTLBlitOptionRowLinearPVRTC in their options parameter.

When this option is specified, the PVRTC blocks in the buffer are treated as linearly arranged in row-major order, as they are for all other compressed texture formats. PVRTC pixel formats are available only in iOS feature sets.

Depth/Stencil Render Targets

The OSX_GPUFamily1_v1 feature set does not support separate depth and stencil render targets. If these render targets are needed, use one of the newly introduced depth/stencil pixel formats to set the same texture as both the depth and stencil render target. The combined depth/stencil pixel formats are listed in Listing 10-6.

Listing 10-6  Depth/Stencil Pixel Formats

typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // Depth/Stencil
    MTLPixelFormatDepth24Unorm_Stencil8 = 255,
    MTLPixelFormatDepth32Float_Stencil8 = 260,
};

All feature sets support the MTLPixelFormatDepth32Float_Stencil8 pixel format. Only some devices that support the OSX_GPUFamily1_v1 feature set also support the MTLPixelFormatDepth24Unorm_Stencil8 pixel format. Query the depth24Stencil8PixelFormatSupported property of a MTLDevice object to determine whether the pixel format is supported or not.

Textures with a depth, stencil, or depth/stencil pixel format can only be allocated with the private storage mode. To load or save the contents of these textures, you must perform a blit operation. The MTLBlitCommandEncoder protocol contains new MTLBlitOption values, MTLBlitOptionDepthAttachmentOnly and MTLBlitOptionStencilAttachmentOnly, that select which component of a combined depth/stencil texture a blit operation copies.

The same two buffer/texture copy methods described in the PVRTC section above support these operations through their options parameter.
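
As a rough sketch, the following copies only the depth component of a private MTLPixelFormatDepth32Float_Stencil8 texture into a shared buffer for CPU access; the assumption that the depth data reads back as one 32-bit float per pixel is illustrative, so verify the layout for your configuration:

NSUInteger bytesPerRow   = depthTexture.width * sizeof(float);   // assumes 32-bit float depth data
NSUInteger bytesPerImage = bytesPerRow * depthTexture.height;
id <MTLBuffer> readbackBuffer = [device newBufferWithLength:bytesPerImage
                                                    options:MTLResourceStorageModeShared];

id <MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder copyFromTexture:depthTexture
                 sourceSlice:0
                 sourceLevel:0
                sourceOrigin:MTLOriginMake(0, 0, 0)
                  sourceSize:MTLSizeMake(depthTexture.width, depthTexture.height, 1)
                    toBuffer:readbackBuffer
           destinationOffset:0
      destinationBytesPerRow:bytesPerRow
    destinationBytesPerImage:bytesPerImage
                     options:MTLBlitOptionDepthAttachmentOnly];
[blitEncoder endEncoding];
[commandBuffer commit];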

Cube Array Textures

The OSX_GPUFamily1_v1 feature set adds support for cube array textures with the MTLTextureTypeCubeArray texture type value. Similarly, support for this texture type has been added to the Metal shading language with the texturecube_array type. The maximum array length of a cube array texture is 341 (2048 total slices divided by 6 cube faces).
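
A brief sketch of creating a cube array texture (the pixel format, face size, and array length are illustrative):

MTLTextureDescriptor *descriptor = [[MTLTextureDescriptor alloc] init];
descriptor.textureType = MTLTextureTypeCubeArray;
descriptor.pixelFormat = MTLPixelFormatRGBA8Unorm;
descriptor.width       = 512;        // cube faces are square, so width equals height
descriptor.height      = 512;
descriptor.arrayLength = 8;          // number of cubes, well under the 341 limit

id <MTLTexture> cubeArray = [device newTextureWithDescriptor:descriptor];
// In the shading language, bind this texture to a texturecube_array<float> argument.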

Texture Usage

The MTLTextureDescriptor class contains the new usage property that allows you to declare how a texture will be used in your app. Multiple MTLTextureUsage values may be combined with a bitwise OR (|) if the texture will serve multiple uses over its lifetime. A MTLTexture object can be used only in the ways specified by its usage value(s); an error occurs otherwise. The MTLTextureUsage options and their intended uses are described as follows:

  • MTLTextureUsageShaderRead. The texture is read or sampled from in a shader function.

  • MTLTextureUsageShaderWrite. The texture is written to from a shader function.

  • MTLTextureUsageRenderTarget. The texture is used as a color, depth, or stencil render target in a render pass.

  • MTLTextureUsagePixelFormatView. The texture is used to create a texture view with a different pixel format.

  • MTLTextureUsageUnknown. The texture's usage is not known in advance.

Specifying and adhering to an appropriate texture usage allows Metal to optimize GPU operations for a given texture. For example, set the descriptor's usage value to MTLTextureUsageRenderTarget if you intend to use the resulting texture as a render target. This may significantly improve your app's performance on certain hardware.

If you don’t know what a texture will be used for, set the descriptor’s usage value to MTLTextureUsageUnknown. This value allows your newly created texture to be used everywhere, but Metal will not be able to optimize its use in your app.

Listing 10-7 shows you how to create a texture with multiple uses.

Listing 10-7  Specifying a Texture’s Usage

MTLTextureDescriptor* textureDescriptor = [[MTLTextureDescriptor alloc] init];
textureDescriptor.usage = MTLTextureUsageRenderTarget | MTLTextureUsageShaderRead;
// set additional properties
 
id <MTLTexture> texture = [self.device newTextureWithDescriptor:textureDescriptor];
// use the texture in a color render target
// sample from the texture in a fragment shader

Detailed Texture Views

The MTLTexture protocol adds the extended newTextureViewWithPixelFormat:textureType:levels:slices: method that allows you to specify a new texture type, base level range, and base slice range for a new texture view (in addition to the pixel format parameter that was already supported by the newTextureViewWithPixelFormat: method). Textures created with these texture view methods can now query their parent texture’s attributes with the parentTexture, parentRelativeLevel, and parentRelativeSlice properties (in addition to querying other attributes already supported by the pixelFormat and textureType properties). For details on texture view creation restrictions, such as valid casting targets, see MTLTexture Protocol Reference.
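
For example, the following sketch creates a 2D view of one slice and one mip level of an existing 2D array texture; the chosen ranges are illustrative, and the texture type cast must be one of the valid targets described in the reference:

// View a single slice (3) and mip level (0) of a 2D texture array as a plain 2D texture.
id <MTLTexture> sliceView =
    [arrayTexture newTextureViewWithPixelFormat:arrayTexture.pixelFormat
                                    textureType:MTLTextureType2D
                                         levels:NSMakeRange(0, 1)
                                         slices:NSMakeRange(3, 1)];

// The view can report where it came from.
id <MTLTexture> parent = sliceView.parentTexture;          // the original array texture
NSUInteger baseLevel   = sliceView.parentRelativeLevel;    // 0
NSUInteger baseSlice   = sliceView.parentRelativeSlice;    // 3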

IOSurface Support

The OSX_GPUFamily1_v1 feature set adds support for IOSurfaces. Use the newTextureWithDescriptor:iosurface:plane: method to create a new texture from an existing IOSurface.
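
A rough sketch of wrapping an IOSurface in a Metal texture follows; the surface properties are illustrative, and the texture descriptor must match the surface's dimensions and pixel format:

#import <IOSurface/IOSurface.h>

// Create (or receive from another process) a BGRA IOSurface.
NSDictionary *surfaceProperties = @{
    (id)kIOSurfaceWidth           : @1024,
    (id)kIOSurfaceHeight          : @1024,
    (id)kIOSurfaceBytesPerElement : @4,
    (id)kIOSurfacePixelFormat     : @(0x42475241 /* 'BGRA' */),
};
IOSurfaceRef surface = IOSurfaceCreate((__bridge CFDictionaryRef)surfaceProperties);

// Describe a texture that matches the surface, then wrap plane 0.
MTLTextureDescriptor *descriptor =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                        width:1024
                                                       height:1024
                                                    mipmapped:NO];
id <MTLTexture> surfaceTexture = [device newTextureWithDescriptor:descriptor
                                                         iosurface:surface
                                                             plane:0];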

Render Additions

Render Command Encoder

There are several graphics API additions to the MTLRenderCommandEncoder protocol. The main new features are summarized below:

  • The method setStencilFrontReferenceValue:backReferenceValue: allows front-facing and back-facing primitives to use different stencil test reference values.

  • Depth clipping is supported with the MTLDepthClipMode enum and setDepthClipMode: method.

  • Counting occlusion queries are supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the new MTLVisibilityResultModeCounting value, which can be passed into the setVisibilityResultMode:offset: method.

  • Texture barriers are supported in the OSX_GPUFamily1_v1 feature set by calling the textureBarrier method between same-texture write and read operations.

  • Base vertex and base instance values are supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the drawPrimitives:vertexStart:vertexCount:instanceCount:baseInstance: and drawIndexedPrimitives:indexCount:indexType:indexBuffer:indexBufferOffset:instanceCount:baseVertex:baseInstance: methods. Similarly, support for these drawing inputs has been added to the Metal shading language by providing the new [[ base_vertex ]] and [[ base_instance ]] vertex shader inputs.

  • Indirect drawing is supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the argument structures listed in Listing 10-8. Use these structs alongside the drawPrimitives:indirectBuffer:indirectBufferOffset: and drawIndexedPrimitives:indexType:indexBuffer:indexBufferOffset:indirectBuffer:indirectBufferOffset: methods, respectively. A usage sketch appears after this list.



    Listing 10-8  Indirect Drawing Argument Structures

    // Without an index list
    typedef struct {
        uint32_t vertexCount;
        uint32_t instanceCount;
        uint32_t vertexStart;
        uint32_t baseInstance;
    } MTLDrawPrimitivesIndirectArguments;
     
    // With an index list
    typedef struct {
        uint32_t indexCount;
        uint32_t instanceCount;
        uint32_t indexStart;
        int32_t  baseVertex;
        uint32_t baseInstance;
    } MTLDrawIndexedPrimitivesIndirectArguments;
  • Shader constant updates can now be performed more efficiently by setting a vertex buffer once and then simply updating its offset inside your draw loop, as shown in Listing 10-9. If your app has a very small amount of constant data (tens of bytes), you can instead use the setVertexBytes:length:atIndex: method so that the Metal framework can manage the constant buffer for you.



    Listing 10-9  Shader Constant Updates

    id <MTLBuffer> constant_buffer = nil;    // initialize with newBufferWithLength:options:
    MyConstants *constant_ptr = (MyConstants *)constant_buffer.contents;
    [renderpass setVertexBuffer:constant_buffer offset:0 atIndex:0];

    for (NSUInteger i = 0; i < draw_count; i++)
    {
        // write this draw call's constants directly into constant_ptr[i]
        [renderpass setVertexBufferOffset:i * sizeof(MyConstants) atIndex:0];
        // issue the draw call
    }
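
As referenced in the indirect drawing item above, the following sketch fills a MTLDrawPrimitivesIndirectArguments structure on the CPU and issues an indirect draw; in practice the arguments are often written by a compute pass instead, and the counts shown here are illustrative:

// Fill the indirect arguments on the CPU (a compute shader could also write them on the GPU).
MTLDrawPrimitivesIndirectArguments args = {
    .vertexCount   = 3,
    .instanceCount = 1,
    .vertexStart   = 0,
    .baseInstance  = 0,
};
id <MTLBuffer> indirectBuffer = [device newBufferWithBytes:&args
                                                    length:sizeof(args)
                                                   options:MTLResourceStorageModeShared];

// Issue the draw without specifying the counts directly.
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangle
               indirectBuffer:indirectBuffer
         indirectBufferOffset:0];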

Layered Rendering

The OSX_GPUFamily1_v1 feature set adds support for layered rendering with new APIs in the MTLRenderPassDescriptor and MTLRenderPipelineDescriptor classes. Layered rendering enables a vertex shader to direct each primitive to a specific layer of a texture array, cube texture, or 3D texture render target; the destination layer is taken from the render target array index output by the vertex shader for the primitive's first vertex. For a 2D texture array or a cube texture, each slice is a layer; for a 3D texture, each depth plane of pixels is a layer. Load and store actions apply to every layer of the render target.

To enable layered rendering, you must configure your render pass and render pipeline descriptors appropriately:

  1. Set the value of the renderTargetArrayLength property to specify the minimum number of layers available across all render targets. For example, set this value to 6 if you have a 2D texture array and a cube texture, each with a minimum of 6 layers.

  2. Set the value of the inputPrimitiveTopology property to specify the primitive type being rendered. For example, set this value to MTLPrimitiveTopologyClassTriangle for cube-based shadow mapping. The full enum declaration containing all primitive type values is listed in Listing 10-10.

  3. Additionally, the value of sampleCount must be 1 (this is the default value).

Listing 10-10  Primitive Topology Values

typedef NS_ENUM(NSUInteger, MTLPrimitiveTopologyClass)
{
    MTLPrimitiveTopologyClassUnspecified = 0,
    MTLPrimitiveTopologyClassPoint = 1,
    MTLPrimitiveTopologyClassLine = 2,
    MTLPrimitiveTopologyClassTriangle = 3
};
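
A configuration sketch for rendering into the six faces of a cube texture follows; the texture and shader functions are assumed to exist, and the vertex function is assumed to write a [[render_target_array_index]] output:

// Render pass: attach the cube texture and declare six render target layers.
MTLRenderPassDescriptor *passDescriptor = [MTLRenderPassDescriptor renderPassDescriptor];
passDescriptor.colorAttachments[0].texture     = cubeTexture;   // a cube texture provides 6 layers
passDescriptor.colorAttachments[0].loadAction  = MTLLoadActionClear;
passDescriptor.colorAttachments[0].storeAction = MTLStoreActionStore;
passDescriptor.renderTargetArrayLength = 6;

// Render pipeline: declare the primitive topology and keep sampleCount at 1.
MTLRenderPipelineDescriptor *pipelineDescriptor = [[MTLRenderPipelineDescriptor alloc] init];
pipelineDescriptor.vertexFunction   = layeredVertexFunction;    // writes [[render_target_array_index]]
pipelineDescriptor.fragmentFunction = fragmentFunction;
pipelineDescriptor.colorAttachments[0].pixelFormat = cubeTexture.pixelFormat;
pipelineDescriptor.inputPrimitiveTopology = MTLPrimitiveTopologyClassTriangle;
pipelineDescriptor.sampleCount = 1;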

Compute Additions

There are also a few compute API additions to both existing and new classes.

Supporting Frameworks

iOS 9 and OS X 10.11 introduce two new frameworks, MetalKit and Metal Performance Shaders, that make building Metal apps easier and more powerful.

MetalKit

The MetalKit framework provides a set of utility functions and classes that reduce the effort required to create a Metal app. MetalKit provides development support for three key areas:

  • Texture loading helps your app easily and asynchronously load textures from a variety of sources. Common file formats such as PNG and JPEG are supported, as well as texture-specific formats such as KTX and PVR.

  • Model handling provides Metal-specific functionality that makes it easy to interface with Model I/O assets. Use these highly-optimized functions and objects to transfer data efficiently between Model I/O meshes and Metal buffers.

  • View management provides a standard implementation of a Metal view that drastically reduces the amount of code needed to create a graphics-rendering app.

The MetalKit framework is available in all Metal feature sets. To learn more about the MetalKit APIs, see MetalKit Framework Reference.
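
As a brief sketch of the texture-loading support, the following loads a texture synchronously from a bundled file with MTKTextureLoader (the file name is illustrative):

#import <MetalKit/MetalKit.h>

MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:device];
NSURL *url = [[NSBundle mainBundle] URLForResource:@"Checkerboard" withExtension:@"png"];

NSError *error = nil;
id <MTLTexture> texture = [loader newTextureWithContentsOfURL:url
                                                       options:nil
                                                         error:&error];
if (!texture) {
    NSLog(@"Texture loading failed: %@", error);
}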

Metal Performance Shaders

The Metal Performance Shaders framework provides highly-optimized compute and graphics shaders that are designed to integrate easily and efficiently into your Metal app.

Use the Metal Performance Shaders classes to achieve optimal performance on all supported devices, without having to target or update your shader code for specific GPU families. Metal Performance Shaders objects fit seamlessly into your Metal app and can be used with resource objects such as buffers and textures.

Common shaders provided by the Metal Performance Shaders framework include:

  • Gaussian blur.

  • Image histogram.

  • Sobel edge detection.

The Metal Performance Shaders framework is available in the iOS_GPUFamily2_v1, iOS_GPUFamily2_v2, and iOS_GPUFamily3_v1 feature sets. To learn more about the Metal Performance Shaders APIs, see Metal Performance Shaders Framework Reference.
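
For example, a Gaussian blur can be encoded in a few lines. The following sketch assumes a device, a command queue, and source and destination textures of matching size already exist:

#import <MetalPerformanceShaders/MetalPerformanceShaders.h>

// Create the blur kernel once and reuse it across frames.
MPSImageGaussianBlur *blur = [[MPSImageGaussianBlur alloc] initWithDevice:device sigma:4.0f];

// Encode the blur into a command buffer alongside the rest of your Metal work.
id <MTLCommandBuffer> commandBuffer = [commandQueue commandBuffer];
[blur encodeToCommandBuffer:commandBuffer
              sourceTexture:sourceTexture
         destinationTexture:destinationTexture];
[commandBuffer commit];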