Metal provides extremely efficient access to the GPU through its streamlined API and high-performance architecture. Check out what's new in Metal and dive into support for Metal on OS X. Understand enhancements to the Metal memory model and learn how to prepare assets for delivery in Metal-based apps and games.
RAV DHIRAJ: And the first
of two What's New in Metal sessions.
So we have a lot of Metal content to cover this week
across three sessions.
In fact, we've added so much new stuff to the API,
that we've decided to break the What's New
in Metal session into two parts.
So today, I'll be focusing on providing a high-level review
of the Metal ecosystem over the last 12 months.
Developers like you have already created some tremendous
applications using Metal.
I'll then talk about some of the new features
that we're introducing this year, and we'll end
with a specific example of how Metal integrates well
with the rest of the system
by describing a technology called app thinning.
In the second what's new in Metal session, Dan Omachi
and Anna Tikhonova will provide details
on a great new support library that we're introducing
in Metal this year or two great new support libraries rather,
MetalKit which provides convenience APIs that allow you
to create a great Metal application,
and Metal Performance Shaders our highly optimized library
of shaders that you can call directly from your application.
And finally in the last session, Phil Bennett will dive
into great techniques for taking advantage of --
for extracting the best possible performance
out of your Metal applications.
We'll be introducing our new GPU System Trace tool
in this session, so be sure to check it out.
We introduced Metal at WWDC last year for iOS 8.
Our goal was a ground-up reimplementation of our graphics
and compute APIs to give you the best possible performance
on the GPUs in our platform.
So we achieved this by getting much of the software between you
and the GPU out of your way.
To better illustrate this let's look back at an example
that we showed last year at WWDC that describes the work done
by the GPU and The CPU per frame.
In this example the top bar represents the time spent
by the CPU and the bottom bar represents the GPU time.
So as you can see, we're currently CPU bound
and the GPU is idle for part of the frame.
So with Metal we're able
to dramatically reduce the GPU API overhead
and effectively make the GPU the bottleneck in the great frame.
So the great thing is that this allows you to take advantage
of this additional CPU idle time to make your game better.
You can add more physics or AI for example,
or you can issue more draw calls
to increase the complexity of your scene.
But we didn't just stop there.
Metal also allows you to move expensive operations
like shader compilation and state validation from draw time
which happens many thousands of times per frame, to load time
which happens very infrequently, and even better in some cases
to build time, when your users don't see any impact at all.
Additionally, with iOS 8 we not only introduced compute
or exposed compute for the first time on our iOS devices,
but we also provided you with a cohesive interoperability story
between the graphics and compute APIs allowing
to you efficiently interleave render and compute operations
on Metal capable devices.
And finally, with Metal, your application is able
to make efficient use of multithreading
without the API getting in your way, allowing you
to encode for multiple threads.
And the results have been stunning.
So last year we showed you Epic's Zen Garden demo
where they used Metal to achieve ten times the number
of draw calls in the scene.
We also showed you EA's Plants Versus Zombies technology demo
where they used Metal to bring their console rendering engine
to the iOS platform.
Now this set a high bar for the development community.
And over the last year we've witnessed the release
of some truly astounding titles
that have taken great advantage of the Metal API.
Titles like the MOBA Vainglory by Super Evil Mega Corp
that used Metal to achieve 60 frames per second in their game.
Disney's Infinity: Toy Box 2, Metal enabled them
to bring their console, graphics,
and game play experience to iOS.
Gameloft Asphalt 8, they were able to improve their gameplay
by using Metal to render three times the number
of opponents in the game.
But it's not just about games.
With the new version of Pixelmator
for the iPhone they're using Metal
to accelerate image processing
in their powerful new distort tools.
In fact, the response has been overwhelming, with a number
of key content and game developers now adopting Metal
on OS X.
And much of this content has been enabled by our commitment
to bring the leading game console engines
to the iOS platform.
This includes Unity, Epic's Unreal Engine 4,
and EA's Frostbite mobile engine.
Last year we also showed you how Metal fits in the big picture
of how your application accesses the GPU.
On the one side we have our high-level 2D
and 3D scene graphs APIs
that give you incredible functionality and convenience.
And on the other side with Metal,
we provided a direct access path to the GPU.
So this gives you an amazing range to do what's right
for your application, and if you choose to use one
of the higher level APIs, the great thing is
that we can make improvements under the covers
and you can benefit from them
without us changing a single line of code.
Well this year, we're happy to announce
that we've done just that, and we're bringing the power
and efficiency of Metal to the system-wide technologies.
We really believe that this is going
to improve the user experience on our platforms.
This is also a great year for Metal capable devices.
The iPhone 5s and the iPad Air were the headliners
at WWDC last year, and with the introduction of the iPhone 6,
the 6+, and the iPad Air 2,
we now have an incredible install base
of Metal capable devices.
But of course we didn't just stop there.
We're happy to announce
that we're bringing Metal to the OS X platform.
We have broad support for Metal
across all our shipping configurations.
In fact, Metal is supported on all Macs introduced since 2012.
This of course means that we have support
for all three GPU venders: Intel, AMD, and Nvidia.
And the other big news is that we're bringing all the tools
that you're familiar with using on iOS
to the Mac platform as well.
This includes the Frame Debugger, the Shader Profiler,
and all our API analysis tools.
This is huge.
We understand the challenges of debugging complex graphics
and compute applications,
and think that these will be invaluable
in your development efforts on OS X.
And of course, all of this is available in the seed build
of OS X El Capitan that you can download today.
So Metal on OS X is the same API you're familiar with using
on iOS with a few key additions.
With new APIs to support device selection, discrete memory,
and new texture formats, Metal makes it incredibly easy for you
to bring your iOS applications to OS X.
And here are a few examples
of developers who've done exactly that.
So you heard in the keynote that we've been working with Epic
to bring their iOS Metal development code
to a Unreal Engine on the Mac.
Well, Epic used Metal and their deferred renderer
to create this amazing stylized look in Fortnite.
Additionally folks at Unity brought up their engine
and demonstrated their Viking Village demo
in only a few weeks.
It's really great to see this content on OS X on Metal.
And we've been working with a number
of additional Mac developers to enable them to access the power
of the GPU through Metal.
So you also heard in the keynote
about digital content creation applications.
Developers like Adobe, they're using Metal to access the GPU
to accelerate image processing.
The guys at The Foundry have also been using Metal
to accelerate their 3D modeling application MODO.
And here today to talk about their experience adopting Metal
in OS X is Jack Greasley from the The Foundry.
JACK GREASLEY: Thank you Rav.
Hi. I'm Jack Greasley.
I'm head of new technology at The Foundry.
And at The Foundry we create tools for digital artists.
Our software is used around the world in games, movies, TV,
and film, including some photo real Peruvian bears,
mutant monster hunters.
But it's not just about the virtual.
Some of our design customers like Adidas actually make things
and if you ask a designer they'll tell you
that any product is a result of thousands of little experiments.
We understand this process
and we create our tools specifically to support it.
MODO is our premier 3D modeling animation and rendering system.
It is used to make games, films, product design,
lots of different things.
Our users create stunning imagery and animations
for things both real and imaginary.
In our latest version of MODO 9.01,
we revamped out GPU renderer,
the aim was to provide a fluid interactive experience
with a highest possible quality to designers.
The benefit of this is that if your viewport is realtime you
can make tens of decisions in the time it would take
to do a single software render.
We had already done some early work with Metal in iOS,
but a couple of months ago we got a great opportunity
to start working with Metal on OS X.
So we put together a small team and set them a challenge,
we gave them four weeks to see how much
of the new MODO viewport they could bring over
and get running on Metal.
And we almost immediately got some stunning results.
Although it's only a small triangle,
it actually represents a huge milestone for us.
Once we did that, we were able to very,
very quickly make progress.
And our plan of attack was to really work from the bottom up,
and to start bringing the functionality
from our new viewport over onto Metal.
So here on day one we started with the environment.
We added a few more triangles.
A little bit of shading started
to make this look a little bit more like a real car.
Putting in the soft shadows
and specular highlights really added a little bit of bling,
and everybody loves shiny.
And so here we are, four weeks later,
and you remember that single triangle?
We got some incredible results.
Putting this all back into Metal gave us a fully functional view
port running inside of MODO on Metal in just four weeks.
One of the great things for us,
is this gives us a standardized renderer across iOS and OS X,
we created a WYSIWYG workflow between the two platforms.
So, what did we learn?
The first thing we learned is that working with Metal is fun.
I've spent 20 years working with OpenGL,
and I can tell you having a nice lean easy to use API is
like a breath of fresh air.
Secondly, the debugging and optimization tools
in Metal are absolutely fantastic.
As I have said, if you have done debugging on GPUs,
you know why this is important.
Metal can also be really fast.
In some of our tests, we got three times speed-up,
and that's using exactly the same data on the same GPU.
Going forward we have some big plans from our new viewport
and we're actually looking to integrate it into all
of our tools across The Foundry,
so hopefully we'll be seeing the Metal cropping
up in interesting places very, very soon.
So I'm going to hand you back to Rav, and thank you very much.
RAV DHIRAJ: Thank you Jack.
That was fantastic.
Okay, so I'd like to now talk about the new features
that we're introducing in iOS 9 and OS X El Capitan.
And there is a lot of them.
This is a just a selection
of the features we've added this year.
Now I don't have time to talk about every single one of these,
so I'm going to focus on a subset,
including GPU family sets, our new memory model,
texture barriers, our expanded texturing support.
Of course, as I mentioned before, you can learn more
about MetalKit, Metal Performance Shaders,
and our new Metal System Trace tool
in the sessions later this week.
So let's dive right into them.
I would like to start with the GPU, our Metal feature sets.
Metal defines collections of features that are specific
to generations of GPU hardware.
Metal calls these GPU families.
So a GPU feature set is defined by the platform, iOS, or OS X,
the Family Name, which is specific
to a hardware generation, and a version that allows us
to augment the feature set over time.
It's really trivial to query the feature set,
simply call supportFeatureSet on your Metal device to determine
if that GPU family is supported.
So here is our iOS feature set matrix.
Now you'll notice that we have support
for two major GPU families and versioning to differentiate
between our iOS 8 and our iOS 9 features.
On OS X the GPUFamily1 v1 feature set represents the
features that we're going to be shipping in OS X El Capitan.
This defines the base for a Metal capable device
on the desktop platform.
Now I would like to talk
about two new shader constant updates APIs that we're adding.
First a little bit of background.
So for every draw that you encode
into command buffer there's some constant data you need
to send to the shader.
Now it will be incredibly inefficient for you
to have a separate constant buffer per draw
so generally most Metal applications allocate a single
constant buffer that they have per frame.
They then append the constant data into the buffer
as they encode their draws.
So what does the code look like?
Well, first we have some setup
for the constant buffer and the data.
Then just like in that diagram, within your draw loop,
you send in the new constant data
or you pen the new constant data into your constant buffer.
Now it's worth noting that the setVertexBuffer call here is
actually doing two things.
It's setting the constant buffer,
and it is updating the offset in it.
Now, of these two operations, it's that call
to set the constant buffer that's the most expensive.
So Metal now has APIs that allows you
to separate these two operations and move that expensive call
to set the constant buffer, or the vertex buffer,
outside of your draw loop.
If you have thousands of draw calls per frame,
this can be a significant savings.
But if you only have a small amount of constant data,
it might be more efficient for Metal
to manage the constant buffer for you.
So Metal now has the setVertexBytes API,
you can use this to append new constants at every draw call.
Actually there is one more thing I want to say about that.
So that API is great if you only have a small number of constants
as I said, and that's tens of bytes of constants.
If you have larger constant sets you really want
to use on of the other APIs.
There's a good chance that it'll be way more performant.
All right let me talk about the new memory model now.
So, our goal with the new memory model was
to support both unified and discrete memory systems
without you having to make much change.
Now Metal supports discrete memory now,
and that's high speed memory that the GPU
on some desktops have access to.
So the way we've achieved this is
by introducing new storage modes that allow you to specify
where the resource will reside in memory.
So the modes are shared, private, and managed.
So I'll talk about each of these in turn
over the next few slides.
Let's start by talking about the shared storage mode.
So this is the mode that's
in the existing implementation of iOS 8.
So in a unified memory system, the memory that you use
to store a buffer or a texture,
is shared between the CPU and the GPU.
There's only one copy of the memory
and the memory is coherent at command buffer boundaries.
So this means that you have to just be done
with the GPU before accessing it with the CPU.
This makes it very easy to use.
But now new in iOS 9 and in OS X El Capitan we're introducing the
private storage mode.
So private memory can only be accessed by the GPU
through render, compute, or blit operations.
The advantage of private memory is performance.
Metal can store the data in a way that's more optimal
for the GPU to access, by using frame buffer compression
Private storage mode also works really well
with discrete memory systems, and you can put your resources
into the memory that the GPU has the fastest access to.
And now new in OS X, only,
we're introducing the managed storage mode.
With managed memory the resource has storage
in both the discrete memory and the system memory,
and Metal manages the coherency between those two copies.
Now this gives you the convenience and flexibility
of the shared storage mode, and in most cases, the performance
of the private storage mode.
And if you have a desktop system with a unified memory system,
you don't have to worry
about managed having any extra overhead.
There's only one copy of the resource that Metal maintains.
So there are a couple other considerations
with managed resources if you're going to modify the data
with a CPU or the GPU.
So first, if you modifying the data with a CPU you have
to let Metal know by calling the buffer didModifyRange
or the texture replaceRegion APIs.
Likewise, if you want to read the data back,
you'll need to call the synchronizeResource API.
It's also worth noting that you need to wait for the operation
to be complete before you actually read the data
with the CPU.
So let's look back at that shader constant update example I
So this example is currently using shared memory.
So in a discrete memory system, you ideally want the constants
to be in discrete memory.
Now you could possibly do this with a private buffer,
but you'd have to manage the transfer to that buffer.
It's actually a lot simpler to use managed buffers,
it makes it really easy.
There is only two things you need to do.
First, you have to specify the managed storage mode
when you create the constant buffer, and then,
you have to call didModifyRange to tell Metal
that you've now updated the constants with the CPU.
And that's it.
The rest of the code remains exactly the same.
It's worth noting that buffers are shared
by default on all platforms.
On iOS, textures are shared by default as well,
but on OS X we chose to make the default mode
for textures managed, because it allows you
to write portable code without sacrificing performance.
But there are some cases where you don't want
to use a managed texture.
This is one of them.
When you have a frame buffer or a renderable texture,
then you want to use the private storage mode
to get the best performance.
This is particularly important
if the GPU is the only one accessing the data.
And that's our new memory model in Metal.
I'd like to talk about two new features in Metal
that are specific to OS X that I think you'll really like.
The first is layered rendering.
So the intent of this API is for you to be able to render
to a specific layer of a texture
for every triangle that you draw.
So this could be the slice of an array texture,
the plane of a 3D texture, or the face of a cube texture.
So on a per triangle basis, you can specify which layer
to render into by simply specifying the array index
in your vertex shader.
The game Fortnite used this very technique to render
into the faces of a cube map for some
of their environmental lighting.
So we think you'll find this equally useful.
The second feature that's specific
to OS X is texture barriers.
So by default GPUs tend to overlap the execution
of their draw calls, and you can't reliably use the output
of one draw call in a subsequent one, without some form
of explicit synchronization.
Metal now has an API that allows you to insert a barrier
between these draw calls.
So this is critical
for implementing efficient programmable blending on OS X.
The API is really easy to use.
Simply insert the barrier between the draw operations
that you want to synchronize.
And last, but certainly not least, I want to talk
about our expanded texturing support in Metal this year.
By default the max limits for all textures
in iOS have been increased to 8k.
We've also added cube array support on OS X,
and bumped up the quality of anti-aliasing
across the board in all our platforms.
We've also significantly flushed out the pixel formats
that you can write to, or read from, a compute shader.
Also new is a texture usage property.
So this allows you to tag textures
to tell Metal how you plan on using them, and Metal will try
to optimize for that usage.
So for example, if you have a renderable texture,
you want to set the renderTarget and shaderRead flags.
And this will tell Metal that you plan on both rendering
to that texture, and then sampling from it.
By default, the usage is unknown
and Metal won't make any assumptions about how
that texture is used, allowing Metal
to use it anywhere in the system.
Unlike iOS, the desktop GPUs prefer
to have a single depth stencil texture,
so we've added two new combined depth stencil formats.
The 32-8 format is supported
on all our hardware, both iOS and OS X.
The 24-8 format, however, is only supported on some.
So if it means your precision requirements,
you'll have to check if it is available.
So let's talk about texture compression.
So the type of compression format you use depends
on the device you're targeting and the type
of data you're encoding.
On iOS we support a number of these formats including PVRTC,
ETC2, and EAC, and new for GPUFamily2,
we're also supporting ASTC.
So ASTC has a very high quality compression, better than PVRTC
and ETC at the same equivalent size.
It also allows you to encode a number of different formats
from photographic content, height maps,
normal maps, and many more.
It also provides a very fine-grinned control
between size and quality,
offering between 1 and 8 bits per pixel.
And at the low end, this is half the storage required for PVRTC.
And finally, as I previously noted, this is only available
on GPUFamily2 capable devices,
so you'll have to look out for that.
Finally, in OS X we're introducing all the native
texture compression formats that the desktop GPUs support.
Now, these BCn formats should all be familiar to you.
If you've worked on a desktop platform or game console before,
you likely have assets that are already in this format.
And that's our expanded texturing support
and the features that I'm going to cover today.
So I'd like to change topics and talk
about a new technology called app thinning.
You might have heard about it in the last talk.
So this is not specifically a Metal feature, but it does rely
on the GPU families I discussed earlier in the session.
First, to set context, the typical game development
and deployment flow for a developer
on our platform looks something like this.
You generally have an art pipeline
that generates some assets.
The assets are built via Xcode or your custom tools pipeline
into a binary and then that binary is sent up somewhere
to the App Store, and that specific
or that very same binary is deployed
to the devices of all your users.
So this works great.
But as soon as you start having assets that are specific
to the capability of a device,
you start running into some issues.
For example, if you have assets that are specific
to Metal devices, and assets specific to legacy devices,
you currently have to download both versions
to all your users' devices.
So obviously this is not ideal.
App thinning let's you solve this problem by allowing you
to tag assets by capability
and then only the asset that's needed
for the device is actually downloaded to the device.
So how do we do this?
Well, app thinning allows you to define capability
across two axis, GPUFamily version and device memory size.
Now this creates a matrix that you can then use
to target specific devices.
So let's look at a typical normal map example.
Ideally you want to store the normal maps compressed,
and EAC is a great format for that.
But since some legacy devices don't support compressed
textures or EAC in particular, you probably want
to have an uncompressed version of the asset as well.
So app thinning allows you to tag these assets
and only download the compressed one to the Metal capable device,
and the uncompressed version to the legacy device.
But app thinning is actually a lot more capable than this.
So let's extend our example.
And that's support for more devices.
So in this particular case we're going to create five assets.
We'll start with a high resolution ASTC version
for our most capable 2GB device.
And then we'll include a slightly lower resolution
for the 1GB version of that device.
And since some Metal capable devices don't support ASTC,
we'll include the EAC version as well.
And then, for your legacy devices,
we have the uncompressed version of the asset.
And we can extend this example even further
by including a lower resolution version
of that uncompressed asset for our least capable device,
the 512MB configuration.
So you probably don't want to create five assets
of everything, but the point I'm trying to make here is
that you have tremendous amount of flexibility
to target specific devices
and create the best experience possible for your users.
So Xcode integrates a great new UI that allows you
to tag assets in this way.
The first thing you need to do, is define the capabilities
of the devices that you'd like to target.
That creates that little matrix, and then all you have
to do is drop in the assets that match the relevant intersection
of GPUFamily and device memory size
that you're trying to target.
It's that simple.
But of course, we realize
that not all developers have tools pipelines that exist
in Xcode, so we have you covered there as well.
We're supporting with app thinning the JSON file format
that lets you specify your asset catalogues.
So just like in Xcode, you have to specify the GPUFamily version
and the device size for each asset you want
to include in your catalog.
So once you have your asset catalogs defined,
how do you retrieve the data at runtime?
The answer is the NSDataAsset class,
it provides the resource matched to the capabilities
of the device that you're running on.
Using the NSDataAsset is really easy.
Simply allocate an NSDataAsset object using the name
that you assigned in your asset catalog,
and then use it in your data.
So let's tie this altogether using the diagram
that I originally showed and the normal map example.
So in this case, your artist will create a bunch
of normal maps, some compressed, some uncompressed
to target the devices you'd like.
This gets built via Xcode or your custom tools pipeline
into your binary, big massive binary with lots of assets
in it, gets uploaded to the App Store,
and then the great thing is only the normal map required
by your user, is downloaded to their device.
And that's app thinning.
We think that this is going to change the way you create
and deploy content on Metal capable devices.
So that was a whirlwind tour of the Metal ecosystem
over the last 12 months.
We've seen developers like you create amazing content
We have brought Metal to OS X.
We've also brought all our great Metal GPU tools to OS X.
We've introduced some powerful new APIs
that we think you're going to love.
Finally, we also talked about how Metal integrates well
with a system, by talking about app thinning.
All in all, it's been a great year,
and we're really looking forward to seeing what you're going
to be able to build with Metal over the next one.
So please visit our online documentation, and you'd also
like to go to our support forums,
and if those don't answer your questions, you're of course free
to reach out to Allan Schaffer our Game
We have two more sessions this week, What's New in Metal,
Part 2 on Thursday morning,
and Metal Performance Optimization Techniques,
which is on Friday.
Please be sure to visit those.
Thank you very much.
Looking for something specific? Enter a topic above and jump straight to the good stuff.
An error occurred when submitting your query. Please check your Internet connection and try again.