iOS, macOS, watchOS and tvOS offer a rich set of tools and APIs for recording, processing, and playing back audio in your apps. Learn how to choose the right API for your app, and the details on implementing each of them in order to deliver an outstanding audio experience.
[ Music ]
Good afternoon, everyone.
So, how many of you want to build an app
with really cool audio effects but thought
that that might be hard?
Or how many of you want to focus more
on your application's overall user experience but ended
up spending a little more time on audio?
Well, we've been working hard to make it easy for you.
My name is Saleem.
I'm a Craftsman on the Core Audio Team.
I want to welcome you to today's session
on delivering an exceptional audio experience.
So let's look at an overview of what's in our stack today.
We'll start with our AVFoundation framework.
We have a wide variety of high-level APIs
that let you simply play and record audio.
For more advanced use cases, we have our AudioToolbox framework.
And you may have heard of AudioUnits.
These are a fundamental building block.
If you have to work with MIDI devices or MIDI data,
we have our CoreMIDI framework.
For game development, there's OpenAL.
And over the last two years,
we've been adding many new APIs and features as well.
So you can see there are many ways
that you can use audio in your application.
So, our goal today is to help guide you
to choosing the right API for your application's needs.
But don't worry, we also have a few new things
to share with you as well.
So, on the agenda today, we'll first look
at some essential setup steps for a few of our platforms.
Then, we'll dive straight into simple and advanced playback
and recording scenarios.
We'll talk a bit about multichannel audio.
And then later in the presentation,
we'll look at real-time audio --
how you can build your own effects, instruments,
and generators -- and then we'll wrap up with MIDI.
So, let's get started.
iOS, watchOS, and tvOS all have really rich audio features
and numerous routing capabilities.
So users can make calls, play music, play games,
work with various productivity apps.
And they can do all of this mixed in or independently.
So the operating system manages a lot of default audio behaviors
in order to provide a consistent user experience.
So let's look at a diagram showing how audio is managed on the system.
So you have your device, and it has a couple
of inputs and outputs.
And then there's the operating system.
It may be hosting many apps, some of which are using audio.
And lastly, there's your application.
So AVAudioSession is your interface, as a developer,
for expressing your application needs to the system.
Let's go into a bit more detail about that.
Categories express the application's high-level audio needs.
We have modes and category options
which help you further customize and specialize your application.
If you're into some more advanced use cases,
such as input selection, you may want to be able
to choose the front microphone
on your iPhone instead of the bottom.
If you're working with multichannel audio
and multichannel content on tvOS, you may be interested
in things like channel count.
If you had a USB audio device connected to your iPhone,
you may be interested in things like sample rate.
So when your application is ready and configured
to use audio, it informs the system to apply the session.
So this will configure the device's hardware
for your application's needs and may actually result
in interrupting other audio applications on the system,
mixing with them, and/or ducking their volume level.
So let's look at some of the essential steps
when working with AVAudioSession.
The first step is to sign up for notifications.
And the three most important notifications are the
interruption, route change,
and mediaServicesWereReset notifications.
You can sign up for these notifications before you
activate your session.
And in a few slides, I'll show you how you can manage them.
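As a rough sketch (using the modern AVAudioSession notification names; the observer class and selector names here are hypothetical), signing up might look like this:

```swift
import AVFoundation

class AudioSessionObserver: NSObject {
    private let session = AVAudioSession.sharedInstance()

    func registerForNotifications() {
        let center = NotificationCenter.default
        // Sign up before activating the session.
        center.addObserver(self, selector: #selector(handleInterruption(_:)),
                           name: AVAudioSession.interruptionNotification, object: session)
        center.addObserver(self, selector: #selector(handleRouteChange(_:)),
                           name: AVAudioSession.routeChangeNotification, object: session)
        center.addObserver(self, selector: #selector(handleMediaServicesReset(_:)),
                           name: AVAudioSession.mediaServicesWereResetNotification, object: session)
    }

    @objc func handleInterruption(_ note: Notification) { /* update state */ }
    @objc func handleRouteChange(_ note: Notification) { /* update state */ }
    @objc func handleMediaServicesReset(_ note: Notification) { /* rebuild audio objects */ }
}
```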
Next, based on your application's high-level needs,
you'll want to set the appropriate category, mode, and options.
So, let's look at a few examples.
Let's just say I was building a productivity app.
And in that application, I want to play a simple sound
when the user saves their document.
Here, we can see that audio enhances the experience
but it's not necessarily required.
So, in this case, I'd want to use the AmbientCategory.
This category obeys the ringer switch.
It does not play audio in the background,
and it'll always mix in with others.
If I was building a podcast app,
I'd want to use the PlaybackCategory
with the SpokenAudio mode.
And here, we can see that this application will interrupt
other applications on the system.
Now if you want your audio to continue playing
in the background, you'll also have
to specify the background audio key in your info.plist.
And this is essentially a session property as well.
It's just expressed through a different means.
For your navigation app,
let's look at how you can configure the navigation prompt.
Here, you'd want to use the PlaybackCategory,
And there are a few options of interest here.
You'd want to use both the
InterruptSpokenAudioAndMixWithOthers option as well as the DuckOthers option.
So, if you're listening to a podcast while navigating
and that navigation prompt comes up saying, "Oh,
turn left in 500 feet,"
it'll actually interrupt the podcast app.
If you're listening to music,
it'll duck the music's volume level and mix in with it.
For this application, you'll also want
to use a background audio key as well.
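A minimal sketch of that navigation-prompt configuration, assuming iOS 10-era Swift (the function wrapper is just to give the `try` calls a throwing context):

```swift
import AVFoundation

// Hypothetical setup for a navigation app's prompt player.
func configureForNavigationPrompt() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback,
                            mode: .default,
                            options: [.interruptSpokenAudioAndMixWithOthers, .duckOthers])
    // Activate only when a prompt is about to play; deactivate afterward with
    // .notifyOthersOnDeactivation so interrupted or ducked apps can resume.
    try session.setActive(true)
}
```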
So, next, let's look at how we can manage activation
of our session.
So what does it mean to go active?
Activating your session informs the system
to configure the hardware for your application's needs.
So let's say, for example,
I had an application whose category was set to PlayAndRecord.
When I activate my session, it'll configure the hardware
to use input and output.
Now, what happens if I activate my session while listening
to music from the music app?
Here, we can see that the current state
of the system is set for playback only.
So, when I activate my session, I inform the system
to configure the hardware for both input and output.
And since I'm in a non-mixable application,
I've interrupted the music app.
So let's just say my application makes a quick recording.
Once I'm done, I deactivate my session.
And if I choose to notify others
that I've deactivated my session,
we'll see that the music app would resume playback.
Next, let's look at how we can handle the notifications we
signed up for.
We'll first look at the interruption notification,
and we'll examine a case
where your application does not have playback UI.
The first thing I do is I get the interruptionType.
And if it's the beginning of an interruption,
your session is already inactive.
So your players have been paused, and you'll use this time
to update any internal state that you have.
When you receive the end interruption, you go ahead
and activate your session, start your players,
and update your internal state.
Now, let's see how that differs
for an application that has playback UI.
So when you receive the begin interruption --
again, your session is inactive --
you update the internal state, as well as your UI this time.
So if you have a Play/Pause button, you'd want to go ahead
and set that to "play" at this time.
And now when you receive the end interruption, you should check
and see if the shouldResume option was passed in.
If that was passed in, then you can go ahead
and activate your session, start playback,
and update your internal state and UI.
If it wasn't passed in, you should wait
until the user explicitly resumes playback.
It's important to note that you can have unmatched interruptions.
So, not every begin interruption is followed by a matching end.
And an example of this are media-player applications
that interrupt each other.
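The handling just described might be sketched like this (as a method on a notification observer; `updateInternalState`, `setPlayPauseButton`, and `resumePlayback` are hypothetical helpers):

```swift
import AVFoundation

func handleInterruption(_ note: Notification) {
    guard let info = note.userInfo,
          let rawType = info[AVAudioSessionInterruptionTypeKey] as? UInt,
          let type = AVAudioSession.InterruptionType(rawValue: rawType) else { return }

    switch type {
    case .began:
        // Session is already inactive; players have been paused.
        updateInternalState(playing: false)   // hypothetical helper
        setPlayPauseButton(toPlay: true)      // hypothetical helper, for apps with playback UI
    case .ended:
        let rawOptions = info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
        let options = AVAudioSession.InterruptionOptions(rawValue: rawOptions)
        if options.contains(.shouldResume) {
            try? AVAudioSession.sharedInstance().setActive(true)
            resumePlayback()                  // hypothetical helper
        }
        // Otherwise, wait until the user explicitly resumes playback.
    @unknown default:
        break
    }
}
```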
Now, let's look at how we can handle route changes.
Route changes happen for a number of reasons --
the connected devices may have changed,
a category may have changed,
you may have selected a different data source or port.
So, the first thing you do is you get the routeChangeReason.
If you receive a reason that the old device is unavailable
in your media-playback app, you should go ahead
and stop playback at this time.
An example of this is if your user is streaming music
to the headsets and they unplug the headsets.
They don't expect that the music resumes playback
through the speakers right away.
For more advanced use cases,
if you receive the oldDeviceUnavailable
or newDeviceAvailable routeChangeReason, you may want
to re-evaluate certain session properties
as they apply to your application.
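A sketch of that route-change handling, under the same observer-method assumption (`stopPlayback` and `reevaluateSessionProperties` are hypothetical helpers):

```swift
import AVFoundation

func handleRouteChange(_ note: Notification) {
    guard let raw = note.userInfo?[AVAudioSessionRouteChangeReasonKey] as? UInt,
          let reason = AVAudioSession.RouteChangeReason(rawValue: raw) else { return }

    switch reason {
    case .oldDeviceUnavailable:
        // e.g. headphones were unplugged: stop playback rather than
        // continuing through the built-in speakers.
        stopPlayback()                         // hypothetical helper
    case .newDeviceAvailable:
        // A new device appeared; re-evaluate session properties as needed.
        reevaluateSessionProperties()          // hypothetical helper
    default:
        break
    }
}
```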
Lastly, let's look at how we can handle
the mediaServicesWereReset notification.
This notification is rare, but it does happen
because daemons aren't guaranteed to run forever.
The important thing to note here is
that your AVAudioSession sharedInstance is still valid.
You will need to reset your category mode and other options.
You'll also need to destroy and recreate your player objects,
such as your AVAudioEngine, remote I/Os,
and other player objects as well.
And we provide a means for testing this on devices by going
to Settings, Developer, Reset Media Services.
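A minimal sketch of recovering from that reset (the category shown and the `rebuildPlayers` helper are assumptions):

```swift
import AVFoundation

func handleMediaServicesReset(_ note: Notification) {
    // The sharedInstance is still valid, but configuration must be reapplied.
    let session = AVAudioSession.sharedInstance()
    try? session.setCategory(.playback, mode: .spokenAudio, options: [])
    // Player objects (AVAudioEngine, remote I/O units, etc.) must be
    // destroyed and recreated from scratch.
    rebuildPlayers()                           // hypothetical helper
    try? session.setActive(true)
}
```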
OK, so that just recaps the four steps for working
with AVAudioSession -- the essential steps.
You sign up for notifications.
You set the appropriate category mode and options.
You manage activation of your session.
And you handle the notifications.
So let's look at some new stuff this year.
New this year, we're adding two new category options --
allowAirPlay and allowBluetoothA2DP --
to the PlayAndRecord category.
So, that means that you can now use a microphone while playing
to a Bluetooth and AirPlay destination.
So if this is your application's use case, go ahead
and set the category and the options,
and then let the user pick the route
from either an MPVolumeView or Control Center.
We're also adding a new property for VoIP apps
on our AVAudioSessionPortDescription
that'll determine whether
or not the current route has hardware voice processing.
So if your user is connected to a CarPlay system
or a Bluetooth HFP headset that has hardware voice processing,
you can use this property
to disable your software voice processing
so you're not double-processing the audio.
If you're already using Apple's built-in voice processing IO
unit, you don't have to worry about this.
And new this year, we also introduced the CallKit framework.
So, to see how you can enhance your VoIP apps with CallKit,
we had a session earlier this week.
And if you missed that, you can go ahead and catch it online.
So that's just an overview of AVAudioSession.
We've covered a lot of this stuff in-depth
in previous sessions.
So we encourage you to check those out,
as well as a programming guide online.
So, moving on.
So you set up AVAudioSession
if it's applicable to your platform.
Now, let's look at how you can simply play
and record audio in your application.
We'll start with the AVFoundation framework.
There are a number of classes here that can handle the job.
We have our AVAudioPlayer,
AVAudioRecorder, and AVPlayer class.
AVAudioPlayer is the simplest way to play audio from a file.
We support a wide variety of formats.
We provide all the basic playback operations.
We also support some more advanced operations,
such as setting volume level.
You get metering on a per-channel basis.
You can loop your playback, adjust the playback rate,
work with stereo panning.
If you're on iOS or tvOS, you can work
with channel assignments.
If you had multiple files you wanted to play back,
you can use multiple AVAudioPlayer objects
and you can synchronize your playback as well.
And new this year, we're adding a method that lets you fade
to volume level over a specified duration.
So let's look at a code example
of how you can use AVAudioPlayer in your application.
Let's just say I was working
and building a simple productivity app again
where I want to play an acknowledgement sound
when the user saves their document.
In this case, I have an AVAudioPlayer and a URL
to my asset in my class.
Now in my setup function, I go ahead
and I create the AVAudioPlayer object with the contents
of my URL and I prepare the player for playback.
And then, in my saveDocument function, I may do some work
to see whether or not the document was saved successfully.
And if it was, then I simply play my file.
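A reconstruction of that slide code might look like this (the asset name and the `save` helper are hypothetical):

```swift
import AVFoundation

class DocumentController {
    var player: AVAudioPlayer?
    // Hypothetical sound asset bundled with the app.
    let soundURL = Bundle.main.url(forResource: "saved", withExtension: "caf")!

    func setup() throws {
        // Create the player with the contents of the URL and prepare it.
        player = try AVAudioPlayer(contentsOf: soundURL)
        player?.prepareToPlay()
    }

    func saveDocument() {
        let savedSuccessfully = save()   // hypothetical helper
        if savedSuccessfully {
            player?.play()
        }
    }
}
```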
Now, let's look at AVAudioRecorder.
This is the simplest way to record audio to a file.
You can record for a specified duration, or you can record
until the user explicitly stops.
You get metering on a per-channel basis,
and we support a wide variety of encoded formats.
So, to set up a format,
we use the Recorder Settings Dictionary.
And this is a dictionary of keys that let you set various format parameters,
such as sample rate and number of channels.
If you're working with Linear PCM data, you can adjust things
like the bit depth and endian-ness.
If you're working with encoded formats, you can adjust things
such as quality and bit rate.
So, let's look at a code example
of how you can use AVAudioRecorder.
So the first thing I do is I create my format settings.
Here, I'm creating an AAC file with a really high bit rate.
And then the next thing I do --
I go ahead and create my AVAudioRecorder object
with a URL to the file location
and the format settings I've just defined.
And in this example, I have a simple button that I'm using
to toggle the state of the recorder.
So when I press the button, if the recorder is recording,
I go ahead and stop recording.
else -- I start my recording.
And I can use the recorder's built-in meters
to provide feedback to the UI.
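Sketched in Swift (error handling elided; `recordingURL` and the UI wiring are assumptions):

```swift
import AVFoundation

// AAC format settings with a high bit rate.
let settings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44_100.0,
    AVNumberOfChannelsKey: 2,
    AVEncoderBitRateKey: 256_000
]

let recorder = try AVAudioRecorder(url: recordingURL, settings: settings)
recorder.isMeteringEnabled = true

// A button toggles the recorder's state.
func toggleRecording() {
    if recorder.isRecording {
        recorder.stop()
    } else {
        recorder.record()
    }
}

// The built-in meters provide feedback to the UI.
func updateMeterUI() {
    recorder.updateMeters()
    let level = recorder.averagePower(forChannel: 0)  // in dBFS
    // ... drive a level meter with `level`
}
```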
Lastly, let's look at AVPlayer.
AVPlayer works not only with local files
but streaming content as well.
You have all the standard control available.
We also provide built-in user interfaces
that you can use directly, such as the AVPlayerView
and the AVPlayerViewController.
And AVPlayer also works with video content as well.
And this year, we added a number of new features to AVPlayer.
So if you want to find out what we did, you can check
out the Advances in AVFoundation Playback.
And if you missed that, you can go ahead and catch it online.
OK, so what we've seen so far is just some very simple examples
of playback and recording.
So now let's look at some more advanced use cases.
Advanced use cases include playing back not only from files
but working with buffers of audio data as well.
You may be interested in doing some audio processing,
applying certain effects
and mixing together multiple sources.
Or you may be interested in implementing 3D audio.
So, some examples of this are you're building a classic
karaoke app, you want to build a deejay app
with really amazing effects, or you want to build a game
and really immerse your user in it.
So, for such advanced use cases, we have a class
in AVFoundation called AVAudioEngine.
AVAudioEngine is a powerful,
feature-rich Objective-C and Swift API.
It's a real-time audio system, and it simplifies working
with real-time audio
by providing a non-real-time interface for you.
So this has a lot of complexities dealing
with real-time audio,
and it makes your code that much simpler.
The Engine manages a graph of nodes,
and these nodes let you play and record audio.
You can connect these nodes in various ways
to form many different processing chains
and perform mixing.
You can capture audio at any point
in the processing chain as well.
And we provide a special node
that lets you spatialize your audio.
So, let's look
at the fundamental building block -- the AVAudioNode.
We have three types of nodes.
We have source nodes, which provide data for rendering.
So these could be your PlayerNode,
an InputNode, or a sampler unit.
We have processing nodes that let you process audio data.
So these could be effects such as delays,
distortions, and mixers.
And we have the destination node,
which is the termination node in your graph,
and it's connected directly to the output hardware.
So let's look at a sample setup.
Let's just say I'm building a classic karaoke app.
In this case, I have three source nodes.
I'm using the InputNode to capture the user's voice.
I'm using a PlayerNode to play my Backing Track.
I'm using another PlayerNode to play other sound effects
and feedback cues to the user.
In terms of processing nodes,
I may want to apply a specific EQ to the user's voice.
And then I'm going to use the mixer
to mix all three sources into a single output.
And then the single output will then be played
through the OutputNode and then out to the output hardware.
I can also capture the user's voice and do some analysis
to see how well they're performing
by installing a TapBlock.
And then based on that,
I can conditionally schedule these feedback cues
to be played out.
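The karaoke graph just described could be wired up roughly like this (the `analyze` routine is a hypothetical stand-in for the performance analysis):

```swift
import AVFoundation

let engine = AVAudioEngine()
let backingPlayer = AVAudioPlayerNode()
let effectsPlayer = AVAudioPlayerNode()
let voiceEQ = AVAudioUnitEQ(numberOfBands: 4)

engine.attach(backingPlayer)
engine.attach(effectsPlayer)
engine.attach(voiceEQ)

let input = engine.inputNode
let mixer = engine.mainMixerNode

// Voice -> EQ -> mixer; both players -> mixer; mixer -> output is implicit.
engine.connect(input, to: voiceEQ, format: input.outputFormat(forBus: 0))
engine.connect(voiceEQ, to: mixer, format: nil)
engine.connect(backingPlayer, to: mixer, format: nil)
engine.connect(effectsPlayer, to: mixer, format: nil)

// Tap the user's voice to analyze how well they're performing.
input.installTap(onBus: 0, bufferSize: 4096,
                 format: input.outputFormat(forBus: 0)) { buffer, when in
    analyze(buffer)   // hypothetical analysis routine
}

try engine.start()
```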
So let's now look at a sample game setup.
The main node of interest here is the EnvironmentNode,
which simulates a 3D space
and spatializes its connected sources.
In this example, I'm using the InputNode as well
as a PlayerNode as my source.
And you can also adjust various 3D mixing properties
on your sources as well, such as position, occlusion.
And in terms of the EnvironmentNode,
you can also adjust properties there,
such as the listenerPosition as well as other reverb parameters.
So this 3D Space can then be mixed in with a Backing Track
and then played through the output.
So before we move any further with AVAudioEngine,
I want to look at some fundamental core classes
that the Engine uses extensively.
I'll first start with AVAudioFormat.
So, AVAudioFormat describes the data format
in an audio file or stream.
So we have our standard format, common formats,
as well as compressed formats.
This class also contains an AVAudioChannelLayout
which you may use when dealing with multichannel audio.
It's a modern interface
to our AudioStreamBasicDescription
structure and our AudioChannelLayout structure.
Now, let's look at AVAudioBuffer.
This class has two subclasses.
It has the AVAudioPCMBuffer, which is used to hold PCM data.
And it has the AVAudioCompressedBuffer,
which is used for holding compressed audio data.
Both of these classes provide a modern interface
to our AudioBufferList and our AudioStreamPacketDescription.
Let's look at AVAudioFile.
This class lets you read and write from any supported format.
It lets you read data into PCM buffers and write data
into a file from PCM buffers.
And in doing so,
it transparently handles any encoding and decoding.
And it supersedes now our AudioFile and ExtAudioFile APIs.
Lastly, let's look at AVAudioConverter.
This class handles audio format conversion.
So, you can convert between one form of PCM data to another.
You can also convert between PCM and compressed audio formats
in which it handles the encoding and decoding for you.
And this class supersedes our AudioConverter API.
And new this year, we've also added a minimum phase sample
rate converter algorithm.
So you can see that all these core classes really work
together when interfacing with audio data.
Now, let's look at how these classes then interact with the Engine.
So if you look at AVAudioNode, it has both input
and output AVAudio formats.
If you look at the PlayerNode, it can provide data to the Engine
from an AVAudioFile or an AVAudioPCMBuffer.
When you install a NodeTap, the block provides audio data to you
in the form of PCM buffers.
You can do analysis with it, or then you can save it
to a file using an AVAudioFile.
If you're working with a compressed stream,
you can break it down into compressed buffers,
use an AVAudioConverter to convert it to PCM buffers,
and then provide it to the Engine through the PlayerNode.
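That decode path might look roughly like this (`compressedFormat`, `nextCompressedBuffer()`, and `playerNode` are hypothetical; real code would loop and handle errors):

```swift
import AVFoundation

// Decode compressed buffers into PCM with AVAudioConverter.
let pcmFormat = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 2)!
let converter = AVAudioConverter(from: compressedFormat, to: pcmFormat)!

let pcmBuffer = AVAudioPCMBuffer(pcmFormat: pcmFormat, frameCapacity: 4096)!
var conversionError: NSError?

let status = converter.convert(to: pcmBuffer, error: &conversionError) { packetCount, inputStatus in
    guard let compressed = nextCompressedBuffer() else {  // hypothetical source of AVAudioCompressedBuffers
        inputStatus.pointee = .endOfStream
        return nil
    }
    inputStatus.pointee = .haveData
    return compressed
}

if status == .haveData {
    // Provide the decoded PCM to the Engine through the PlayerNode.
    playerNode.scheduleBuffer(pcmBuffer, completionHandler: nil)
}
```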
So, new this year, we're bringing a subset
of AVAudioEngine to the Watch.
Along with that, we're including a subset of AVAudioSession,
as well as all the core classes you've just seen.
So I'm sure you'd love to see a demo of this.
So we have that for you.
We built a simple game using both SceneKit
and AVAudioEngine directly.
And in this game, what I'm doing is I'm launching an asteroid
And at the bottom of the screen, I have a flame.
And I can control the flame using the Watch's Digital Crown.
And now if the asteroid makes contact with the flame,
it plays this really loud explosion sound.
So, let's see this.
[ Explosions ]
I'm sure this game, like, defies basic laws of physics
because it's playing audio in space.
Right? And that's not possible.
All right, so let me just go
over quickly the AVAudioEngine code in this game.
So, in my class, I have my AVAudioEngine.
And I have two PlayerNodes --
one for playing the explosion sound,
and one for playing the launch sound.
I also have URLs to my audio assets.
And in this example, I'm using buffers
to provide data to the engine.
So, let's look at how we set up the engine.
The first thing I do is I go ahead
and I attach my PlayerNodes.
So I attach the explosionPlayer and the launchPlayer.
Next, I'm going to use the core classes.
I'm going to create an AVAudioFile from the URL of my assets.
And then, I'm going to create a PCM buffer.
And I'm going to read the data
from the file into the PCM buffer.
And I can do this because my audio files are really short.
Next, I'll go ahead and make the connections
between the source nodes and the engine's main mixer.
So, when the game is about to start, I go ahead
and I start my engine and I start my players.
And when I launch an asteroid, I simply schedule the launchBuffer
to be played on the launchPlayer.
And when the asteroid makes contact with the flame,
I simply schedule the explosionBuffer to be played
on the explosionPlayer.
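Putting the described steps together, the game's audio setup might be reconstructed like this (asset URLs are hypothetical, and the launch-sound loading mirrors the explosion's):

```swift
import AVFoundation

let engine = AVAudioEngine()
let explosionPlayer = AVAudioPlayerNode()
let launchPlayer = AVAudioPlayerNode()

func setupAudio(explosionURL: URL, launchURL: URL) throws {
    engine.attach(explosionPlayer)
    engine.attach(launchPlayer)

    // The files are short, so read each one fully into a PCM buffer.
    let explosionFile = try AVAudioFile(forReading: explosionURL)
    let explosionBuffer = AVAudioPCMBuffer(
        pcmFormat: explosionFile.processingFormat,
        frameCapacity: AVAudioFrameCount(explosionFile.length))!
    try explosionFile.read(into: explosionBuffer)
    // ...repeat for the launch sound...

    // Connect both source nodes to the engine's main mixer.
    engine.connect(explosionPlayer, to: engine.mainMixerNode,
                   format: explosionFile.processingFormat)
    engine.connect(launchPlayer, to: engine.mainMixerNode, format: nil)

    try engine.start()
    explosionPlayer.play()
    launchPlayer.play()
}

// On launch:  launchPlayer.scheduleBuffer(launchBuffer, completionHandler: nil)
// On impact:  explosionPlayer.scheduleBuffer(explosionBuffer, completionHandler: nil)
```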
So, with a few lines of code,
I'm able to build a really rich audio experience
for my games on watchOS.
And that was a simple example, so we can't wait
to see what you come up with.
So, before I wrap up with AVAudioEngine, I want to talk
about multichannel audio
and specifically how it relates to tvOS.
So, last October, we introduced tvOS along
with the 4th generation Apple TV.
And so this is the first time we can talk about it at WWDC.
And one of the interesting things about audio
on Apple TV is that many users are already connected
to multichannel hardware
since many home theater systems already support 5.1
or 7.1 surround sound systems.
So, today, I just want to go
over how you can render multichannel audio with AVAudioEngine.
So, first, let's review the setup with AVAudioSession.
I first set my category and other options,
and then I activate my session to configure the hardware
for my application's needs.
Now, depending on the rendering format I want to use,
I'll first need to check and see
if the current route supports it.
And I can do that by checking if my desired number
of channels is less than or equal
to the maximum number of output channels.
And if it is, then I can go ahead
and set my preferred number of output channels.
I can then query back the actual number of channels
from the session and then use that moving forward.
Optionally, I can look at the array
of ChannelDescriptions on the current port.
And each ChannelDescription gives me a channelLabel
and a channelNumber.
So I can use this information to figure out the exact format
and how I can map my content to the connected hardware.
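As a sketch, that channel negotiation on tvOS might be written like this (6 channels for 5.1 content is an assumption):

```swift
import AVFoundation

func configureMultichannelOutput() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback)
    try session.setActive(true)

    let desiredChannels = 6  // e.g. 5.1 content
    if desiredChannels <= session.maximumOutputNumberOfChannels {
        try session.setPreferredOutputNumberOfChannels(desiredChannels)
    }
    // Query back the actual count and use it moving forward.
    let actualChannels = session.outputNumberOfChannels

    // Optionally inspect the ChannelDescriptions on the current output port
    // to map content channels to the connected hardware.
    if let channels = session.currentRoute.outputs.first?.channels {
        for ch in channels {
            print(ch.channelNumber, ch.channelLabel)
        }
    }
}
```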
Now, let's switch gears and look at the AVAudioEngine setup.
There are two use cases here.
The first use case is
if you already have multichannel content.
And the second use case is if you have mono content
and you want to spatialize it.
And this is typically geared towards games.
So, in the first use case, I have multichannel content
and multichannel hardware.
I simply get the hardware format.
I set that as my connection
between my Mixer and my OutputNode.
And on the source side, I get the content format and I set
that as my connection between my SourceNode and the Mixer.
And here, the Mixer handles the channel mapping for you.
Now, in the second use case, we have a bunch of mono sources.
And we'll use the EnvironmentNode
to spatialize them.
So, like before, we get the hardware format.
But before we set the compatible format, we have to map it to one
that the EnvironmentNode supports.
And for a list of supported formats,
you can check our documentation online.
So, I set the compatible format.
And now on the source side, like before, I get the content format
and I set that as my connection between my player
and the EnvironmentNode.
Lastly, I'll also have
to set the multichannel rendering algorithm
to SoundField, which is what the EnvironmentNode supports.
And at this point, I can start my engine, start playback,
and then adjust all the various 3D mixing properties
that we support.
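A sketch of that second use case (the 5.0 layout tag is an assumption; consult the documentation for the EnvironmentNode's actual supported formats):

```swift
import AVFoundation

func setupSpatializedOutput() throws {
    let engine = AVAudioEngine()
    let environment = AVAudioEnvironmentNode()
    let player = AVAudioPlayerNode()
    engine.attach(environment)
    engine.attach(player)

    // Get the hardware format from the output node...
    let hardwareFormat = engine.outputNode.outputFormat(forBus: 0)

    // ...and map it to a layout the EnvironmentNode supports.
    let layout = AVAudioChannelLayout(layoutTag: kAudioChannelLayoutTag_AudioUnit_5_0)!
    let compatibleFormat = AVAudioFormat(
        standardFormatWithSampleRate: hardwareFormat.sampleRate,
        channelLayout: layout)
    engine.connect(environment, to: engine.outputNode, format: compatibleFormat)

    // On the source side, connect the mono content format and use SoundField.
    let monoFormat = AVAudioFormat(
        standardFormatWithSampleRate: hardwareFormat.sampleRate, channels: 1)
    engine.connect(player, to: environment, format: monoFormat)
    player.renderingAlgorithm = .soundField

    try engine.start()
    player.play()
}
```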
So, just a recap.
AVAudioEngine is a powerful, feature-rich API.
It simplifies working with real-time audio.
It enables you to work with multichannel audio and 3D audio.
And now, you can build games
with really rich audio experiences on your Watch.
And it supersedes our AUGraph and OpenAL APIs.
So we've talked a bit about the Engine in previous sessions,
so we encourage you to check those out if you can.
And at this point, I'd like to hand it over to my colleague,
Doug, to keep it rolling from here.
Thank you, Saleem.
So, I'd like to continue our tour
through the audio APIs here.
We talked about real-time audio in passing with AVAudioEngine.
Saleem emphasized that,
while the audio processing is happening in real-time context,
we're controlling it from non-real-time context.
And that's the essence of its simplicity.
But there are times when you actually want to do work
in that real-time process, or context.
So I'd like to go into that a bit.
So, what is real-time audio?
The use cases where we need to do things
in real-time are characterized by low latency.
Possibly the oldest example I'm familiar
with on our platforms is with music applications.
For example, you may be synthesizing a sound
when the user presses a key on the MIDI keyboard.
And we want to minimize the time from when
that MIDI note was struck to when the note plays.
And so we have real-time audio effects like guitar pedals.
We want to minimize the time it takes from when the audio input
of the guitar comes into the computer
through which we process it, apply delays, distortion,
and then send it back out to the amplifier.
So we need low latency there
so that the instrument, again, is responsive.
Telephony is also characterized by low latency requirements.
We've all been on phone calls with people in other countries
and had very long delay times.
It's no good in telephony.
We do a lot of signal processing.
We need to keep the latency down.
Also, in game engines, we like to keep the latency down.
The user is doing things --
interacting with joysticks, whatever.
We want to produce those sounds as quickly as possible.
Sometimes, we want to manipulate those sounds
as they're being rendered.
Or maybe we just have an existing game engine.
In all these cases, we have a need to write code that runs
in a real-time context.
In this real-time context, the main characteristic of --
our constraint is that we're operating under deadlines.
Right? Every some-number of milliseconds,
the system is waking us up, asking us to produce some audio
for that equally-small slice of time.
And we either accomplish it and produce audio seamlessly.
Or if we fail, if we take too long to produce that audio,
we create a gap in the output.
And the user hears that as a glitch.
And this is a very small interval that we have
to create our audio in.
Our deadlines are typically as small as 3 milliseconds.
And 20 milliseconds, which is default on iOS,
is still a pretty constrained deadline.
So, in this environment, we have
to be really careful about what we do.
We can't really block.
We can't allocate memory.
We can't use mutexes.
We can't access the file system or sockets.
We can't log.
We can't even call a dispatch "async"
because it allocates continuations.
And we have to be careful not to interact with the Objective-C
and Swift runtimes because they are not entirely real-time safe.
There are cases when they, too, will take mutexes.
So that's a partial list.
There other things we can't do.
The primary thing to ask yourself is,
"Does this thing I'm doing allocate memory or use mutexes?"
And if the answer is yes, then it's not real-time safe.
Well, what can we do?
I'll show you an example of that in a little bit.
But, first, I'd like to just talk
about how we manage this problem
of packaging real-time audio components.
And we do this with an API set called Audio Units.
So this is a way for us to package --
and for you, for that matter, as another developer --
to package your signal processing and modules
that can be reused in other applications.
And it also provides an API to manage the transitions
and interactions between your non-real-time context
and your real-time rendering context.
So, as an app developer, you can host Audio Units.
That means you can let the user choose one,
or you can simply hardcode references
to system built-in units.
You can also build your own Audio Units.
You can build them as app extensions or plug-ins.
And you can also simply register an Audio Unit privately
to your application.
And this is useful, for example, if you've got some small piece
of signal processing that you want to use
in the context of AVAudioEngine.
So, underneath Audio Units,
we have an even more fundamental API
which we call Audio Components.
So this is a set of APIs in the AudioToolbox framework.
The framework maintains a registry of all
of the components on the system.
Every component has a type, subtype, and manufacturer.
These are 4-character codes.
And those serve as the key
for discovering them and registering them.
And there are a number of different kinds
of Audio Components types.
The two main categories of types are Audio Units
and Audio Codecs.
But amongst the Audio Units, we have input/output units,
generators, effects, instruments,
converters, mixers as well.
And amongst codecs, we have encoders and decoders.
We also have audio file components on macOS.
Getting into the implementation of components,
there are a number of different ways
that components are implemented.
Some of them you'll need to know about if you're writing them.
And others, it's just for background.
The most highly-recommended way to create a component now
if it's an Audio Unit is
to create an Audio Unit application extension.
We introduced this last year with our 10.11 and 9.0 releases.
So those are app extensions.
Before that, Audio Units were packaged in component bundles --
as were audio codecs, et cetera.
That goes back to Mac OS 10.1 or so.
Audio Components also include inter-app audio nodes on iOS.
Node applications register themselves
with a component subtype and manufacturer key.
And host applications discover node applications
through the Audio Component Manager.
And finally, you can register -- as I mentioned before --
you can register your own components for the use
of your own application.
And just for completeness,
there are some Apple built-in components.
On iOS, they're linked into the AudioToolbox.
So those are the flavors of component implementations.
Now I'd like to focus in on just one kind of component here --
the audio input/output unit.
This is an Audio Unit.
And it's probably the one component that you'll use
if you don't use any other.
And the reason is that this is the preferred interface
to the system's basic audio input/output path.
Now, on macOS, that basic path is in the Core Audio framework.
We call it the Audio HAL,
and it's a pretty low-level interface.
It makes its clients deal with interesting stream topologies
on multichannel devices, for example.
So, it's much easier to deal with the Audio HAL interface
through an audio input/output unit.
On iOS, you don't even have access
to the Core Audio framework.
It's not public there.
You have to use an audio input/output unit
as your lowest-level way to get audio in and out of the system.
And our preferred interface now
for audio input/output units is AUAudioUnit
in the AudioToolbox framework.
If you've been working with our APIs for a while,
you're familiar with version 2 Audio Units that are part
of the system -- AUHAL on macOS and AURemoteIO on iOS as well
as Watch -- actually, I'm not sure we have it available there.
But in any case, AUAudioUnit is your new modern interface
to this low-level I/O mechanism.
So I'd like to show you what it looks
like to use AUAudioUnit to do audio I/O.
So I've written a simple program in Swift here
that generates a square wave.
And here's my signal processing.
I mentioned earlier I would show you what kinds
of things you can do here.
So this wave generator shows you.
You can basically read memory, write memory, and do math.
And that's all that's going on here.
It's making the simplest of all wave forms -- the square wave --
at least simplest from a computational point of view.
So that class is called SquareWaveGenerator.
And let's see how to play a SquareWaveGenerator
from an AUAudioUnit.
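The transcript doesn't include the SquareWaveGenerator source, but a minimal sketch consistent with the description above (read memory, write memory, do math) might look like this; the property names and the `render(_:)` signature are assumptions:

```swift
// Hypothetical sketch of the SquareWaveGenerator described in the session.
// The exact interface is an assumption; only the "read memory, write memory,
// do math" behavior is taken from the talk.
class SquareWaveGenerator {
    let sampleRate: Double
    let frequency: Double
    let amplitude: Float
    private var counter: Double = 0

    init(sampleRate: Double, frequency: Double, amplitude: Float = 0.25) {
        self.sampleRate = sampleRate
        self.frequency = frequency
        self.amplitude = amplitude
    }

    // Fill one channel's buffer: +amplitude for the first half of each
    // cycle, -amplitude for the second half.
    func render(_ buffer: UnsafeMutableBufferPointer<Float>) {
        let framesPerCycle = sampleRate / frequency
        for i in 0..<buffer.count {
            buffer[i] = counter < framesPerCycle / 2 ? amplitude : -amplitude
            counter += 1
            if counter >= framesPerCycle { counter -= framesPerCycle }
        }
    }
}
```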
So the first thing we do is create an AudioComponentDescription.
And this tells us which component to go look for.
The type is output.
The subtype is something I chose here depending on platform --
either RemoteIO or HalOutput.
We've got the Apple manufacturer and some unused flags.
Then I can create my AUAudioUnit using my component description.
So I'll get that unit that I wanted.
And now it's open and I can start to configure it.
So the first thing I want to do here is find
out how many channels of audio are on the system.
There are ways to do this with AVAudioSession on iOS.
But most simply and portably,
you can simply query the outputBusses
of the input/output unit.
And outputBusses[0] is the output-directed stream.
So I'm going to fetch its format,
and that's my hardware format.
Now this hardware format may be something exotic.
It may be interleaved, for example.
And I don't know that I want to deal with that.
So I'm just going to create a renderFormat.
That is a standard format with the same sample rate.
And some number of channels.
Just to keep things short and simple, I'm only going
to render two channels,
regardless of the hardware channel count.
So that's my renderFormat.
Now, I can tell the I/O unit, "This is the format I want
to give you on inputBus."
So, having done this, the unit will now convert my renderFormat
to the hardwareFormat.
And in this case, on my MacBook,
it's going to take this deinterleaved floating point
and convert it to interleaved floating point buffers.
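Put together, the setup steps just described might look like the following sketch; error handling is abbreviated, and this is a reconstruction rather than the session's exact code:

```swift
import AVFoundation
import AudioToolbox

// Sketch of the I/O unit setup described above.
func makeIOUnit() throws -> (AUAudioUnit, AVAudioFormat) {
    #if os(iOS) || os(tvOS)
    let subtype = kAudioUnitSubType_RemoteIO
    #else
    let subtype = kAudioUnitSubType_HALOutput
    #endif

    // Output type, platform-dependent subtype, Apple manufacturer, unused flags.
    let desc = AudioComponentDescription(componentType: kAudioUnitType_Output,
                                         componentSubType: subtype,
                                         componentManufacturer: kAudioUnitManufacturer_Apple,
                                         componentFlags: 0,
                                         componentFlagsMask: 0)

    let ioUnit = try AUAudioUnit(componentDescription: desc)

    // Query the output-directed stream for the (possibly exotic) hardware format.
    let hardwareFormat = ioUnit.outputBusses[0].format

    // A standard (deinterleaved float) stereo format at the hardware sample rate.
    guard let renderFormat = AVAudioFormat(
            standardFormatWithSampleRate: hardwareFormat.sampleRate,
            channels: 2) else {
        fatalError("could not create render format")
    }

    // The unit will convert from renderFormat to the hardware format.
    try ioUnit.inputBusses[0].setFormat(renderFormat)
    return (ioUnit, renderFormat)
}
```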
OK. So, next, I'm going
to create my square wave generators.
If you're a music and math geek like me,
you know that A440 is there, and multiplying it
by 1.5 will give you a fifth above it.
So I'm going to render A to my left channel
and E to my right channel.
And here's the code that will run in the real-time context.
There's a lot of parameters here,
and I actually only need a couple of them.
I only need the frameCount and the rawBufferList.
The rawBufferList is a difficult, low-level C structure
which I can rewrap in Swift using an overlay on the SDK.
And this takes the audio bufferList
and makes it look something like a vector or array.
So having converted the rawBufferList
to the nice Swift wrapper, I can query its count.
And if I got at least one buffer,
then I can render the left channel.
If I got at least two buffers, I can render the right channel.
And that's all the work I'm doing right here.
Of course, there's more work inside the wave generators,
but that's all of the real-time context work.
So, now, I'm all setup.
I'm ready to render.
So I'm going to tell the I/O unit,
"Do any allocations you need to do to start rendering."
Then, I can have it actually start the hardware,
run for 3 seconds, and stop.
And that's the end of this simple program.
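The render path and lifecycle just described could be sketched like this, assuming the SquareWaveGenerator class the session describes exposes a `render(_:)` method that fills a Float buffer (an assumption about its exact interface):

```swift
import AVFoundation
import AudioToolbox

// Sketch of the real-time output provider described above.
func installSquareWaveProvider(on ioUnit: AUAudioUnit, renderFormat: AVAudioFormat) {
    // A440 on the left; multiplying by 1.5 gives the E a fifth above, on the right.
    let left = SquareWaveGenerator(sampleRate: renderFormat.sampleRate, frequency: 440.0)
    let right = SquareWaveGenerator(sampleRate: renderFormat.sampleRate, frequency: 440.0 * 1.5)

    ioUnit.outputProvider = { actionFlags, timestamp, frameCount, busIndex, rawBufferList in
        // Rewrap the low-level C AudioBufferList as a Swift collection.
        let bufferList = UnsafeMutableAudioBufferListPointer(rawBufferList)
        if bufferList.count > 0, let data = bufferList[0].mData {
            left.render(UnsafeMutableBufferPointer(
                start: data.assumingMemoryBound(to: Float.self), count: Int(frameCount)))
        }
        if bufferList.count > 1, let data = bufferList[1].mData {
            right.render(UnsafeMutableBufferPointer(
                start: data.assumingMemoryBound(to: Float.self), count: Int(frameCount)))
        }
        return noErr
    }
}

// Sketch of the lifecycle: allocate, start the hardware, run, stop.
func run(ioUnit: AUAudioUnit) throws {
    try ioUnit.allocateRenderResources()  // "do any allocations you need"
    try ioUnit.startHardware()            // start the hardware
    sleep(3)                              // run for 3 seconds
    ioUnit.stopHardware()                 // and stop
}
```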
[ Monotone ]
So, that's AUAudioUnit.
I'd like to turn next briefly to some other kinds of Audio Units.
We have effects which take audio input, produce audio output.
Instruments which take something resembling MIDI as input
and also produce audio output.
And generators which produce audio output
without anything going in except maybe some parametric control.
If I were to repackage my square wave generator as an Audio Unit,
I would make it a generator.
So to host these kinds of Audio Units,
you can also use AUAudioUnit.
You can use a separate block to provide input to it.
It's very similar to the output provider block
that you saw on the I/O unit.
You can chain together these render blocks of units
to create your own custom topologies.
You can control the units using their parameters.
And also, many units, especially third-party units,
have nice user interfaces.
As a hosting application, you can obtain
that audio unit's view, display it in your application,
and let the user interact with it.
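As a rough sketch of that hosting pattern -- driving a hosted unit's render block and supplying its input through a pull-input block -- the following is illustrative only; the component, formats, and input source are placeholders, not the session's code:

```swift
import AVFoundation
import AudioToolbox

// Hedged sketch: render one buffer from a hosted effect unit. Chaining units
// means calling the upstream unit's render block inside the downstream
// unit's pull-input block.
func renderOnce(effect: AUAudioUnit, into output: AVAudioPCMBuffer,
                frameCount: AUAudioFrameCount) -> AUAudioUnitStatus {
    var flags = AudioUnitRenderActionFlags()
    var timestamp = AudioTimeStamp()
    return effect.renderBlock(&flags, &timestamp, frameCount, 0,
                              output.mutableAudioBufferList) {
        flags, timestamp, frames, bus, inputData -> AUAudioUnitStatus in
        // Fill inputData with the effect's input audio here --
        // for example, by calling another unit's render block.
        return noErr
    }
}
```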
Now if you'd like to write your own Audio Unit,
the way I would start is just building it
within the context of an app.
This lets you debug without worrying
about inter-process communication issues.
It's all in one process.
So, you start by subclassing AUAudioUnit.
You register it as a component using this class method.
Then, you can debug it.
And once you've done that --
and if you decide you'd like to distribute it
as an Audio Unit extension --
you can take that same AUAudioUnit subclass.
You might fine-tune and polish it some more.
But then you have to do a small amount of additional work
to package this as an Audio Unit extension.
So you've got an extension.
You can embed it in an application.
You can sell that application on the App Store.
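In code, that in-process registration step might look like this; the subclass body and the four-character codes ('sqgn', 'Demo') are hypothetical:

```swift
import AVFoundation
import AudioToolbox

// Sketch of registering an AUAudioUnit subclass for in-process use.
class MyGeneratorUnit: AUAudioUnit {
    // A real unit would override internalRenderBlock, inputBusses,
    // outputBusses, allocateRenderResources(), and so on.
}

let myDescription = AudioComponentDescription(
    componentType: kAudioUnitType_Generator,
    componentSubType: 0x7371_676E,      // 'sqgn' -- made up
    componentManufacturer: 0x4465_6D6F, // 'Demo' -- made up
    componentFlags: 0,
    componentFlagsMask: 0)

// Register the subclass so this process can instantiate it by description.
AUAudioUnit.registerSubclass(MyGeneratorUnit.self,
                             as: myDescription,
                             name: "Demo: SquareWaveGenerator",
                             version: 1)
```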
So I'd like to have my colleague, Torrey,
now show you some of the power of Audio Unit extensions.
We've had some developers doing some really cool things
with it in the last year.
How is everybody doing?
Happy to be at WWDC?
Let's make some noise.
I'm going to start here by launching -- well, first of all,
I have my instrument here.
This is my iPad Pro.
And I'm going to start by launching Arturia iSEM --
a very powerful synthesizer application.
And I have a synth trumpet sound here that I like.
[ Music ]
So I like this sound and I want to put it
in a track that I'm working on.
This is going to serve as our Audio Unit plug-in application.
And now I'm going to launch GarageBand, which is going
to serve as our Audio Unit host application.
Now, in GarageBand, I have a sick beat I've been working
on that I'm calling WWDC Demo.
Let's listen to it.
[ Music ]
We'll move into what I call "the verse portion" next.
[ Music ]
And next, we're going to work on this chorus here.
This is supposed to be the climax of the song.
I want some motion.
I want some tension.
And let's create that by bringing in an Audio Unit.
I'm going to add a new track here.
Adding an instrument, I'll see Audio Units is an option here.
If I select this, then I can see all of the Audio Units
that are hosted here on the system.
Right now, I see Arturia iSEM because I practice this at home.
Selecting iSEM, GarageBand is now going
to give me an onscreen MIDI controller that I can use here.
It's complete with the scale transforms and arpeggiator here
that I'm going to make use of because I like a lot of motion.
Over here on the left, you can see a Pitch/Mod Wheel.
You can even modify the velocity.
And here is the view that the Audio Unit has provided to me
that I can actually tweak.
For now, I'm going to record in a little piece here
and see what it sounds like in context.
[ Music ]
All right, pretty good.
Let's see what it sounds like in context.
[ Music ]
There we go.
That's the tension that I want.
Now, let me dig in here a little bit more
and show you what I've done.
I'm going to edit here.
And I'll look into this loop a little bit more.
There are two observations that I'd like you to make here.
The first one is that these are MIDI events.
The difference between using inter-app audio
and using Audio Units
as a plug-in is you'll actually get MIDI notes here,
which is much easier to edit after the fact.
The other observation I'd like you to make here is
that you see these individual MIDI notes here
but you saw me play one big, fat-fingered chord.
So, it's because I've taken advantage
of the arpeggiator that's built into GarageBand
that I've got these individual notes.
And I can play around with these if I want to
and make them sound a bit more human.
But I'm happy with this recording as it is.
The last thing that I'd actually like to show you here is, first,
I'm going to copy this into the adjacent cell.
And I told you earlier that the Audio Unit view that's provided
here is actually interactive.
It's not just a pretty picture.
So if you were adventurous, you could even try
to give a little performance for your friends.
[ Music ]
Turn it up a little bit.
[ Music ]
Let's wrap it up.
[ Music ]
That concludes my demo.
I want to thank you for your time, your attention,
and always for making dope apps.
Thank you, Torrey.
So, just to recap here, you can see the session we did last year
about Audio Unit extensions.
It goes into a lot more detail about the mechanics of the API.
We just wanted to show you here what people have been doing
with it because it's so cool.
So, speaking of MIDI, we saw how GarageBand recorded Torrey's
performance as MIDI.
We have a number of APIs in the system
that communicate using MIDI, and it's not always clear
which ones to use when.
So I'd like to try to help clear that up just a little bit.
Now, you might just have a standard MIDI file like --
well, an ugly cellphone ringtone.
But MIDI files are very useful in music education.
I can get a MIDI file of a piece I want to learn.
I can see what all the notes are.
So if you have a MIDI file, you can play it back
with AVAudioSequencer.
And that will play it back in the context of an AVAudioEngine.
If you wish to control a software synthesizer
as we saw GarageBand doing with iSEM, the best API to do
that with is AUAudioUnit.
And if you'd like your AUAudioUnit to play back
into your AVAudioEngine, you can use AVAudioUnitMIDIInstrument.
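A minimal sketch of that MIDI-file playback path; the sampler instrument and the file URL are assumptions for illustration:

```swift
import AVFoundation

// Sketch: play a standard MIDI file through an AVAudioEngine.
func playMIDIFile(at url: URL) throws {
    let engine = AVAudioEngine()
    let sampler = AVAudioUnitSampler() // an AVAudioUnitMIDIInstrument subclass
    engine.attach(sampler)
    engine.connect(sampler, to: engine.mainMixerNode, format: nil)

    // The sequencer reads the MIDI file and drives instruments in the engine.
    let sequencer = AVAudioSequencer(audioEngine: engine)
    try sequencer.load(from: url, options: [])

    try engine.start()
    sequencer.prepareToPlay()
    try sequencer.start()
    // ...later: sequencer.stop(); engine.stop()
}
```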
Now there's the CoreMIDI framework
which people often think does some
of these other higher-level things.
But it's actually a very low-level API that's basically
for communicating with MIDI hardware --
for example, an external USB MIDI interface
or a Bluetooth MIDI keyboard.
We also supply a MIDI network driver.
You can use that to send raw MIDI messages between an iPad
and a MacBook for example.
You can also use the CoreMIDI framework to send MIDI
between processes in real time.
Now this gets into a gray area sometimes.
People wonder, "Well, should I use CoreMIDI to communicate
between my sequencer and my app that's listening
to MIDI and synthesizing?"
And I would say that's probably not the right API for that case.
If you're using MIDI and audio together,
I would use AUAudioUnit.
It's in the case where you're doing pure MIDI
in two applications or two entities
within an application -- maybe one is a static library
from another developer.
In those situations, you can use CoreMIDI for inter-process
or inter-entity real-time MIDI.
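A minimal CoreMIDI input client along the lines described -- talking to MIDI hardware at the lowest level -- might look like this; the client and port names are arbitrary:

```swift
import CoreMIDI

// Sketch of a minimal CoreMIDI input client.
var client = MIDIClientRef()
let clientStatus = MIDIClientCreateWithBlock("DemoClient" as CFString, &client) { _ in
    // Called when the MIDI setup changes (a device was plugged in, etc.).
}

var inputPort = MIDIPortRef()
let portStatus = MIDIInputPortCreateWithBlock(client, "DemoInput" as CFString,
                                              &inputPort) { packetList, _ in
    // Called on a CoreMIDI thread with raw MIDI packets -- parse them here.
    _ = packetList
}

// Connect every source currently on the system to the input port.
for i in 0..<MIDIGetNumberOfSources() {
    _ = MIDIPortConnectSource(inputPort, MIDIGetSource(i), nil)
}
```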
So that takes us to the end
of our grand tour of the audio APIs.
We started with applications -- and at the bottom,
the CoreAudio framework and drivers.
We looked at AVAudioEngine, how you use AVAudioSession
to get things setup on all of our platforms except macOS.
We saw how you can use AVAudioPlayer
and the AVAudioRecorder for simple playback
and recording from files.
Or if your files or network streams involve video,
you can use AVPlayer.
AVAudioEngine is a very good, high-level interface
for building complex processing graphs
and will solve a lot of problems.
You usually won't have to use any of the lower-level APIs.
But if you do, we saw how in AudioToolbox there's AUAudioUnit
that lets you communicate directly with the I/O cycle
and third-party, or your own instruments,
effects, and generators.
And finally, we took a quick look at the CoreMIDI framework.
So that's the end of my talk here.
You can visit this link for some more information.
We have a number of related sessions here.
Thank you very much.