There are many tasks that you can accomplish with Core Audio. This section will outline the architecture of Core Audio, highlighting the various uses of Mac OS X’s audio technology.
Audio Data Operations
MIDI Data Operations
Higher Level Audio Operations
Interfacing with Hardware
One of the main functions of Core Audio is to work with and manipulate audio data that is either stored on disk or already in memory. Effects can be applied to the data, and data sources can be mixed. Beyond that, Core Audio is also responsible for pulling data from input devices and sending data to output devices. Finally, data can be written back to disk as a file, and may be converted to another format along the way.
In order to use audio data from a file, it first must be read in. The Audio File API is provided for this purpose. An audio file instance can be created to act as a proxy for the file on disk, or for a buffer in memory (using callbacks).
Once the audio file has been created and bound to a file or memory, its data can be read in. If the data in the file is encoded, an audio converter is needed to convert the data into 32-bit floating-point, native-endian Pulse Code Modulated (PCM) data, also known as the canonical format. Once the data is in this format, it is ready to be used in an audio unit or by another portion of Core Audio.
It is worth noting that an audio converter instance inherently uses the audio codecs available on the system. Using an audio codec directly for this kind of data conversion is discouraged, since the Audio Converter API takes care of the buffering and other details that must be handled during a conversion.
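As a rough illustration of what "converting to the canonical format" means for the simplest case, the sketch below scales 16-bit signed integer PCM samples to floating-point values. It is not Core Audio code: the function name `int16_to_canonical` is hypothetical, the input is assumed to be uncompressed 16-bit PCM, and Python floats are 64-bit rather than 32-bit, so only the scaling step is shown.

```python
import struct


def int16_to_canonical(raw, big_endian=False):
    """Scale 16-bit signed integer PCM bytes to floats in [-1.0, 1.0).

    Hypothetical helper for illustration; a real audio converter also
    handles buffering, codecs, and sample rate changes.
    """
    if len(raw) % 2:
        raise ValueError("16-bit PCM data must contain an even number of bytes")
    count = len(raw) // 2
    # '<' = little-endian, '>' = big-endian, 'h' = signed 16-bit integer.
    fmt = (">" if big_endian else "<") + "%dh" % count
    return [sample / 32768.0 for sample in struct.unpack(fmt, raw)]
```

For example, the integer sample 16384 becomes 0.5 and -32768 becomes -1.0.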
To write a file back out to disk, simply reverse this process. Data output in the canonical format can be converted to an encoded format with an audio converter, and then saved to disk or memory via the Audio File API.
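The reverse direction can be sketched the same way. In the example below, 16-bit integer PCM stands in for the output format; the function name `canonical_to_int16` and the clamping policy are assumptions made for illustration, not the behavior of the Audio Converter API.

```python
import struct


def canonical_to_int16(samples, big_endian=False):
    """Pack floats in [-1.0, 1.0] as 16-bit signed integer PCM bytes.

    Hypothetical helper: values outside the 16-bit range are clamped
    rather than wrapped, which is one common choice among several.
    """
    ints = []
    for sample in samples:
        scaled = int(round(sample * 32768.0))
        ints.append(max(-32768, min(32767, scaled)))
    fmt = (">" if big_endian else "<") + "%dh" % len(ints)
    return struct.pack(fmt, *ints)
```

The resulting bytes could then be handed to whatever writes the file or memory buffer.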
Converting a file uses a process similar to the previous example. The Audio File API is used to open the file from disk, an audio converter takes the incoming data and converts it to the desired format, and another audio file instance is used to save the data out to disk. Again, the codecs needed to decode the incoming data and encode the outgoing data are used automatically by the converter; for instance, it is not necessary for you to read in the encoded data, convert it to the canonical format, and then encode it in the resulting format before writing it out. This service is provided by the Audio Converter API.
Playing back the contents of an audio file is one of the most common tasks that developers perform. In Core Audio, this is accomplished by reading in the data using an audio file instance. Once the instance is set up, an I/O unit instance can pull on the file, extracting the audio data and outputting it to the assigned audio device. If the data is already in the canonical format, no further decoding is needed to output the sound. If the data is in an encoded format, it must be converted into the canonical format before it can be played back.
An I/O unit is a type of audio unit that acts as a proxy for an audio device. When data is sent to it, the data is relayed to the device that the unit represents. The most common use of this is to send data to the default output, as specified by the user. The unit to use in this case is the Default Output unit. A System Output unit is also provided, which is discussed in the next example.
Each I/O unit inherits from AUConverter, an audio unit which owns an audio converter instance; this unit can be used in a graph to convert data between formats, sample rates, and the like. A GenericOutput unit adds the ability to start and stop the pulling of data to the output device.
When playing out to any piece of hardware, an AUHAL unit is needed. An instance of AUHAL can be attached to any audio device, making the instance a proxy for getting input and providing output to that device.
The Default Output unit is provided to play audio out to the user’s preferred output, as designated in System Preferences. Likewise, the System Output unit is provided to play back to the current system output device.
Any of these I/O units may be used to pull input data from its associated audio device, through any number or combination of audio units or audio unit graphs, and output back through the I/O unit. The I/O unit itself has two busses: 0 and 1, where the 0 bus is designated as the output, and the 1 bus is the input bus. The connections between the output of the 0 bus and the audio device and the input of the 1 bus and the audio device are made when the unit is associated with the device.
To process data from a device and play it back through, simply associate the device with the unit, connect the output of the 1 bus with whatever audio units or graphs are being used to process the data, and connect the output of those units to the input of the 0 bus on the I/O unit. To start the render, tell the I/O unit to render. This, in turn, will cause the unit to ask the units attached to it to render, eventually leading back to the I/O unit’s input bus, which will pull from the audio device. The data will pass through the input bus and will work its way through all of the attached units until it reaches the I/O unit’s output bus, where it will automatically be output to the audio device.
When working with streams of audio data, information about the data and the formats that the system has available becomes important. The Audio Format API provides a mechanism to get information about audio data, such as the available codecs for encoding and decoding, the encoding information for channel layouts, and panning information for use with the Matrix Mixer audio unit. In addition, the Audio File API provides a function for determining the available file formats on the system.
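The pull model described above can be sketched with a few toy classes. None of these are Core Audio types: `DeviceInput`, `GainUnit`, and `IOUnit` are hypothetical names, samples are plain Python floats, and the "device" is just a list. The point is only to show that a render request on the output side propagates upstream until it reaches the input bus.

```python
class DeviceInput:
    """Stands in for the audio device feeding the I/O unit's input (bus 1)."""

    def __init__(self, samples):
        self._samples = list(samples)
        self._pos = 0

    def render(self, frames):
        chunk = self._samples[self._pos:self._pos + frames]
        self._pos += len(chunk)
        # Pad with silence if the device has nothing more to deliver.
        return chunk + [0.0] * (frames - len(chunk))


class GainUnit:
    """A processing unit: pulls from its source, then scales each sample."""

    def __init__(self, source, gain):
        self.source = source
        self.gain = gain

    def render(self, frames):
        return [s * self.gain for s in self.source.render(frames)]


class IOUnit:
    """Toy I/O unit: bus 1 exposes device input, bus 0 feeds device output."""

    def __init__(self, device_input):
        self.input_bus = device_input  # bus 1
        self.output_source = None      # whatever is connected to bus 0
        self.played = []               # stands in for the device's output

    def connect_output(self, unit):
        self.output_source = unit

    def render(self, frames):
        # Pulling on bus 0 makes each upstream unit pull in turn,
        # ending at the device input on bus 1.
        block = self.output_source.render(frames)
        self.played.extend(block)
        return block


io_unit = IOUnit(DeviceInput([0.5, -0.5, 0.25]))
io_unit.connect_output(GainUnit(io_unit.input_bus, 2.0))
io_unit.render(2)
io_unit.render(2)
```

After the two render calls, `io_unit.played` holds the processed samples in the order they reached the output, with silence filling the final frame.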
MIDI stands for Musical Instrument Digital Interface, and it is established as the standard method of communication between music devices. Core Audio features full-fledged MIDI support, including provisions for communicating with MIDI devices and for reading in and playing back standard MIDI files.
The Music Sequence API is provided to sequence events for MIDI endpoints and audio units. One of its functions, though, is the ability to read in MIDI files and parse their contents into its tracks. Normally, each channel of MIDI data in the file can be made into one track in the sequence, allowing each track, and therefore, each channel of data, to be targeted at a different MIDI endpoint.
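To make the channel-to-track idea concrete, here is a minimal sketch that groups channel messages by the channel number carried in the low four bits of the status byte. The `(time, status, data1, data2)` tuple layout and the function name `split_by_channel` are assumptions for illustration; the Music Sequence API has its own event types.

```python
from collections import defaultdict


def split_by_channel(events):
    """Group channel voice messages into one track per MIDI channel.

    `events` is a list of (time, status, data1, data2) tuples; this
    layout is hypothetical and chosen only to keep the sketch short.
    """
    tracks = defaultdict(list)
    for event in events:
        status = event[1]
        if 0x80 <= status <= 0xEF:  # channel voice message
            tracks[status & 0x0F].append(event)
    return dict(tracks)
```

Each resulting track could then be targeted at a different endpoint, as described above.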
To play back the MIDI file as audio data, a music player is assigned to a sequence, and the sequence’s tracks are assigned to a music device. A music device is a particular type of audio unit that generates audio data by having its parameters altered; in this case, the event track is assigned to a music device which is part of a graph, and the events in that track contain the parameter changes needed to affect the output of the music device. The graph itself is assigned to a sequence, so that the sequence knows which instances its tracks are assigned to. Beyond that, the music player assigned to the sequence communicates with the I/O unit at the head of the graph, to ensure that all timing issues for outputting sound to the unit’s assigned device are taken care of. This is done inherently when the sequence is assigned to the graph, so no extra steps need to be taken for this synchronization to happen. The compressor is included to make sure a constant stream of data is being supplied to the I/O unit.
To play MIDI data back through an attached MIDI device, an event track needs to be assigned to a MIDI endpoint, a proxy for a MIDI device. As with the previous example, the music player will inherently communicate with Core MIDI to ensure all timing issues are solved and that a constant amount of data is being fed to the MIDI server, and therefore, the MIDI device.
When a MIDI control surface is being used to control the properties of a software component, like an audio unit, it is assigned to an endpoint, which, in turn, is assigned to an AUMIDIController. The AUMIDIController parses the incoming MIDI signals into parameter changes for use with an audio unit.
To play back the signals generated by a MIDI keyboard, a similar scheme is used. An endpoint is assigned to the keyboard, and the signals coming from the keyboard are assigned to an AUMIDIController, which, in turn, will issue parameter changes to a music device. The music device will synthesize the audio data, based on the parameters given to it via the AUMIDIController.
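The kind of translation involved can be sketched as follows. This is not the AUMIDIController interface: the function name, the `mapping` dictionary, and the linear scaling are all assumptions, chosen to show how a control change message (status `0xBn`, controller number, value 0 to 127) might become a parameter value.

```python
def control_change_to_parameter(message, mapping):
    """Translate a MIDI control change into a (name, value) parameter change.

    `message` is a (status, controller, value) tuple. `mapping` maps a
    controller number to (parameter_name, minimum, maximum). Both shapes
    are hypothetical. Returns None for messages that are not mapped.
    """
    status, controller, value = message
    if (status & 0xF0) != 0xB0:  # not a control change
        return None
    target = mapping.get(controller)
    if target is None:
        return None
    name, minimum, maximum = target
    # Linearly scale the 7-bit controller value into the parameter range.
    return name, minimum + (maximum - minimum) * (value / 127.0)
```

For instance, with controller 7 mapped to a hypothetical "volume" parameter ranging from 0.0 to 1.0, a value of 127 yields 1.0 and a value of 0 yields 0.0.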
To take in MIDI data for saving, it is common to have already-existing data playing, while new data comes in and is recorded. The playback of existing data is handled as before, with the track being assigned to a music device, which outputs its data, via a graph, to an I/O unit. Beyond that, however, the data coming in from any MIDI device needs to be parsed and placed in another event track within the sequence. The Music Player API provides functions for determining when to place the events, based on the time the event happens.
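As a simplified picture of placing a recorded event by its arrival time, the sketch below converts seconds since playback started into beats and inserts the event into a time-ordered track. It assumes a constant tempo, and the names `seconds_to_beats` and `record_event` are hypothetical; they are not Music Player API functions.

```python
import bisect


def seconds_to_beats(seconds, tempo_bpm):
    """Convert elapsed seconds to beats, assuming a constant tempo."""
    return seconds * tempo_bpm / 60.0


def record_event(track, arrival_seconds, tempo_bpm, message):
    """Insert `message` into `track`, keeping the track sorted by beat.

    `track` is a list of (beat, message) tuples; `message` is a tuple
    of MIDI bytes. Both shapes are assumptions for this sketch.
    """
    beat = seconds_to_beats(arrival_seconds, tempo_bpm)
    bisect.insort(track, (beat, message))
    return beat
```

Events that arrive out of order still end up at the correct position in the track, because insertion is keyed on the computed beat.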
Often, elements from the audio data operations and the MIDI data operations come together to provide a complete audio experience. These examples look at some cases where MIDI data is synthesized and also output to a MIDI device concurrently, or when events control a music device’s synthesis of audio data and the parameters of an audio unit, all while mixing in data from an encoded file being read off of disk.
In this example, you can see a music sequence being used to control the synthesis of sound, via an audio unit graph containing a music device, while additional events are sent to MIDI endpoints, which, in turn, are assigned to MIDI devices. This is common when using the Mac as another MIDI device, generating synthesized data to accompany an external MIDI device. Note that the music player automatically takes care of all timing issues between the different outputs, ensuring that the output remains in sync.
This example focuses on an audio unit graph, which is used to mix synthesized MIDI data, via a music device, and audio data coming in from a file. This scenario is common in gaming situations, where ambient noises are saved as MIDI data, and the soundtrack is an encoded file on disk. Note that the sequence controls a 3D Mixer audio unit, often used to mix various audio sources and to provide a spatial orientation for the sources and the output. As with the previous example, the music player will ensure that the output is in sync with the sequence.
Most of the processing done with audio and MIDI data in Core Audio will eventually be played back via audio or MIDI hardware. As a developer, it will be helpful to you if you understand the architecture behind the hardware interfaces, even if an abstraction is used when developing an application.
When accessing audio hardware, whether it be via on-board audio inputs and outputs, USB, or other means, a driver must exist to handle the exchange of data between the hardware and the Mac. In order for the driver to be used by Core Audio, it must conform to the IO Audio Family of IOKit drivers; this means that the driver must implement IO Audio Device functionality, so that proper communication can exist between itself and the Hardware Abstraction Layer.
The Hardware Abstraction Layer, or HAL, is provided to make discovery of and access to audio hardware simpler. Each driver in the IO Audio Family is represented as an audio device in the HAL. To make communication with audio devices easier, an I/O unit may be created and bound to an audio device, allowing a device to be used as a source, destination, or both in connections with audio units and audio unit graphs. This is common and encouraged when working with audio hardware.
The MIDI hardware architecture is different from that of audio hardware, in that MIDI drivers are in user space, usually working with default drivers provided by the operating system. This means that raw incoming and outgoing data is passed between the hardware and the MIDI driver, and the MIDI driver takes care of the formatting and preparation of the data. The MIDI Server then works with Core MIDI, routing MIDI data via endpoints, the abstraction provided to allow for easy access to MIDI devices.
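One example of the formatting work involved in turning raw bytes into usable MIDI data is splitting a byte stream into discrete messages, including the MIDI convention of running status (a repeated status byte may be omitted). The sketch below is an assumption about what such a step could look like, not a description of any actual MIDI driver; the function name `parse_midi_stream` is hypothetical, and system messages are simply skipped.

```python
def parse_midi_stream(data):
    """Split raw MIDI bytes into (status, data...) channel message tuples.

    Handles running status. Real-time bytes (0xF8-0xFF) are skipped, and
    system common/exclusive messages (0xF0-0xF7) are ignored and cancel
    running status. This is a simplification for illustration.
    """
    messages = []
    status = None
    pending = []
    for byte in data:
        if byte >= 0xF8:
            continue  # real-time messages may appear anywhere
        if byte & 0x80:
            # A new status byte starts a new message.
            status = byte if byte < 0xF0 else None
            pending = []
            continue
        if status is None:
            continue  # data byte with no active channel status
        pending.append(byte)
        # Program change and channel pressure carry one data byte;
        # the other channel messages carry two.
        needed = 1 if (status & 0xF0) in (0xC0, 0xD0) else 2
        if len(pending) == needed:
            messages.append((status,) + tuple(pending))
            pending = []  # keep `status` so running status works
    return messages
```

Given the bytes `90 3C 64 3E 64`, the sketch produces two note-on messages, the second one reusing the status byte of the first.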
Last updated: 2004-03-25