Audio Unit Architecture

The internal architecture of an audio unit consists of scopes, elements, connections, and channels, all of which serve the audio processing code. Figure 2-1 illustrates these parts as they exist in a typical effect unit. This section describes each of these parts in turn. For a discussion of the section marked DSP in the figure, which represents the audio processing code in an effect unit, see “Synthesis, Processing, and Data Format Conversion Code.”


Figure 2-1  Audio unit architecture for an effect unit

In this section:

Audio Unit Scopes
Audio Unit Elements
Audio Unit Connections
Audio Unit Channels


Audio Unit Scopes

An audio unit scope is a programmatic context. Unlike the general computer science notion of scopes, however, audio unit scopes cannot be nested. Each scope is a discrete context.

You use scopes when writing code that sets or retrieves values of parameters and properties. For example, Listing 2-1 shows an implementation of a standard GetProperty method, as used in the effect unit you build in “Tutorial: Building a Simple Effect Unit with a Generic View”:

Listing 2-1  Using “scope” in the GetProperty method

ComponentResult TremoloUnit::GetProperty (
    AudioUnitPropertyID    inID,
    AudioUnitScope         inScope,   // the host specifies the scope
    AudioUnitElement       inElement,
    void                   *outData
) {
    return AUEffectBase::GetProperty (inID, inScope, inElement, outData);
}

When a host application calls this method to retrieve the value of a property, the host specifies the scope in which the property is defined. The implementation of the GetProperty method, in turn, can respond to various scopes with code such as this:

if (inScope == kAudioUnitScope_Global) {
    // respond to requests targeting the global scope
} else if (inScope == kAudioUnitScope_Input) {
    // respond to requests targeting the input scope
} else {
    // respond to other requests
}
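
The following is a minimal sketch of how a property handler might use the scope to accept or reject a request. It compiles without the Audio Unit framework, so every name in it (ToyGetProperty, kToyPropertyLatency, the kToy... scope and error constants) is a hypothetical stand-in rather than part of the SDK. The assumption illustrated is that a property describing the audio unit as a whole is answered only in the global scope.

```cpp
#include <cstdint>

// Stand-in scope values, chosen to match Listing 2-2.
enum : std::uint32_t {
    kToyScopeGlobal = 0,
    kToyScopeInput  = 1,
    kToyScopeOutput = 2
};

// Hypothetical property ID and result codes, for illustration only.
const std::uint32_t kToyPropertyLatency = 1;
const int kToyNoError         = 0;
const int kToyInvalidProperty = -1;
const int kToyInvalidScope    = -2;

// Answers a latency request, but only when it targets the global scope.
int ToyGetProperty(std::uint32_t inID, std::uint32_t inScope, double *outData) {
    if (inID != kToyPropertyLatency) {
        return kToyInvalidProperty;
    }
    if (inScope != kToyScopeGlobal) {
        return kToyInvalidScope; // the property is not defined in this scope
    }
    *outData = 0.005; // an arbitrary latency, in seconds
    return kToyNoError;
}
```

A request for kToyPropertyLatency in the input scope returns kToyInvalidScope and leaves outData untouched, while the same request in the global scope succeeds.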

Apple defines five scopes in the AudioUnitProperties.h header file in the Audio Unit framework, shown in Listing 2-2:

Listing 2-2  Audio unit scopes

enum {
    kAudioUnitScope_Global   = 0,
    kAudioUnitScope_Input    = 1,
    kAudioUnitScope_Output   = 2,
    kAudioUnitScope_Group    = 3,
    kAudioUnitScope_Part     = 4
};

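The three most important scopes are the input scope (kAudioUnitScope_Input), the output scope (kAudioUnitScope_Output), and the global scope (kAudioUnitScope_Global). The input and output scopes are the contexts for audio data entering and leaving an audio unit. The global scope is the context for characteristics that apply to the audio unit as a whole.

Host applications can query the global scope of an audio unit to get the values of these characteristics.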

There are two additional audio unit scopes, intended for instrument units, defined in AudioUnitProperties.h: the group scope (kAudioUnitScope_Group) and the part scope (kAudioUnitScope_Part).

This version of Audio Unit Programming Guide does not discuss group scope or part scope.

Audio Unit Elements

An audio unit element is a programmatic context that is nested within a scope. Most commonly, elements come into play in the input and output scopes. Here, they serve as programmatic analogs of the signal buses used in hardware audio devices. Because of this analogy, audio unit developers often refer to elements in the input or output scopes as buses; this document follows suit.

As you may have noticed in Listing 2-1, hosts specify the element as well as the scope they are targeting when getting or setting properties or parameters. Here is that method again, with the inElement parameter highlighted:

Listing 2-3  Using “element” in the GetProperty method

ComponentResult TremoloUnit::GetProperty (
    AudioUnitPropertyID    inID,
    AudioUnitScope         inScope,
    AudioUnitElement       inElement,  // the host specifies the element here
    void                   *outData
) {
    return AUEffectBase::GetProperty (inID, inScope, inElement, outData);
}

Elements are identified by integer numbers and are zero indexed. In the input and output scopes, element numbering must be contiguous. In the typical case, the input and output scopes each have one element, namely element (or bus) 0.

The global scope in an audio unit is unusual in that it always has exactly one element. Therefore, the global scope’s single element is always element 0.
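
The numbering rules above can be sketched as a simple range check. This is an illustration only: BusLayout and ElementExists are hypothetical names, and the scope constants are local stand-ins whose values match Listing 2-2, so that the code compiles without the Audio Unit framework.

```cpp
#include <cstdint>

// Stand-in scope values, chosen to match Listing 2-2.
enum : std::uint32_t {
    kToyScopeGlobal = 0,
    kToyScopeInput  = 1,
    kToyScopeOutput = 2
};

// Hypothetical description of how many buses an audio unit exposes.
struct BusLayout {
    std::uint32_t inputBusCount;
    std::uint32_t outputBusCount;
};

// Returns true when (scope, element) names an element that exists.
// Elements are zero indexed and contiguous, so a range check is enough.
bool ElementExists(const BusLayout &layout,
                   std::uint32_t scope,
                   std::uint32_t element) {
    switch (scope) {
        case kToyScopeGlobal: return element == 0; // exactly one element
        case kToyScopeInput:  return element < layout.inputBusCount;
        case kToyScopeOutput: return element < layout.outputBusCount;
        default:              return false; // group and part scopes are not modeled
    }
}
```

For the typical layout of one input bus and one output bus, only element 0 exists in each of the three scopes.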

A bus (that is, an input or output element) always has exactly one stream format. The stream format specifies a variety of characteristics for the bus, including sample rate and number of channels. Stream format is described by the audio stream description structure (AudioStreamBasicDescription), declared in the CoreAudioTypes.h header file and shown in Listing 2-4:

Listing 2-4  The audio stream description structure

struct AudioStreamBasicDescription {
    Float64 mSampleRate;        // sample frames per second
    UInt32  mFormatID;          // a four-char code indicating stream type
    UInt32  mFormatFlags;       // flags specific to the stream type
    UInt32  mBytesPerPacket;    // bytes per packet of audio data
    UInt32  mFramesPerPacket;   // frames per packet of audio data
    UInt32  mBytesPerFrame;     // bytes per frame of audio data
    UInt32  mChannelsPerFrame;  // number of channels per frame
    UInt32  mBitsPerChannel;    // bit depth
    UInt32  mReserved;          // padding
};
typedef struct AudioStreamBasicDescription AudioStreamBasicDescription;

An audio unit can let a host application get and set the stream formats of its buses using the kAudioUnitProperty_StreamFormat property, declared in the AudioUnitProperties.h header file. This property’s value is an audio stream description structure.
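
As an illustration of what a stream format value contains, the sketch below fills in a description for noninterleaved, 32-bit floating point linear PCM. ToyStreamDescription is a local stand-in that copies the field names of Listing 2-4 so that the code compiles without Core Audio. The format ID and flag values are assumptions made for this sketch and should be checked against CoreAudioTypes.h, and MakeNoninterleavedFloat32 is a hypothetical helper.

```cpp
#include <cstdint>

// Local stand-in that mirrors the fields of Listing 2-4.
struct ToyStreamDescription {
    double        mSampleRate;
    std::uint32_t mFormatID;
    std::uint32_t mFormatFlags;
    std::uint32_t mBytesPerPacket;
    std::uint32_t mFramesPerPacket;
    std::uint32_t mBytesPerFrame;
    std::uint32_t mChannelsPerFrame;
    std::uint32_t mBitsPerChannel;
    std::uint32_t mReserved;
};

// Assumed values; verify them against CoreAudioTypes.h before relying on them.
const std::uint32_t kToyFormatLinearPCM      = 0x6C70636D; // four-char code 'lpcm'
const std::uint32_t kToyFlagIsFloat          = 1u << 0;
const std::uint32_t kToyFlagIsPacked         = 1u << 3;
const std::uint32_t kToyFlagIsNonInterleaved = 1u << 5;

ToyStreamDescription MakeNoninterleavedFloat32(double sampleRate,
                                               std::uint32_t channels) {
    ToyStreamDescription d = {}; // zero every field, including mReserved
    d.mSampleRate       = sampleRate;
    d.mFormatID         = kToyFormatLinearPCM;
    d.mFormatFlags      = kToyFlagIsFloat | kToyFlagIsPacked | kToyFlagIsNonInterleaved;
    d.mChannelsPerFrame = channels;
    d.mBitsPerChannel   = static_cast<std::uint32_t>(8 * sizeof(float));
    // Noninterleaved data puts each channel in its own buffer, so one frame
    // within a buffer is a single sample.
    d.mBytesPerFrame    = static_cast<std::uint32_t>(sizeof(float));
    d.mFramesPerPacket  = 1; // uncompressed audio: one frame per packet
    d.mBytesPerPacket   = d.mBytesPerFrame * d.mFramesPerPacket;
    return d;
}
```

Calling MakeNoninterleavedFloat32(44100.0, 2) describes a stereo bus at 44.1 kHz. The byte counts describe one channel because, in the noninterleaved case, each channel travels in its own buffer.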

Typically, you will need just a single input bus and a single output bus in an audio unit. When you create an effect unit by subclassing the AUEffectBase class, you get one input and one output bus by default. Your audio unit can specify additional buses by overriding the main class’s constructor. You would then indicate additional buses using the kAudioUnitProperty_BusCount property, or its synonym kAudioUnitProperty_ElementCount, both declared in the AudioUnitProperties.h header file.

You might find additional buses helpful if you are building an interleaver or deinterleaver audio unit, or an audio unit that contains a primary audio data path as well as a sidechain path for modulation data.

A bus can have exactly one connection, as described next.

Audio Unit Connections

A connection is a hand-off point for audio data entering or leaving an audio unit. Fresh audio data samples move through a connection and into an audio unit when the audio unit calls a render callback. Processed audio data samples leave an audio unit when the audio unit’s render method is called. The Core Audio SDK’s class hierarchy implements audio data hand-off, working with an audio unit’s rendering code.

Hosts establish connections at the granularity of a bus, and not of individual channels. You can see this in Figure 2-1. The number of channels in a connection is defined by the stream format, which is set for the bus that contains the connection.

Audio Processing Graph Connections

To connect one audio unit to another, a host application sets a property in the destination audio unit. Specifically, it sets the kAudioUnitProperty_MakeConnection property in the input scope of the destination audio unit. When you build your audio units using the Core Audio SDK, this property is implemented for you.

In setting a value for this property, the host specifies the source and destination bus numbers using an audio unit connection structure (AudioUnitConnection), shown in Listing 2-5:

Listing 2-5  The audio unit connection structure

typedef struct AudioUnitConnection {
    AudioUnit sourceAudioUnit;    // the audio unit that supplies audio
                                  //    data to the audio unit whose
                                  //    connection property is being set
    UInt32    sourceOutputNumber; // the output bus of the source unit
    UInt32    destInputNumber;    // the input bus of the destination unit
} AudioUnitConnection;

The kAudioUnitProperty_MakeConnection property and the audio unit connection structure are declared in the AudioUnitProperties.h file in the Audio Unit framework.

As an audio unit developer, you must make sure that your audio unit can be connected for it to be valid. You do this by supporting appropriate stream formats. When you create an audio unit by subclassing the classes in the SDK, your audio unit will be connectible. The default, required stream format for audio units is described in “Commonly Used Properties.”

Figure 1-7 illustrates that the entity upstream from an audio unit can be either another audio unit or a host application. Whichever it is, the upstream entity is typically responsible for setting an audio unit’s input stream format before a connection is established. If an audio unit cannot support the stream format being requested, it returns an error and the connection fails.
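
The order of operations described above (set the input stream format first, connect only if the format is accepted) can be modeled in a few lines. This is a toy model, not SDK code: ToyFormat, ToyUnit, ToySetInputFormat, ToyConnect, and the error constants are all hypothetical, and the acceptance rule (one supported channel count) is an assumption chosen to keep the sketch short.

```cpp
#include <cstdint>

// Toy stand-ins; none of these names come from the Audio Unit framework.
struct ToyFormat {
    double        sampleRate;
    std::uint32_t channels;
};

struct ToyUnit {
    std::uint32_t supportedChannels; // the only channel count this unit accepts
    ToyFormat     inputFormat;       // format currently set on input bus 0
    bool          inputConnected;
};

const int kToyNoError            = 0;
const int kToyFormatNotSupported = -1; // hypothetical error code

// The destination unit accepts or rejects a proposed input stream format.
int ToySetInputFormat(ToyUnit &unit, const ToyFormat &format) {
    if (format.channels != unit.supportedChannels || format.sampleRate <= 0.0) {
        return kToyFormatNotSupported;
    }
    unit.inputFormat = format;
    return kToyNoError;
}

// The upstream entity sets the format first. The connection is established
// only if the destination accepted that format.
int ToyConnect(ToyUnit &destination, const ToyFormat &upstreamFormat) {
    const int err = ToySetInputFormat(destination, upstreamFormat);
    if (err != kToyNoError) {
        return err; // the connection fails
    }
    destination.inputConnected = true;
    return kToyNoError;
}
```

A stereo-only unit rejects a mono format and stays unconnected. Offering it a stereo format afterward succeeds.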

Render Callback Connections

A host application can send audio data to an audio unit directly and can retrieve processed data from the audio unit directly. You don’t need to make any changes to your audio unit to support this sort of connection.

To prepare to send data to an audio unit, a host defines a render callback (shown in Figure 1-7) and registers it with the audio unit. The signature for the callback is declared in the AUComponent.h header file in the Audio Unit framework, as shown in Listing 2-6:

Listing 2-6  The render callback

typedef OSStatus (*AURenderCallback)(
    void                          *inRefCon,
    AudioUnitRenderActionFlags    *ioActionFlags,
    const AudioTimeStamp          *inTimeStamp,
    UInt32                        inBusNumber,
    UInt32                        inNumberFrames,
    AudioBufferList               *ioData
);

The host must explicitly set the stream format for the audio unit’s input as a prerequisite to making the connection. The audio unit calls the callback in the host when it’s ready for more audio data.

In contrast, for an audio processing graph connection, the upstream audio unit supplies the render callback. In a graph, the upstream audio unit also sets the downstream audio unit’s input stream format.

A host can retrieve processed audio data from an audio unit directly by calling the AudioUnitRender function on the audio unit, as shown in Listing 2-7:

Listing 2-7  The AudioUnitRender function

extern ComponentResult AudioUnitRender (
    AudioUnit                     ci,
    AudioUnitRenderActionFlags    *ioActionFlags,
    const AudioTimeStamp          *inTimeStamp,
    UInt32                        inOutputBusNumber,
    UInt32                        inNumberFrames,
    AudioBufferList               *ioData
);

The Core Audio SDK passes this function call into your audio unit as a call to the audio unit’s Render method.

You can see the similarity between the render callback and AudioUnitRender signatures, which reflects their coordinated use in audio processing graph connections. Like the render callback, the AudioUnitRender function is declared in the AUComponent.h header file in the Audio Unit framework.
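
The coordinated use of the two signatures can be sketched as a pull model: a request for output triggers a request for input. The code below is a deliberately simplified model with hypothetical names (ToyRenderCallback, ToyGainUnit, ToyRender, SupplyConstant). It passes a single buffer of float samples instead of an AudioBufferList and omits the action flags and time stamp, so it shows the control flow only and is not the SDK’s implementation.

```cpp
#include <cstdint>

// Simplified stand-in for the render callback signature in Listing 2-6.
typedef int (*ToyRenderCallback)(void          *inRefCon,
                                 std::uint32_t  inBusNumber,
                                 std::uint32_t  inNumberFrames,
                                 float         *ioData);

// A toy effect unit that multiplies its input by a fixed gain.
struct ToyGainUnit {
    ToyRenderCallback inputCallback; // registered by the host
    void             *inputRefCon;   // passed back to the callback unchanged
    float             gain;
};

// Stand-in for AudioUnitRender: pull fresh input, then process it in place.
int ToyRender(ToyGainUnit   &unit,
              std::uint32_t  /* inOutputBusNumber */,
              std::uint32_t  inNumberFrames,
              float         *ioData) {
    if (unit.inputCallback == nullptr) {
        return -1; // no source of input data has been registered
    }
    // The audio unit calls the callback when it needs more audio data.
    const int err = unit.inputCallback(unit.inputRefCon, 0, inNumberFrames, ioData);
    if (err != 0) {
        return err;
    }
    for (std::uint32_t i = 0; i < inNumberFrames; ++i) {
        ioData[i] *= unit.gain;
    }
    return 0;
}

// A host-side callback that supplies a constant sample value.
int SupplyConstant(void *inRefCon, std::uint32_t, std::uint32_t inNumberFrames,
                   float *ioData) {
    const float value = *static_cast<const float *>(inRefCon);
    for (std::uint32_t i = 0; i < inNumberFrames; ++i) {
        ioData[i] = value;
    }
    return 0;
}
```

A host would register SupplyConstant and its data in a ToyGainUnit and then call ToyRender to pull processed samples. In this model the callback runs only because the render call asked for output.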

Audio Unit Channels

An audio unit channel is, conceptually, a monaural, noninterleaved path for audio data samples that goes to or from an audio unit’s processing code. The Core Audio SDK represents channels as buffers. Each buffer is described by an audio buffer structure (AudioBuffer), declared in the CoreAudioTypes.h header file in the Core Audio framework and shown in Listing 2-8:

Listing 2-8  The audio buffer structure

struct AudioBuffer {
    UInt32  mNumberChannels; // number of interleaved channels in the buffer
    UInt32  mDataByteSize;   // size, in bytes, of the buffer
    void    *mData;          // pointer to the buffer
};
typedef struct AudioBuffer AudioBuffer;

An audio buffer can hold a single channel, or multiple interleaved channels. However, most types of audio units, including effect units, use only noninterleaved data. These audio units expect the mNumberChannels field in the audio buffer structure to equal 1.

Output units and format converter units can accept interleaved channels, represented by an audio buffer with the mNumberChannels field set to 2 or greater.
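
To make the difference between the two layouts concrete, the sketch below splits one interleaved buffer into one buffer per channel. Deinterleave is a hypothetical helper written for this illustration. It uses std::vector in place of the AudioBuffer structure so that it compiles without Core Audio.

```cpp
#include <cstddef>
#include <vector>

// Splits interleaved samples (L R L R ...) into one vector per channel.
// Any trailing samples that do not form a complete frame are ignored.
std::vector<std::vector<float> > Deinterleave(const std::vector<float> &interleaved,
                                              std::size_t channels) {
    std::vector<std::vector<float> > result(channels);
    if (channels == 0) {
        return result;
    }
    const std::size_t frames = interleaved.size() / channels;
    for (std::size_t c = 0; c < channels; ++c) {
        result[c].reserve(frames);
        for (std::size_t f = 0; f < frames; ++f) {
            // Sample for channel c of frame f sits at f * channels + c.
            result[c].push_back(interleaved[f * channels + c]);
        }
    }
    return result;
}
```

Each resulting vector corresponds to a buffer whose mNumberChannels field would equal 1, the layout that effect units expect.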

An audio unit manages the set of channels in a bus as an audio buffer list structure (AudioBufferList), also defined in CoreAudioTypes.h and shown in Listing 2-9:

Listing 2-9  The audio buffer list structure

struct AudioBufferList {
    UInt32      mNumberBuffers;  // the number of buffers in the list
    AudioBuffer mBuffers[kVariableLengthArray]; // the list of buffers
};
typedef struct AudioBufferList  AudioBufferList;
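
Because the mBuffers array is declared with a placeholder length, code that allocates a buffer list must compute the storage needed for the actual number of buffers. The sketch below shows that arithmetic using local stand-in structures that copy the field layout of Listings 2-8 and 2-9. It assumes the placeholder length is 1, and BufferListByteSize is a hypothetical helper, not an SDK function.

```cpp
#include <cstddef>
#include <cstdint>

// Local stand-ins that mirror Listings 2-8 and 2-9.
struct ToyAudioBuffer {
    std::uint32_t mNumberChannels;
    std::uint32_t mDataByteSize;
    void         *mData;
};

struct ToyAudioBufferList {
    std::uint32_t  mNumberBuffers;
    ToyAudioBuffer mBuffers[1]; // placeholder length, assumed to be 1
};

// Bytes needed for a list that holds numberBuffers buffers: the header up to
// the start of the array, plus one ToyAudioBuffer per buffer.
std::size_t BufferListByteSize(std::uint32_t numberBuffers) {
    return offsetof(ToyAudioBufferList, mBuffers)
         + static_cast<std::size_t>(numberBuffers) * sizeof(ToyAudioBuffer);
}
```

For a stereo, noninterleaved bus, a caller would allocate BufferListByteSize(2) bytes and set mNumberBuffers to 2, with each buffer holding one channel.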

In the common case of building an n-to-n channel effect unit, such as the one you build in “Tutorial: Building a Simple Effect Unit with a Generic View,” the audio unit template and superclasses take care of managing channels for you. You create this type of effect unit by subclassing the AUEffectBase class in the SDK.

In contrast, when you build an m-to-n channel effect unit (for example, a stereo-to-mono effect unit), you must write code to manage channels. In this case, you create your effect unit by subclassing the AUBase class. (As with the rest of this document, this consideration applies to version 1.4.3 of the Core Audio SDK, current at the time of publication.)
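
As a rough illustration of the per-channel work an m-to-n unit takes on, the sketch below combines two noninterleaved input channels into one output channel. DownmixStereoToMono is a hypothetical function, and averaging the two channels is one possible downmix rule chosen for simplicity. The code shows only the sample arithmetic, not how an AUBase subclass obtains its buffers.

```cpp
#include <cstddef>

// Writes the average of the left and right samples to the mono output.
// All three arrays must hold at least `frames` samples.
void DownmixStereoToMono(const float *left,
                         const float *right,
                         float       *mono,
                         std::size_t  frames) {
    for (std::size_t i = 0; i < frames; ++i) {
        mono[i] = 0.5f * (left[i] + right[i]);
    }
}
```

In an n-to-n unit the same processing runs once per channel, which is why the superclass can manage channels for you. Here the input and output channel counts differ, so the code must decide how channels map to one another.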



Last updated: 2007-10-31




Copyright © 2007 Apple Inc.
All rights reserved.