The QuickTime Music Architecture

This chapter describes the QuickTime music architecture (QTMA), which allows QuickTime movies, applications, and other software to play individual musical notes, sequences of notes, and a broad range of sounds from a variety of instruments and synthesizers. With QTMA, you can also import Standard MIDI files (SMF) and convert them into QuickTime movies for easy playback.

The QuickTime music architecture is implemented as a set of Component Manager components, the standard mechanism that QuickTime uses to provide extensibility.

QTMA components exist in QuickTime for both Mac OS X and Windows.

Different QTMA components are used by a QuickTime movie, depending on whether you are playing music or sounds through the computer’s built-in audio device or controlling, for example, a MIDI synthesizer. During playback of a QuickTime movie, the music media handler component isolates your application and the Movie Toolbox from the details of how to play a music track. The media handler processes the data in a music track for you through Movie Toolbox calls.

The following sections provide overviews of these components and their capabilities.

QTMA Components

The QuickTime music architecture includes the following components:

  • the note allocator component

  • the tune player component

  • music components

  • instrument components

  • the generic music component

  • MIDI components

These components are described in more detail in the following sections. Figure 1-1 illustrates the relationships among the various QTMA components.

Figure 1-1  How QuickTime music architecture components work together

Note Allocator Component

You use the note allocator component to play individual notes. Your application can specify which musical instrument sound to use and exactly which music synthesizer to play the notes on. The note allocator component can also display an Instrument Picker, which allows the user to choose instruments. Unlike the tune player, the note allocator provides no timing-related features for managing a sequence of notes. Its features are similar to those of a music component, although more generalized. Typically, an application opens a connection to the note allocator, which in turn sends messages to the music component. An application or movie music track can incorporate any number of musical timbres or parts.

To play a single note, your application must open a connection to the note allocator component and call NANewNoteChannel with a note request, typically to request a standard instrument within the General MIDI library of instruments. A note channel is similar in some ways to a Sound Manager sound channel in that it needs to be created and disposed of, and can receive various commands. The note allocator provides an application-level interface for requesting note channels with particular attributes. The client specifies the desired polyphony and the desired tone. The note allocator returns a note channel that best satisfies the request.

With an open note channel, an application can call NAPlayNote, specifying the note’s pitch and velocity. The note plays until a second call to NAPlayNote specifies the same pitch with a velocity of zero, which stops the note. The note allocator functions let you play individual notes, apply a controller change, apply a knob change, select an instrument based on a required tone, and modify or change the instrument type on an existing note channel.

There are calls for registering and unregistering a music component. As part of registration, the MIDI connections, if applicable, are specified. There is also a call for querying the note allocator for registered music components, so that an application can offer a selection of the existing devices to the user.

Tune Player Component

The tune player component can accept entire sequences of musical notes and play them from start to finish, asynchronously, with no further need for application intervention. It can also play portions of a sequence. An additional sequence or sequence section may be queued up while another is playing; queuing sequences provides a seamless way to transition between sections.

The tune player negotiates with the note allocator to determine which music component to use and allocates the necessary note channels. The tune player handles all aspects of timing, as defined by the sequence of music events. For more information about music events and the event sequence that is required to produce music in a QuickTime movie track, see the section QuickTime Music Events.

The tune player also provides services to set the volume and to stop and restart an active sequence.

If your application simply wants to play background music, it may be easier to use the QuickTime Movie Toolbox, rather than call the tune player directly.

Music Components Included in QuickTime

Individual music components act as device drivers for each type of synthesizer attached to a particular computer. These music components are included in QuickTime:

  • the General MIDI synthesizer component, for playing music on a General MIDI device attached to a serial port.

  • the MIDI synthesizer component, which allows QuickTime to control a synthesizer that is connected to a single MIDI channel.

Developers can add other music components for specific hardware and software synthesizers.

Applications do not usually call music components directly. Instead, the note allocator or tune player handles music component interactions. Music components are mainly of interest to application developers who want to access the low-level functionality of synthesizers and for developers of synthesizers (internal cards, MIDI devices, or software algorithms) who want to make the capabilities of their synthesizers available to QuickTime.

To call a music component directly, an application must first allocate a note channel and then use NAGetNoteChannelInfo and NAGetRegisteredMusicDevice to obtain the specific music component and part number.

You can use music component functions to

  • obtain specific information about a synthesizer

  • find an instrument that best fits a requested type of sound

  • play a note with a specified pitch and volume

  • change knob values to alter instrument sounds

Other functions are for handling instruments and synthesizer parts. You can use these functions to initialize a part to a specified instrument and to get lists of available instrument and drum kit names. You can also get detailed information about each instrument from the synthesizer and get information about and set knobs and controllers.

Instrument Components and Atomic Instruments

When initialized, the note allocator searches for components of type 'inst'. These components may report a list of atomic instruments, so called because you create them with QuickTime atoms. These instruments can be embedded in a QuickTime movie, passed in a call to QuickTime, or dropped into the Macintosh System Folder.

QuickTime provides a public format for atomic instruments. Using the QuickTime calls for manipulating atoms, you construct in memory a hierarchical tree of atoms with the data that describes the instrument (see Figure 1-2). The tree of atoms lives inside an atom container. There is one and only one root atom per container. Each atom has a four-character (32-bit) type, and a 32-bit ID. Each atom may be either an internal node or a leaf atom with data.

Figure 1-2  An atomic instrument atom container

The Generic Music Component

To use a new hardware or software synthesizer with the QuickTime music architecture, you need a music component that serves as a device driver for that synthesizer and that can play notes on the synthesizer. You can simplify the creation of a music component by using the services of the generic music component.

To create a music component, you create several resources, obtaining much of their data by calling functions of the generic music component, and implement the functions that the generic music component calls when necessary. When a music component is a client of the generic music component, it handles only a few component calls from applications and a number of relatively simple calls from the generic music component.

MIDI Components

A MIDI component provides a standard interface between the note allocator component and a particular MIDI transport system. The MIDI component supports both input and output of MIDI streams.

Hardware and software developers can provide additional MIDI components. For example, the developer of a multiport serial card can provide a MIDI component that supports direct MIDI input and output using the card. Other MIDI components can support MIDI transport systems for operating systems other than the Mac OS.

QuickTime Music Events

This section describes the data structure of QuickTime music events. The events described here are used to initialize and modify sound-producing music devices and define the notes and rests to be played. Several different event types are defined.

Music events specify the instruments and notes of a musical composition. A group of music events is called a sequence. A sequence of events may define a range of instruments and their characteristics and the notes and rests that, when interpreted, produce the musical composition.

The event sequence required to produce music is usually contained in a QuickTime movie track, which uses a media handler to provide access to the tune player, or in an application, which passes the events directly to the tune player. QuickTime interprets and plays the music from the sequence data.

Events are constructed as groups of long words. The uppermost 4 bits (the first nibble) of an event’s first long word define its type, as shown in Table 1-1.

Table 1-1  Event types

First nibble    Number of long words    Event type
000x            1                       Rest
001x            1                       Note
010x            1                       Controller
011x            1                       Marker
1000            2                       (reserved)
1001            2                       Extended note
1010            2                       Extended controller
1011            2                       Knob
1100            2                       (reserved)
1101            2                       (reserved)
1110            2                       (reserved)
1111            any                     General

Durations of notes and rests are specified in units of the tune player’s time scale (default 1/600 second). For example, consider the musical fragment shown in Figure 1-3.

Figure 1-3  A music fragment

Assuming a tempo of 120 beats per minute and a tune player time scale of 600, each quarter note’s duration is 300 units. Figure 1-4 shows a graphical representation of note and rest data.

Figure 1-4  Duration of notes and rests

The general event specifies the types of instruments or sounds used for the subsequent note events. The note event causes a specific instrument, previously defined by a general event, to play a note at a particular pitch and velocity for a specified duration of time.

Additional event types allow sequences to apply controller effects to instruments, define rests, and modify instrument knob values. The entire sequence is closed with a marker event.

In most cases, the standard note and controller events (two long words) are sufficient for an application’s requirements. The extended note event provides wider pitch range and fractional pitch values. The extended controller event expands the number of instruments and controller values over that allowed by a controller event.

The following sections describe the event types in detail.

Note Event

The standard note event (Figure 1-5) supports most music requirements. The note event allows up to 32 parts, numbered 0 to 31, and supports pitches from 2 octaves below middle C to 3 octaves above.

Figure 1-5  A note event

Field             Content
Note event type   First nibble value = 001X
Part number       Unique part identifier
Pitch             Numeric value of 0-63, mapped to 32-95
Velocity          0-127; 0 = no audible response (but used to indicate a NOTE OFF)
Duration          How long to play the note, in units defined by the media time scale or tune player time scale

The part number bit field contains the unique part identifier initially used during the TuneSetHeader call.

The pitch bit field allows a range of 0-63, which is mapped to the values 32-95 representing the traditional equal tempered scale. For example, the value 28 (mapped to 60) is middle C.

The velocity bit field allows a range of 0-127. A velocity value of 0 produces silence.

The duration bit field defines the number of units of time during which the part will play the note. The units of time are defined by the media time scale or tune player time scale.

Use this macro call to stuff the note event’s long word:

qtma_StuffNoteEvent(x, instrument, pitch, volume, duration)

Use these macro calls to extract fields from the note event’s long word:

qtma_Instrument(x)
qtma_NotePitch(x)
qtma_NoteVelocity(x)
qtma_NoteVolume(x)
qtma_NoteDuration(x)

Extended Note Event

The extended note event (Figure 1-6) provides a wider range of pitch values, microtonal values to define any pitch, and extended note duration. The extended note event requires two long words; the standard note event requires only one.

Figure 1-6  An extended note event

Field                      Content
Extended note event type   First nibble value = 1001
Part number                Unique part identifier
Pitch                      0-127 standard pitch (60 = middle C); 0x01.00 ... 0x7F.00 allows 256 microtonal divisions between each note of the traditional equal tempered scale
Velocity                   0-127; 0 = no audible response (but used to indicate a NOTE OFF)
Duration                   How long to play the note, in units defined by the media time scale or tune player time scale
Event tail                 First nibble of last word = 10XX

The part number bit field contains the unique part identifier initially used during the TuneSetHeader call.

If the pitch bit field is less than 128, it is interpreted as an integer pitch where 60 is middle C. If the pitch is 128 or greater, it is treated as a fixed pitch.

Microtonal pitch values are produced when the 15 bits of the pitch field are split. The upper 7 bits define the standard equal tempered note and the lower 8 bits define 256 microtonal divisions between the standard notes.

Use this macro call to stuff the extended note event’s long words:

qtma_StuffXNoteEvent(w1, w2, instrument, pitch, volume, duration)

Use these macro calls to extract fields from the extended note event’s long words:

qtma_XInstrument(m, l)
qtma_XNotePitch(m, l)
qtma_XNoteVelocity(m, l)
qtma_XNoteVolume(m, l)
qtma_XNoteDuration(m, l)

Rest Event

The rest event (Figure 1-7) specifies the period of time, defined by either the media time scale or the tune player time scale, until the next event in the sequence is played.

Figure 1-7  A rest event

Field             Content
Rest event type   First nibble value = 000X
Duration          Number of time units until the next event is played, in units defined by the media time scale or tune player time scale

Use this macro call to stuff the rest event’s long word:

qtma_StuffRestEvent(x, duration)

Use this macro call to extract the rest event’s duration value:

qtma_RestDuration(x)

Rest events are not used to cause silence in a sequence, but to define the start of subsequent events.

Marker Event

The marker event has three subtypes. The end marker event (Figure 1-8) marks the end of a series of events, the beat marker event marks the beat, and the tempo marker event indicates the tempo.

Figure 1-8  A marker event of subtype end

Field               Content
Marker event type   First nibble value = 011X
Subtype             8-bit unsigned subtype
Value               16-bit signed value

The marker subtype bit field contains 0 for an end marker (kMarkerEventEnd), 1 for a beat marker (kMarkerEventBeat), or 2 for a tempo marker (kMarkerEventTempo).

The value bit field varies according to the subtype:

  • For an end marker event, a value of 0 means stop; any other value is reserved.

  • For a beat marker event, a value of 0 means a single beat (a quarter note); any other value specifies the beat length in units of 1/65536 of a beat.

  • For a tempo marker event, the value is interpreted as for a beat marker, but it indicates that a tempo event should be computed (based on the position of the next beat or tempo marker) and emitted on export.

Use this macro call to stuff the marker event’s long word:

qtma_StuffMarkerEvent(x, markerType, markerValue)

Use these macro calls to extract fields from the marker event’s long word:

qtma_MarkerSubtype(x)
qtma_MarkerValue(x)

Controller Event

The controller event (Figure 1-9) changes the value of a controller on a specified part.

Figure 1-9  Controller event

Field                   Content
Controller event type   First nibble value = 010X
Part                    Unique part identifier
Controller              Controller to be applied to the instrument
Value                   8.8 fixed-point signed controller-specific value

For a list of currently supported controller types, see Controller Numbers.

The part field contains the unique part identifier initially used during the TuneSetHeader call.

The controller bit field is a value that describes the type of controller used by the part.

The value bit field is specific to the selected controller.

Use this macro call to stuff the controller event’s long word:

qtma_StuffControlEvent(x, instrument, control, value)

Use these macro calls to extract fields from the controller event’s long word:

qtma_Instrument(x)
qtma_ControlController(x)
qtma_ControlValue(x)

Extended Controller Event

The extended controller event (Figure 1-10) allows parts and controllers beyond the range of the standard controller event.

Figure 1-10  Extended controller event

Field                            Content
Extended controller event type   First nibble value = 1010
Part                             Instrument index for controller
Controller                       Controller for instrument
Value                            Signed controller-specific value
Event tail                       First nibble of last word = 10XX

The part field contains the unique part identifier initially used during the TuneSetHeader call.

The controller bit field contains a value that describes the type of controller to be used by the part.

The value bit field is specific to the selected controller.

Use this macro call to stuff the extended controller event’s long words:

qtma_StuffXControlEvent(w1, w2, instrument, control, value)

Use these macro calls to extract fields from the extended controller event’s long words:

qtma_XInstrument(m, l)
qtma_XControlController(m, l)
qtma_XControlValue(m, l)

General Event

For events longer than two long words, you use the general event with a subtype. Figure 1-11 illustrates the contents of a general event.

Figure 1-11  A note request general event

Field                 Content
General event type    First nibble value = 1111
Part number           Unique part identifier
Event length (head)   Number of long words in the event
Data words            Depend on subtype
Subtype               8-bit unsigned subtype
Event length (tail)   Must be identical to the head length
Event tail            First nibble of last word = 11XX

The part number bit field contains a unique identifier that is later used to match note, knob, and controller events to a specific part. For example, to play a note the application uses the part number to specify which instrument will play the note. The general event allows part numbers of up to 12 bits. The standard note and controller events allow part numbers of up to 5 bits; the extended note and extended controller events allow 12-bit part numbers.

The event length bit fields contained in the first and last words of the message are identical and are used as a message format check and to move back and forth through the message. The lengths include the head and tail; the smallest length is 2.

The data words field is a variable length field containing information unique to the subtype of the general event. The subtype bit field indicates the subtype of general event. There are nine subtypes:

  • A note request general event (kGeneralEventNoteRequest) has a subtype of 1. It encapsulates the note request data structure used to define the instrument or part. It is used in the tune header.

  • A part key general event (kGeneralEventPartKey) has a subtype of 4. It sets a pitch offset for the entire part, so that every subsequent note played on that part is altered in pitch by the specified amount.

  • A tune difference general event (kGeneralEventTuneDifference) has a subtype of 5. It contains a standard sequence, with end marker, for the tune difference of a sequence piece. Using a tune difference event is similar to using key frames with compressed video sequences.

  • An atomic instrument general event (kGeneralEventAtomicInstrument) has a subtype of 6. It encapsulates an atomic instrument. It is used in the tune header. It may be used in place of the kGeneralEventNoteRequest.

  • A knob general event (kGeneralEventKnob) has a subtype of 7. It contains knob ID/knob value pairs. The smallest event is four long words.

  • A MIDI channel general event (kGeneralEventMIDIChannel) has a subtype of 8. It is used in a tune header. One long word identifies the MIDI channel it originally came from.

  • A part change general event (kGeneralEventPartChange) has a subtype of 9. It is used in a tune sequence where one long word identifies the tune part that can now take over the part’s note channel.

  • A no-op general event (kGeneralEventNoOp) has a subtype of 10. It does nothing in QuickTime.

  • A notes-used general event (kGeneralEventUsedNotes) has a subtype of 11. It is four long words specifying which MIDI notes are actually used. It is used in the tune header.

Use this macro call to stuff the general event’s head and tail long words (it does not stuff the subtype-specific structures described above):

qtma_StuffGeneralEvent(w1, w2, instrument, subType, length)

Use these macro calls to extract field values from the event’s head and tail long words:

qtma_XInstrument(m, l)
qtma_GeneralSubtype(m, l)
qtma_GeneralLength(m, l)

Knob Event

The knob event (Figure 1-12) is used to modify a particular knob or knobs within a specified part.

Figure 1-12  Knob event

Field             Content
Knob event type   First nibble value = 1111 (general event), subtype 7
Length            2 × (number of knobs + 1) long words
Part              Unique part identifier
Knob ID           Knob ID within the specified part
Knob value        Value applied to the knob
Event tail        First nibble of last word = 11XX, subtype 7

The part field contains the unique part identifier initially used during the TuneSetHeader call.

The knob number bit field identifies the knob to be changed.

The 32-bit knob value, composed of the upper 16-bit and lower 16-bit field values, is used to alter the specified knob.

The General MIDI Synthesizer Component

The General MIDI synthesizer component controls General MIDI devices. These devices support 24 voices of polyphony, and each of their MIDI channels can access any number of voices.

The MIDI Synthesizer Component

The MIDI synthesizer component allows QuickTime to control a synthesizer connected to a single MIDI channel. It works with any synthesizer that can be controlled through MIDI.

The MIDI synthesizer component does not get information about the synthesizer instruments. Instead, it simply lists available instruments as “Instrument 1,” “Instrument 2,” and so on up to “Instrument 128.”

The Base Instrument Component

When you provide additional sounds for the QuickTime music synthesizer, you can simplify the creation of the necessary instrument resources by using the base instrument component. To create an instrument component, you create a component alias whose target is the base instrument component. The component alias’s data resources specify the capabilities of an instrument, while the code resource of the base instrument component handles all of the component requests sent to the instrument component.