Apple Developer Connection

Audio Enhancements

QuickTime 7 for Windows breaks free of the limitations of the Sound Manager, adding many new features and capabilities that developers can take advantage of in their audio playback and authoring applications.

Notably, QuickTime 7 for Windows now supports high-resolution audio, that is, audio sampled at sample rates higher than 64 kHz and up to 192 kHz, with up to 24 channels and support for surround sound. This is in stark contrast to the implementation of the Sound Manager, which only supported mono and stereo.

The result of these new audio enhancements is as follows:

Most components––with a few exceptions such as streaming and MPEG-4 exporting––will be able to make use of these new capabilities immediately. This release of QuickTime updates a number of components so that it is possible to play back, edit, and export a broad variety of enhanced audio right away.

In brief, QuickTime 7 for Windows includes the following enhancements, discussed in this section:

New Abstraction Layer For Audio
High-Resolution Audio Support
Sound Description Creation and Accessor Functions
Audio Playback Enhancements
Audio Conversion, Export, and Extraction
Standard Audio Compression Enhancements
Audio Export Enhancements


New Abstraction Layer For Audio

QuickTime 7 for Windows introduces the audio context––a new abstraction that represents playing to an audio device.

As defined, a QuickTime audio context is an abstraction for a connection to an audio device. This allows you to work more easily and efficiently with either single or multiple audio devices in your application.

Creating an Audio Context

To create an audio context, call QTAudioContextCreateForAudioDevice and pass in the UID of the device, which is a CFStringRef. The function returns an audio context. You can then either pass that audio context into NewMovieFromProperties, or open your movie however you would normally open it and call SetMovieAudioContext. Either way, all the sound tracks of the movie are routed to that particular device.

If you want to create an audio context and assign a device to it on Windows, use the following call:

extern OSStatus
QTAudioContextCreateForAudioDevice(
  CFAllocatorRef       allocator,
  CFStringRef          audioDeviceUID,
  CFDictionaryRef      options,
  QTAudioContextRef *  newAudioContextOut);

Then use the SetMovieAudioContext call on your movie, and it will play to that device.

To get a list of devices on Windows––so you can pass an audioDeviceUID CFString to QTAudioContextCreateForAudioDevice––use the native Windows DirectSound APIs (specifically, the DirectSoundEnumerate function, from dsound.h). As DirectSound iterates through its list of devices, you get a callback for each one that gives you the device’s GUID (LPGUID), description (LPCSTR), and module (LPCSTR). The description is the device’s name, which is what you would typically show to a user; the GUID is what identifies the device to QuickTime, as the Important note that follows explains. So once you find the device you want, you create a CFStringRef from its GUID and pass this to QTAudioContextCreateForAudioDevice.

Important:  On Windows, the audioDeviceUID is the GUID of a DirectSound device, stringified using such Win32 functions as StringFromCLSID or StringFromGUID2, then wrapped in a CFStringRef using CFStringCreateWithCharacters. After passing the audioDeviceUID CFStringRef to QTAudioContextCreateForAudioDevice, remember to CFRelease the CFStringRef you created.
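The string form of a GUID mentioned in the note above can be sketched in portable C. This is only an illustration of the braced, uppercase-hex text that StringFromGUID2 produces; SketchGUID and format_guid are hypothetical names that mirror the Win32 GUID layout so the sketch compiles without windows.h. Real code would call StringFromGUID2 (which writes wide characters) and wrap the result with CFStringCreateWithCharacters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in with the same field layout as the Win32 GUID
   structure, declared locally so the sketch needs no Windows headers. */
typedef struct {
    unsigned long  data1;      /* only the low 32 bits are used */
    unsigned short data2;
    unsigned short data3;
    unsigned char  data4[8];
} SketchGUID;

/* 38 characters for "{8-4-4-4-12}" plus the terminating NUL. */
enum { kGUIDStringLength = 39 };

/* Writes the braced, uppercase-hex form of guid into out and returns the
   number of characters written, not counting the terminating NUL. */
static int format_guid(const SketchGUID *guid, char out[kGUIDStringLength])
{
    assert(guid != NULL && out != NULL);
    return snprintf(out, kGUIDStringLength,
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        guid->data1 & 0xFFFFFFFFUL,
        (unsigned)guid->data2, (unsigned)guid->data3,
        (unsigned)guid->data4[0], (unsigned)guid->data4[1],
        (unsigned)guid->data4[2], (unsigned)guid->data4[3],
        (unsigned)guid->data4[4], (unsigned)guid->data4[5],
        (unsigned)guid->data4[6], (unsigned)guid->data4[7]);
}
```

In an application, the GUID would come from the LPGUID argument of the DirectSoundEnumerate callback rather than from a hand-built structure.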

Note: If you want to route two different movies to the same device, you cannot use the same audio context, because an audio context is a single connection to that device. Instead, call QTAudioContextCreateForAudioDevice again with the same device UID to get another audio context for the same device, and pass that to your second movie.

High-Resolution Audio Support

High-resolution audio makes use of an enhanced sound description with the ability to describe high sampling rates, multiple channels, and more accurate audio representation and reproduction.

Significantly, the new sound description has larger fields to describe the sampling rate and number of channels, so that the sound description is no longer the limiting factor for these characteristics.

The sound description has built-in support for variable-bit-rate (VBR) audio encoding with variable-duration compressed frames. Extensions to the sound description allow you to describe the spatial layout of the channels, such as quadraphonic and 5.1 surround sound, or to label channels as discrete––that is, not tied to a particular geometry. For more information, see “SoundDescriptionV2”.

New movie audio properties include a summary channel layout property, providing a nonredundant listing of all the channel types used in the movie—such as L/R for stereo, or L/R/Ls/Rs/C for 5-channel surround sound—and a device channel layout, listing all the channel types used by the movie’s output device.
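The idea of a nonredundant summary can be sketched as a set union over the channel types of each track. The flag names and the summarize_channels helper below are hypothetical and exist only for this illustration; they are not the QuickTime property API, and real code would work with the channel layout types from CoreAudioTypes.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical channel flags for this sketch only. */
enum {
    kSketchLeft          = 1u << 0,
    kSketchRight         = 1u << 1,
    kSketchCenter        = 1u << 2,
    kSketchLFE           = 1u << 3,
    kSketchLeftSurround  = 1u << 4,
    kSketchRightSurround = 1u << 5
};

/* Returns the union of the channel types used by all tracks, so each
   channel type appears once no matter how many tracks use it. */
static unsigned summarize_channels(const unsigned *track_channels,
                                   size_t track_count)
{
    unsigned summary = 0;
    size_t i;
    assert(track_channels != NULL || track_count == 0);
    for (i = 0; i < track_count; ++i)
        summary |= track_channels[i];
    return summary;
}
```

For a movie with one stereo track and one 5-channel surround track, the summary contains L, R, C, Ls, and Rs exactly once each.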

Figure 2-2 shows the layout of surround speakers. The terminology is defined in Table 2-1.


Figure 2-2  Layout of surround speakers

Table 2-1  Surround sound definitions

Speaker   Definition
L         Left speaker
R         Right speaker
C         Center speaker
Ls        Left surround speaker
Rs        Right surround speaker
LFE       Subwoofer (LFE is an abbreviation for low-frequency effects)

The new sound description is supported by the data types and structures found in CoreAudioTypes.h. While the Core Audio API itself is not available to Windows programmers, QuickTime for Windows does include the relevant data structures, such as audio buffers, stream descriptions, and channel layouts defined in CoreAudioTypes.h.

A suite of functions has been included to support the handling of sound descriptions opaquely.

Playback

Playback at the high level is automatic and transparent; if you play a movie that contains 96 kHz or 192 kHz sound, it should just work. You should not have to modify your code. The same is true for cut-and-paste editing. If the chosen output device does not support the channel layout, sampling rate, or sample size of the movie audio, mix-down and resampling are performed automatically.

Import of high-resolution audio is automatic, provided the import component has been updated to support high-resolution audio.

QuickTime will play to any device that has a DirectSound driver. If it appears in the list of devices when you call DirectSoundEnumerate, QuickTime will play to it.

There are some caveats, however. If it is a device that only accepts a compressed stream (for example, an AC-3 stream), you will only hear silence. QuickTime does not provide CoreAudio HAL, AudioUnit, or AudioConverter interfaces on Windows, so you cannot use these to query devices. You may use the facilities that DirectSound provides. QuickTime provides an audio panel in the QuickTime Preferences on Windows that lets users specify the channel layout, sample rate, and bit depth of their playback device.

QuickTime Settings Dialogs

QuickTime 7 for Windows introduces a new series of settings dialogs for improved control over audio playback.

QuickTime does not manage a list of devices that is separate from the list available in Windows. If you want to set your input and output devices, you click in the appropriate settings field. In the Sound Out area of the Audio tab in the QuickTime Settings dialog (Figure 2-3), the sound characteristics of the playback device are not changed. If the user specifies 48 kHz, for example, QuickTime will not go to the output device and flip a switch to that setting.

Figure 2-3 illustrates the QuickTime Settings dialog with the Audio tab selected. This dialog lets users configure QuickTime to use the proper settings for their audio device. Users can select the output format supported by a particular audio device, including sample rate, sample size, and number of channels. (You cannot use this dialog to change the settings of the actual output device, only the format that QuickTime sends to the device.)


Figure 2-3  The QuickTime Settings dialog with the Audio tab selected

Figure 2-4 illustrates the various settings that are available for audio, including channel assignment. This information panel in QuickTime Player lets users modify some aspects of a movie’s audio, such as volume, balance, bass, and treble.


Figure 2-4  Audio settings dialog, including channel assignment

Figure 2-5 illustrates the Sound Settings dialog in QuickTime 7 for Windows. This audio compression settings dialog comes up during an export.


Figure 2-5  Sound settings dialog that includes user-definable choices for format, channels, and rate of audio playback

Export

Export of high-resolution audio is likewise transparent at the high level. Export at the lower levels requires some additional code. Your application must “opt in” to the new audio features explicitly if it “talks” directly to an export component instance. You do this by calling QTSetComponentProperty on the exporter component instance and passing in the kQTMovieExporterPropertyID_EnableHighResolutionAudioFeatures property. This is illustrated in the code sample Listing 2-6.

Sound Description Creation and Accessor Functions

QuickTime 7 for Windows provides new functions that let you create, access, and convert sound descriptions.

Sound descriptions can take three basic inputs: an AudioStreamBasicDescription, a channel layout, and a magic cookie. Sound descriptions are now treated as if they are opaque. In QuickTime 7, when you are handed a sound description, for example, you don’t have to go in and look at the version field.

If you want to create a sound description, you can simply hand it an AudioStreamBasicDescription, an optional channel layout if you have one, and an optional magic cookie if you need one for the described audio format. Note that it is the format (codec) of the audio that determines whether it needs a magic cookie, not the format of the sound description.

By calling QTSoundDescriptionCreate, you can make a sound description of any version you choose––for example, one that is of the lowest possible version, given that it is stereo and 16-bit, or one of any particular version you want or request.

The main points of the new API are the ability to create a sound description and the use of new property getters and setters. With the getters, you can:

  1. Get an AudioStreamBasicDescription from a sound description.

  2. Get a channel layout from a sound description (if there is one).

  3. Get the magic cookie from a sound description (if there is one).

You can also:

  1. Get a user-readable textual description of the format described by the SoundDescription.

  2. Add a channel layout to an existing sound description, or replace the one it already has. For example, this is what QuickTime Player does in the properties panel where the user can change the channel assignments.

  3. Add a magic cookie to a sound description. (This is not needed very often unless you are writing a movie importer, for example.)

To convert an existing QuickTime sound description into the new V2 sound description, you call QTSoundDescriptionConvert. This lets you convert sound descriptions from one version to another.

For a description of versions 0 and 1 of the SoundDescription record, see the documentation for the QuickTime File Format.

For a description of version 2 of the SoundDescription record, see “SoundDescriptionV2”. For details of the sound description functions, see QTSoundDescriptionCreate and QTSoundDescriptionConvert.

For details on getting and setting sound description properties, see QTSoundDescriptionGetProperty and QTSoundDescriptionSetProperty.

Audio Playback Enhancements

In addition to playing back high-resolution audio, QuickTime 7 for Windows introduces the following audio playback enhancements:

Preventing Pitch-Shifting

A new property is available for use with the NewMovieFromProperties function: kQTAudioPropertyID_RateChangesPreservePitch. When this property is set, changing the movie playback rate will not result in pitch-shifting of the audio. This allows you to fast-forward through a movie without hearing chipmunks.

Setting this property also affects playback of scaled edits, making it possible to change the tempo of a sound segment or scale it to line up with a video segment, for example, without changing the pitch of the sound.

Gain, Mute, and Balance

New functions are available to set the left-right balance for a movie, set the gain for a movie or track, or to mute and unmute a movie or track without changing the gain or balance settings.

The gain and mute functions duplicate existing functions for setting track and movie volume, but the new functions present a simpler and more consistent programmer interface.

For example, to mute the movie using the old SetMovieVolume function, you would pass in a negative volume value; to preserve the current volume over a mute and unmute operation, you had to first read the volume, then negate it and set it for muting, then negate it and set it again to unmute. By comparison, the new SetMovieAudioMute function simply mutes or unmutes the movie without changing the gain value.

Note: The values set using these functions are not persistent; that is, they are not saved with the movie.

For details, see

Level and Frequency Metering

It is now easy to obtain real-time measurements of the average audio output power level in one or more frequency bands.

The only mix supported for volume metering is kQTAudioMeter_DeviceMix.

You can specify the number of frequency bands to meter. QuickTime divides the possible frequency spectrum (approximately half the audio sampling rate) into that many bands. You can ask QuickTime for the center frequency of each resulting band for display in your user interface. The GetMovieAudioFrequencyMeteringBandFrequencies function returns an array containing the center frequencies of each band.

When using kQTAudioMeter_DeviceMix (which is the only option currently offered for Volume Metering), levels are computed for each audio channel as it is presented to the output device. In order to obtain accurate frequency metering information for N-channel devices without requiring that N compute-intensive spectral analyses be performed, the kQTAudioMeter_MonoMix and kQTAudioMeter_StereoMix options direct QuickTime to perform an audio mix-down before computing the frequency levels.

For example, if you are playing movies to a 5.1 output device, you might want to meter the frequency levels of all six output channels. However, if you are playing stereo content, the levels for four of the outputs would always be zero, so you might prefer to meter just what would be played on a stereo device.
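To see why a mix-down saves work, consider a minimal sketch that folds N channels into one buffer, so that a single spectral analysis can run on the result instead of N. The equal-weight average used here is an assumption made for illustration; this document does not specify the mix coefficients QuickTime applies, and mix_down_to_mono is a hypothetical helper, not a QuickTime function.

```c
#include <assert.h>
#include <stddef.h>

/* Averages channel_count non-interleaved channels into mono_out.
   Assumption: every channel is weighted equally. */
static void mix_down_to_mono(const float *const *channels,
                             size_t channel_count,
                             size_t frame_count,
                             float *mono_out)
{
    size_t frame, ch;
    assert(channels != NULL && mono_out != NULL && channel_count > 0);
    for (frame = 0; frame < frame_count; ++frame) {
        float sum = 0.0f;
        for (ch = 0; ch < channel_count; ++ch)
            sum += channels[ch][frame];
        mono_out[frame] = sum / (float)channel_count;
    }
}
```

With QuickTime itself, you select the mix through the metering options described above rather than mixing the samples yourself.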

To use the frequency metering API, follow these steps:

  1. Set the number of frequency bands to meter using SetMovieAudioFrequencyMeteringNumBands.

  2. Call GetMovieAudioFrequencyMeteringBandFrequencies if you need to know the frequencies of the resulting bands.

  3. Finally, make periodic calls to GetMovieAudioFrequencyLevels to obtain measurements in all specified bands. You can obtain either the average values, the peak hold values, or both.

For details, see

Audio Conversion, Export, and Extraction

The new audio extraction API lets you retrieve mixed, uncompressed audio from a movie.

Note that the audio extraction API currently only mixes audio from sound tracks. Other media types, such as muxed MPEG-1 audio inside a program stream, are not currently supported.

To use the audio extraction API, follow these steps:

  1. Begin by calling MovieAudioExtractionBegin. This returns an opaque session object that you pass to subsequent extraction routines.

  2. You can then get the AudioStreamBasicDescription for the audio, or its channel layout. Note that some properties, such as the channel layout, are of variable size depending on the audio format, so getting the information involves a two-step process.

    1. First, you call MovieAudioExtractionGetPropertyInfo to find out how much space to allocate.

    2. Next, call MovieAudioExtractionGetProperty to obtain the actual value of the property.

  3. You can use the AudioStreamBasicDescription to specify a different uncompressed format than Float 32. This causes the extraction API to automatically convert from the stored audio format into your specified format.

  4. Use the MovieAudioExtractionSetProperty function to specify channel remapping––that is, a different layout––sample rate conversion, and preferred sample size. You can also use this function to specify interleaved samples (default is non-interleaved) or to set the movie time to an arbitrary point.

Note that there are basically two things you set here: an audio stream basic description (ASBD) and a channel layout. (The ASBD sets the format, sample rate, number of channels, interleaving, and so on.)
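The difference between interleaved and non-interleaved samples can be sketched in plain C. In an interleaved buffer, the samples for one frame sit next to each other (L0 R0 L1 R1 ...); in the non-interleaved form, each channel has its own buffer. The deinterleave helper below is hypothetical and shows only the index arithmetic; it does not show how a real AudioBufferList is arranged.

```c
#include <assert.h>
#include <stddef.h>

/* Copies an interleaved buffer of frame_count frames into one separate
   buffer per channel. Sample ch of frame f lives at index
   f * channel_count + ch in the interleaved buffer. */
static void deinterleave(const float *interleaved,
                         size_t channel_count,
                         size_t frame_count,
                         float *const *planar_out)
{
    size_t frame, ch;
    assert(interleaved != NULL && planar_out != NULL);
    for (frame = 0; frame < frame_count; ++frame)
        for (ch = 0; ch < channel_count; ++ch)
            planar_out[ch][frame] = interleaved[frame * channel_count + ch];
}
```

When you ask the extraction API for the layout you want, it delivers samples in that form, so a conversion like this is only needed if your own code stores audio differently.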

Setup is now complete. You can now make a series of calls to MovieAudioExtractionFillBuffer to receive uncompressed PCM audio in your chosen format.

  1. The default is for the first call to begin extracting audio at the start of the movie, and for subsequent calls to begin where the last call left off, but you can set the extraction point anywhere in the movie timeline by calling MovieAudioExtractionSetProperty and setting the movie time.

  2. MovieAudioExtractionFillBuffer will set kMovieAudioExtractionComplete in outFlags when you reach the end of the movie audio.

  3. You must call MovieAudioExtractionEnd when you are done. This deallocates internal buffers and data structures that would otherwise continue to use memory and resources.

A caveat: Ideally, the uncompressed samples would be bitwise identical whether you obtained the samples by starting at the beginning of the movie and iterating through it, or by randomly setting the movie time and extracting audio samples. This is typically the case, but for some compression schemes the output of the decompressor depends not only on the compressed sample, but also on the seed value that remains in the decompressor after previous operations.

The current release of QuickTime does not perform the necessary work to determine what the seed value would be when the movie time is changed prior to extracting audio; while the extracted audio is generally indistinguishable by ear, it may not always be bitwise identical.

For details about audio conversion, export, and extraction, refer to the information about the following functions:

Standard Audio Compression Enhancements

QuickTime 7 for Windows introduces a new standard compressor component, StandardCompressionSubTypeAudio, that adds the ability to configure high-resolution audio output formats. It has a full set of component properties to make configuration easier, especially when the developer wishes to bring up an application-specific dialog, or no dialog, rather than the typical compression dialog.

This component essentially replaces the StandardCompressionSubTypeSound component, which is limited to 1 or 2 channel sound with sampling rates of 65 kHz or less. That component is retained for backward compatibility with existing code, but its use is no longer recommended.

The StandardCompressionSubTypeAudio component is configured by getting and setting component properties, instead of using GetInfo and SetInfo calls. These properties have a class and ID, instead of just a single selector.

The component property API allows configuration at any level of detail without requiring a user interface dialog or direct communication with low-level components.

For details, refer to the section “Audio Property Selectors.”

Note: You can also configure the new standard audio compression component by calling SCSetSettingsFromAtomContainer. You can pass the new standard audio compression component either a new atom container obtained from SCGetSettingsAsAtomContainer or an old atom container returned by calling the same function (SCGetSettingsAsAtomContainer) on the old SubTypeSound component.

If you use MovieExportToDataRefFromProcedures, your getProperty proc will need to support some of these property IDs as new selectors. Note that the Movie Exporter getProperty proc API is not changing to add a class (the class is implied).

Note: Not all properties can be implemented by getProperty procs; the properties that getProperty procs can implement are marked with the word "DataProc". See the inline documentation in QuickTimeComponents.h for more information.

Audio Export Enhancements

Some movie export components now support high-resolution audio.

Export of high-resolution audio is transparent at the high level. If you export from a movie containing high-resolution audio to a format whose export component supports it, the transfer of data is automatic; if the export component does not support high-resolution audio, mix-down, resampling, and sound description conversion are automatic.

Export at the lower levels requires some additional code. Your application must “opt in” to the new audio features explicitly if it talks directly to an export component instance. (This is to prevent applications that have inadvisedly chosen to “walk” the opaque atom settings structure from crashing when they encounter the new and radically different structure.) The following code snippet (Listing 2-6) illustrates the opt-in process.

Listing 2-6  Opting in for high-resolution audio export

ComponentInstance exporterCI = NULL;
ComponentDescription search = { 'spit', 'MooV', 'appl', 0, 0 };
Boolean useHighResolutionAudio = true, canceled = false;
OSStatus err = noErr;
 
// Find and open the QuickTime movie export component.
Component c = FindNextComponent(NULL, &search);
exporterCI = OpenComponent(c);
 
// Hey exporter, I understand high-resolution audio!!
(void) QTSetComponentProperty(// disregard error
        exporterCI,
        kQTPropertyClass_MovieExporter,
        kQTMovieExporterPropertyID_EnableHighResolutionAudioFeatures,
        sizeof(Boolean),
        &useHighResolutionAudio);
 
// myMovie is the Movie to export, opened elsewhere by the application.
err = MovieExportDoUserDialog(exporterCI, myMovie, NULL, 0, 0, &canceled);

For additional details, see “Movie Exporter Properties”.





Last updated: 2005-11-09





Copyright © 2007 Apple Inc.
All rights reserved.