Technical Q&A QA1539

How do I create a QuickTime movie from PCM audio samples in memory?

Q:  I'm trying to create a QuickTime movie from a memory buffer of PCM audio samples (Stereo, 22.050 kHz) but I'm not having any luck. When I play the resulting movie all I get is silence. Also, how do I properly fill out a SoundDescription structure to describe my audio?

A: There are several ways to turn an in-memory buffer of audio samples into the audio track of a movie. One way is to create an empty movie, create a new movie track and track media as described by a SoundDescription structure, and insert your audio samples into the track media using AddMediaSample2. The code snippet in Listing 1 shows this technique.

To create a SoundDescription correctly, first fill out an AudioStreamBasicDescription structure (the fundamental format description in Core Audio) with fields that match your encoding, then call the QTSoundDescriptionCreate function to translate those settings into a proper SoundDescription.

Listing 1  Creating a movie from PCM audio data in memory.

#import <QuickTime/QuickTime.h>
 
// Constants for use when creating our movie track and media
 
static const TimeValue  kSoundSampleDuration    = 1;
static const TimeValue  kTrackStart             = 0;
static const TimeValue  kMediaStart             = 0;
 
// These are custom settings which describe our audio samples.
// You'll want to change these to properly describe your own audio.
 
static const UInt32                 kNumChannels            = 2;
static const Float64                kSampleRate             = 22050.;
static const AudioChannelLayoutTag  kMyAudioChannelLayout   = kAudioChannelLayoutTag_Stereo;
static const long                   kNumSamples             = 11025; // 0.5 seconds at 22050 Hz
 
/*
 
createSoundDescription
 
Creates a sound description structure of the requested kind
from an AudioStreamBasicDescription, optional audio channel
layout, and optional magic cookie.
 
outDescHndl - pointer to a handle (empty) in which to copy
the new sound description
 
*/
 
-(OSErr) createSoundDescription: (SoundDescriptionHandle *)outDescHndl
{
    assert(outDescHndl != NULL);
 
    AudioStreamBasicDescription asbd = {0}; //see CoreAudioTypes.h
 
    asbd.mSampleRate           = kSampleRate;
    asbd.mFormatID             = kAudioFormatLinearPCM;
    asbd.mFormatFlags          = kAudioFormatFlagsNativeFloatPacked;
    // if multi-channel, the data format must be interleaved (non-interleaved is not allowed),
    // and you should set up the asbd accordingly
    asbd.mChannelsPerFrame     = kNumChannels; // 2 (Stereo)
    // mBitsPerChannel = number of bits of sample data for each channel in a frame of data
    asbd.mBitsPerChannel       = sizeof (Float32) * 8; // 32-bit floating point PCM
    // mBytesPerFrame = number of bytes in a single sample frame of data
    // (bytes per channel) * (channels per frame) = 4 * 2 = 8
    asbd.mBytesPerFrame        = (asbd.mBitsPerChannel>>3) // number of *bytes* per channel
                                  * asbd.mChannelsPerFrame; // channels per frame
    asbd.mFramesPerPacket      = 1; // For PCM, frames per packet is always 1
    // mBytesPerPacket = (bytes per frame) * (frames per packet) = 8 * 1 = 8
    asbd.mBytesPerPacket       = asbd.mBytesPerFrame * asbd.mFramesPerPacket;
 
    // The AudioChannelLayout is used to specify channel layouts
    // (see CoreAudioTypes.h) and consists of the following:
    // - a tag that indicates the layout
    // - channel usage bitmap (used if a "named" tag can't be found
    //        to describe the layout)
    // - a variable length array of AudioChannelDescriptions
    //        that describe the layout/position of a speaker (but if the
    //        tag field is non-zero it refers to one of the standard
    //        "named" layout tags, so the individual channel descriptions
    //        are just there to be more descriptive).
 
    UInt32 layoutSize;
    layoutSize = offsetof(AudioChannelLayout, mChannelDescriptions[0]);
 
    AudioChannelLayout *layout = NULL;
    layout = calloc(layoutSize, 1); // make sure all fields start cleared
    OSErr err = -1;
    if (layout != NULL)
    {
        // You must specify a tag identifying a particular pre-defined
        // channel layout, as there are many different layouts to choose from.
        // In this case we are specifying kAudioChannelLayoutTag_Stereo:
        // a standard stereo stream (L R) - implied playback
        layout->mChannelLayoutTag = kMyAudioChannelLayout;
 
        err = QTSoundDescriptionCreate(
                    &asbd,              // format description
                    layout, layoutSize, // channel layout
                    NULL, 0,            // magic cookie (compression parameters)
                    kQTSoundDescriptionKind_Movie_LowestPossibleVersion,
                    outDescHndl); // SoundDescriptionHandle returned here
        free(layout);
    }
 
    return err;
}
 
 
/*
 
createMovieFromAudioData
 
Create a movie with a sound track containing the specified
audio data.
 
inAudioData - pointer to your audio data
inAudioDataSize - size of your audio data
outMovie - pointer to the resulting movie file
 
*/
 
-(OSErr) createMovieFromAudioData:(const void *)inAudioData
                         dataSize:(long)inAudioDataSize
                            movie:(Movie *)outMovie
{
    assert(inAudioData != NULL);
    assert(inAudioDataSize != 0);
    assert(outMovie != NULL);

    // Declare everything referenced at the bail: label up front, so
    // each has a defined value no matter where we bail out.
    OSErr err = noErr;
    SoundDescriptionHandle hSoundDesc = NULL;
    Handle dataRef = NULL;
    Handle hMovieData = NULL;

    *outMovie = NULL;

    // create an empty movie to which we'll add our audio data
    // as a sound track
    *outMovie = NewMovie(0);
    if (*outMovie == NULL) { err = GetMoviesError(); goto bail; }

    // Create a sound description for our audio data
    err = [self createSoundDescription:&hSoundDesc];
    if (err != noErr) goto bail;

    Track track = NULL;
    // create a movie track to hold our sound media
    track = NewMovieTrack(*outMovie, 0, 0, kFullVolume);
    err = GetMoviesError();
    if (err != noErr) goto bail;

    // create a data reference for storage to hold our media
    // data, because when you create an "empty" movie with
    // NewMovie() there is no designated storage for the movie
    // media.
    hMovieData = NewHandle(0);
    err = PtrToHand(&hMovieData, &dataRef, sizeof(Handle));
    if (err != noErr) goto bail;
 
    // get the sample rate value for our data from the asbd so
    // we can use it when creating our track media
    AudioStreamBasicDescription asbd = {0};
    OSStatus status = QTSoundDescriptionGetProperty (
                hSoundDesc,
                kQTPropertyClass_SoundDescription,
                kQTSoundDescriptionPropertyID_AudioStreamBasicDescription,
                sizeof(asbd), &asbd, NULL);
    if (status != 0) goto bail;
 
    Media media = NULL;
    // create a media for our new track; we'll add our audio
    // samples to this media
    media = NewTrackMedia(track, SoundMediaType,
                asbd.mSampleRate, // media time scale
                dataRef, HandleDataHandlerSubType); // movie data reference
    err = GetMoviesError();
    if (err != noErr) goto bail;
 
    err = BeginMediaEdits(media);
    if (err != noErr) goto bail;
 
    // Add sample data and sample description for our audio data
    // to the track media.
    err = AddMediaSample2 (media,
                           inAudioData, // ptr to our audio data
                           inAudioDataSize, // audio data size
                           /*
                              decodeDurationPerSample
                              The duration of each sample to be added,
                              representing the amount of time (in the
                              media's time scale) that passes while
                              the sample data is being displayed. Since
                              we are adding sound that was sampled at
                              22 kHz to media that contains a sound track
                              with the same time scale we set
                              durationPerSample to 1.
 
                              In CoreAudio, sample = frame. A frame is
                              an individually accessible uncompressed
                              pcm sample of data. When dealing with PCM,
                              1 packet = 1 frame. But for compressed
                              formats, 1 packet often equals a lot of frames.
                              For instance, 1 AAC packet = 1024 frames.
                              */
                           kSoundSampleDuration, // duration per sample = 1
                           0,
                           (SampleDescriptionHandle)hSoundDesc,
                           kNumSamples,
                           0, // 0 = no flags
                           nil);
 
    // EndMediaEdits must balance BeginMediaEdits; keep the first
    // error we encountered.
    OSErr endErr = EndMediaEdits(media);
    if (err == noErr) err = endErr;
    if (err != noErr) goto bail;
 
    // Insert a reference to the media segment into the track.
    err = InsertMediaIntoTrack(track,
                            kTrackStart,    // track start time
                            kMediaStart,    // media start time
                            GetMediaDuration(media),
                            fixed1);
 
bail:
    if (hSoundDesc != NULL)
    {
        DisposeHandle((Handle)hSoundDesc);
    }
    if (err != noErr)
    {
        if (*outMovie != NULL)
        {
            DisposeMovie(*outMovie);
        }
        if (hMovieData != NULL)
        {
            DisposeHandle(hMovieData);
        }
    }
 
    return err;
}

Document Revision History


Date        Notes
2009-08-27  Editorial
2007-08-29  New document that describes how to create a QuickTime movie from PCM audio samples in memory