QuickTime Movies

When working with the QuickTime API, nearly all operations are performed on a data structure known as a QuickTime movie. The QuickTime movie is a description of a multimedia presentation. It tells a computer (or other multimedia-capable device) how to locate, interpret, and present the media that make up the presentation.

You can use QuickTime movies in several ways. For example, to play an MP3 audio file using QuickTime, you create a new movie in memory from the MP3 file and play the movie. This does not directly copy the MP3 audio data into memory; it creates a small movie data structure that allows QuickTime to find, decompress, and play the audio data in the MP3 file.

A QuickTime movie does not contain sample data, such as audio samples or video frames. A movie is the organizing principle that allows a computer to locate and interpret the required sample data. Playing a movie causes QuickTime to locate and obtain sample data from wherever it is, decompress and composite it as necessary, and present it in the proper sequence and arrangement.


Figure 1-5  Sample data resides outside of the movie


High-level QuickTime operations, such as opening and playing movies, can often be performed with no need to delve into any details of a particular movie, such as what kind of media are presented, how the media are compressed, or where the data samples are stored. Still, a basic understanding of QuickTime movie structure is useful for any QuickTime programmer and is essential for lower-level operations.

In this section:

Movies and Movie Files
Tracks
Samples
Sample Duration and Frame Rate
Time
Linear and Nonlinear Media and Movies


Movies and Movie Files

It’s important to distinguish between a QuickTime movie, the data structure we have been discussing, and a QuickTime movie file. A movie is not the same as a movie file.

A movie file can contain a stored copy of a movie data structure, or it can contain only a reference to such a structure, stored somewhere else.

If a movie file contains a stored movie, it can optionally contain the sample data used by the movie as well. This is sometimes called a self-contained movie file, and it is quite common. When the sample data is stored in a movie file, it is interleaved for smooth playback.


Figure 1-6  Three kinds of movie files


In casual use, a QuickTime movie file is sometimes simply called a movie. Similarly, a reference movie file may be called a reference movie, and a self-contained movie file may be called a self-contained movie. But in the QuickTime API documentation, the word “movie” always refers to a movie data structure, not a movie file. It is sometimes useful to think of a movie in memory as an instance of a movie stored in a file.

The copy of a movie stored in a movie file is sometimes referred to as a movie resource to distinguish it from a movie in memory.

Note: In early versions of QuickTime, the movie data structure was stored in the resource fork of Mac OS files; hence the name “movie resource.” This is no longer the case, but the name remains, and the distinction is sometimes useful.

Tracks

A QuickTime movie is organized into tracks. A movie can contain many tracks; there are practical limits, which change as computers become more powerful, but there is no predefined limit.

Track Media, Compression, and Data References

Each track specifies a media type—such as video, sound, or text—and a data reference that specifies where the sample data for that track can be found. A track may also specify a compression format (such as JPEG video or GSM audio).

The data reference may be to a local file, a file on a network or Internet server, a data stream from a network or Internet server, or a handle or pointer to a block of memory; other data reference types are also possible and the type itself is extensible. Simply put, the movie data can be anywhere. A data reference identifies the data source.

Different tracks can specify the same data source or different data sources. All the movie’s media samples can be in a single file, for example, or the samples for a movie’s sound track can be in one file while the samples for the video track are stored in a different file.


Figure 1-7  Movies can use data from multiple sources


A given track can specify only one media type, and most tracks get all of their samples from a single source. Some media types support multiple sources, however. For example, a video track can consist of a series of JPEG images, each stored in a separate file. In this case, there is a data reference for every image.

Different tracks can be of the same media type or of different media types—you can have multiple video tracks and multiple sound tracks in the same movie, for example, or multiple text tracks in different languages.

A given track can use only one type of compression, but multiple tracks of the same media type may be compressed differently in the same movie. For example, a single movie can contain both MP3 and MPEG-4 compressed audio tracks.
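The relationships described in this section can be modeled with a few small data classes. This is only an illustrative sketch in Python, not the QuickTime API: the class names, field names, and file names are all hypothetical, and a real data reference is a richer structure than a string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    """Hypothetical model of a track: one media type, one compression format."""
    media_type: str                    # e.g. "video", "sound", "text"
    data_refs: List[str]               # where the sample data lives (simplified to names)
    compression: Optional[str] = None  # e.g. "JPEG", "MP3"; None if unspecified


@dataclass
class Movie:
    """Hypothetical model of a movie: a list of tracks and no sample data."""
    tracks: List[Track] = field(default_factory=list)

    def data_sources(self) -> set:
        """Return every distinct data source referenced by any track."""
        return {ref for track in self.tracks for ref in track.data_refs}


movie = Movie(tracks=[
    Track("video", ["clip.mov"], compression="JPEG"),
    Track("sound", ["clip.mov"], compression="MP3"),           # same source as the video
    Track("sound", ["narration.mp4"], compression="MPEG-4"),   # a different source
    Track("video", ["slide1.jpg", "slide2.jpg"], "JPEG"),      # one reference per image
])

print(sorted(movie.data_sources()))
```

Note that the movie object holds only descriptions and references; the two sound tracks use different compression formats, and the last video track draws on more than one source.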

Track Visual and Sound Characteristics

Visual tracks have properties such as height, width, x and y offsets, layer numbers, and graphics modes. This allows you to play multiple visual tracks at the same time: side by side, partly or completely overlapping, and with various degrees of transparency or translucence. Visual tracks also contain a transformation matrix that can be used for rotating, scaling, or skewing the visual output of a track at runtime. QuickTime automatically clips images at the track boundary, and a track can have an associated mask, or matte, for cropping the output using arbitrary shapes.
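To make the idea of a transformation matrix concrete, here is a minimal sketch of mapping a point through a 3x3 matrix. It assumes a row-vector convention ([x y 1] multiplied by the matrix), uses plain floating-point numbers, and ignores the third column; the function names are hypothetical and are not QuickTime calls.

```python
def apply_matrix(matrix, x, y):
    """Map (x, y) through a 3x3 matrix, treating the point as the row vector [x y 1]."""
    (a, b, _u), (c, d, _v), (tx, ty, _w) = matrix
    return (a * x + c * y + tx, b * x + d * y + ty)


def scale_and_offset(sx, sy, tx, ty):
    """Build a matrix that scales by (sx, sy) and then offsets by (tx, ty)."""
    return [[sx, 0.0, 0.0],
            [0.0, sy, 0.0],
            [tx, ty, 1.0]]


# Show a track at half size, offset 100 units right and 50 units down.
half_size = scale_and_offset(0.5, 0.5, 100.0, 50.0)
print(apply_matrix(half_size, 320.0, 240.0))   # (260.0, 170.0)
```

Because the matrix is applied when the track is drawn, the stored samples themselves are not modified.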

Sound tracks have properties such as volume and balance, allowing you to create layers of sound. Multichannel sound formats, such as four-speaker and 5.1 surround sound, are supported in QuickTime 7 and later.

Track Media

A track also contains a data structure known as a media. This is a low-level data structure that describes the location, duration, and natural time scale of the media sample data. This can be confusing, because in casual use the sample data itself is sometimes referred to as the track’s media.

Important: Try not to confuse the media data structure with the data samples themselves.

When a QuickTime function or data type specifies a media as a parameter or field, it always refers to the media data structure inside a movie, not to actual data samples.

Media Time Scale

A QuickTime movie always has a time scale, expressed in units per second. You can specify a time scale when you create a movie, but the time scale cannot be changed once a movie exists. When you perform operations on a QuickTime movie, you frequently need to specify a point in the movie timeline at which to begin the operation; this is specified using a time value, expressed in movie time scale units. You may also need to specify a duration; this is also expressed in movie time scale units.

The default movie time scale is 600, so to advance a movie to a point 2 seconds into its duration, you would typically go to time 1200. Similarly, a duration of 1/30th of a second would be 20 time scale units. You can obtain the movie time scale by calling GetMovieTimeScale.
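The arithmetic above can be sketched in a few lines. The helper names below are hypothetical (they are not QuickTime functions); only the default time scale of 600 comes from the text.

```python
from fractions import Fraction

DEFAULT_MOVIE_TIME_SCALE = 600  # units per second, as described above


def seconds_to_time_value(seconds, time_scale=DEFAULT_MOVIE_TIME_SCALE):
    """Convert seconds to time scale units, rounding to the nearest unit."""
    return round(Fraction(seconds) * time_scale)


def time_value_to_seconds(time_value, time_scale=DEFAULT_MOVIE_TIME_SCALE):
    """Convert time scale units back to an exact number of seconds."""
    return Fraction(time_value, time_scale)


print(seconds_to_time_value(2))                # 1200
print(seconds_to_time_value(Fraction(1, 30)))  # 20
print(time_value_to_seconds(300))              # 1/2
```

One convenient property of 600 is that it divides evenly by 24, 25, and 30, so a single frame at any of those rates lasts a whole number of units (25, 24, and 20 respectively).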

Tracks use the time scale of their parent movie. Time values and durations for all track operations are expressed in movie time scale units.

Each track’s media, however, has its own time scale, which is typically the sample rate of the track’s media data. For example, a track containing NTSC video might have a time scale of 30, while a track containing PAL video would have a time scale of 25, and a track containing CD audio would have a time scale of 44100. This allows you to conveniently refer to individual media data samples, increment through a group of samples, and so on.

Operations on individual media samples typically use times and durations expressed in the media time scale. You can obtain the media time scale for a given track by calling GetMediaTimeScale.

There are utility functions for translating between track time (which is also movie time) and media time. There are also numerous functions that allow you to translate between the time domain (time and duration) and the sample domain (sample number and number of samples).
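As a rough illustration of these translations, the sketch below rescales a movie time to a media time and then to a sample number. It assumes an unedited track that starts at movie time 0 and samples of uniform duration; the function names are hypothetical stand-ins, not the actual QuickTime utility functions.

```python
def movie_to_media_time(movie_time, movie_scale, media_scale):
    """Rescale a time value from movie units to media units (rounding down)."""
    return movie_time * media_scale // movie_scale


def media_time_to_sample_index(media_time, sample_duration):
    """Zero-based index of the sample covering media_time, for uniform samples."""
    return media_time // sample_duration


movie_time = 1200  # two seconds at the default movie time scale of 600

pal_time = movie_to_media_time(movie_time, 600, 25)      # PAL video media
cd_time = movie_to_media_time(movie_time, 600, 44100)    # CD audio media
print(pal_time, cd_time)                                 # 50 88200

# With a media time scale equal to the sample rate, each sample lasts 1 unit.
print(media_time_to_sample_index(pal_time, 1))           # 50
```

The same point on the movie timeline therefore corresponds to a different time value in each media, which is why operations on samples need to know which time scale they are working in.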

Track Edit List

Each track contains an edit list, which allows you to alter or reorder the display of media samples without changing or rearranging the samples themselves. This results in nondestructive editing. You can “edit out” a track segment without deleting any samples from the data source, or repeat a segment without increasing the size of the data source with duplicate samples.

You can also use the edit list to alter the duration of a media segment, causing it to play back faster or more slowly than it normally would, or insert an “empty” track segment that contains no data for a period of time. In other words, any segment of media time, including an empty segment, can be mapped to any segment of track time.

If a track has not been edited, the edit list is empty and the track is treated as a single segment, with all the media samples played in the order they are stored, at the natural time scale for the media sample data.
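The mapping from track time to media time can be sketched as a list of segments. This is a simplified model under stated assumptions: the `Edit` class and its fields are hypothetical, track durations are in movie units (600 per second here), and media values are in media units (30 per second here).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Edit:
    track_duration: int          # length of the segment on the track timeline (movie units)
    media_start: Optional[int]   # first media time of the segment, or None if empty
    media_duration: int = 0      # amount of media covered by the segment (media units)


def track_to_media_time(edits: List[Edit], track_time: int) -> Optional[int]:
    """Return the media time shown at track_time, or None in an empty segment or past the end."""
    segment_start = 0
    for edit in edits:
        segment_end = segment_start + edit.track_duration
        if segment_start <= track_time < segment_end:
            if edit.media_start is None:
                return None
            offset = track_time - segment_start
            # Scale the offset so media_duration units fill track_duration units.
            return edit.media_start + offset * edit.media_duration // edit.track_duration
        segment_start = segment_end
    return None


edits = [
    Edit(600, None),      # one second with no data
    Edit(1200, 0, 60),    # media 0-60 played at normal speed (2 seconds)
    Edit(600, 90, 60),    # skips media 60-90, then plays 2 seconds of media in 1 second
    Edit(1200, 0, 60),    # repeats the first segment without duplicating samples
]

print(track_to_media_time(edits, 300))    # None (empty segment)
print(track_to_media_time(edits, 1200))   # 30
print(track_to_media_time(edits, 2100))   # 120
```

None of these edits touches the sample data: skipping, repeating, and changing speed are all expressed purely as entries in the list.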

Track Duration

Each track has a duration, which is the combined duration of all segments in its edit list (typically the combined duration of all of its samples), including any “empty” segments.

Similarly, each movie has a duration, which is simply the duration of its longest track.

Note: Technically, it is possible to offset the beginning of a track from the beginning of a movie by a fixed amount of time. In practice, this is rarely done; instead, an “empty” segment is inserted at the beginning of the track’s edit list, so that the first sample is displayed after a fixed amount of time.
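The duration rules above reduce to a sum and a maximum. In this sketch each track is represented only by the durations of its edit segments, in movie time scale units; the function names and the example values are hypothetical.

```python
def track_duration(segment_durations):
    """A track's duration is the sum of its edit segments, empty ones included."""
    return sum(segment_durations)


def movie_duration(tracks):
    """A movie's duration is the duration of its longest track."""
    return max((track_duration(segments) for segments in tracks), default=0)


video_segments = [600, 1200, 600]   # begins with a 600-unit empty segment
sound_segments = [1500]             # a single unedited segment

print(track_duration(video_segments))                    # 2400
print(movie_duration([video_segments, sound_segments]))  # 2400
```

Here the leading empty segment delays the first video sample by one second (at 600 units per second) and counts toward the track's duration.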

Samples

At the lowest level, a QuickTime track contains a set of sample tables. Each entry in a sample table specifies the location and duration of a chunk of sample data, such as a still image, a video frame, a sequence of PCM audio samples, or a text string.

There is at least one sample description for each table of samples. The sample description provides the details necessary to translate a stored sample into a format that the media handler can work with. For example, a sample description might specify the height, width, and pixel format of an image, or the sample size and sampling rate of a group of PCM audio samples.

For some media types, such as sound, all data samples in a given track share a single sample description. If you have audio samples that use different sample rates or sample sizes, for example, they must be in separate sound tracks.

Other media types can have multiple sample descriptions, so a series of images could have varying heights and widths, with different sample descriptions used whenever the dimensions change.

Sample Duration and Frame Rate

Because each chunk of sample data has its own duration, and a chunk can be as small as a single sample, a QuickTime track may not have any fixed “frame rate.” A video track might consist of a series of images that act as a slideshow, for example, with each “slide” on screen for a different length of time.

This can be very difficult to grasp if you are used to working in media with fixed frame rates, but it is a powerful feature of QuickTime. A fixed frame rate would require images to be repeated periodically, perhaps many times, to display them on screen for an extended period; in QuickTime, each image can be stored as a single sample with its own unique duration.
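A small sketch may help here: with per-sample durations, finding the image on screen at a given time is a lookup against cumulative durations rather than a multiplication by a frame rate. The function name and the slide durations are hypothetical, and times are assumed to be in units of a 600-per-second time scale.

```python
from bisect import bisect_right
from itertools import accumulate


def sample_at(durations, media_time):
    """Index of the sample on screen at media_time, or None outside the track."""
    end_times = list(accumulate(durations))   # time at which each sample ends
    if media_time < 0 or not end_times or media_time >= end_times[-1]:
        return None
    return bisect_right(end_times, media_time)


# Four slides shown for 3 s, 1 s, 5 s, and 2 s.
slide_durations = [1800, 600, 3000, 1200]

print(sample_at(slide_durations, 0))      # 0
print(sample_at(slide_durations, 1800))   # 1
print(sample_at(slide_durations, 5399))   # 2
print(sample_at(slide_durations, 6600))   # None
```

Each slide is stored once, however long it stays on screen; a fixed 30-fps representation of the same 11 seconds would need 330 frames.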

By extension, a QuickTime movie does not necessarily have a fixed frame rate. A 25-fps PAL video track may play side by side with a 30-fps NTSC video track in the same movie, for example, perhaps with both tracks composited on top of a still image that is displayed for the entire duration of the movie, or on top of a “slideshow” track that changes at irregular intervals. This is possible because the display is created at runtime by a programmable device, not mechanically projected by display hardware.

Of course, a QuickTime track, or a QuickTime movie, may have a frame rate; it is very common for a video track to contain a series of samples that all have the same duration, and it is also common for a movie to have a single video track with a constant sample rate. But it is not a requirement.

You can always compute an average frame rate by dividing the total number of video samples by the duration of the track in seconds, but be aware that the results of this calculation are not always predictive of the movie’s behavior; the actual frame rate could change abruptly at several points during the movie.
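To see why an average can mislead, consider this sketch, which computes the average frame rate as the number of samples divided by the duration in seconds. The function name and sample durations are hypothetical, and durations are assumed to be in units of a 600-per-second time scale.

```python
from fractions import Fraction


def average_frame_rate(sample_durations, time_scale):
    """Samples per second averaged over the whole track, or None if it has no duration."""
    total_units = sum(sample_durations)
    if total_units == 0:
        return None
    return Fraction(len(sample_durations) * time_scale, total_units)


steady = [20] * 60                # 60 frames of 1/30 s each
uneven = [20] * 30 + [600] * 3    # 1 s of 30-fps video, then three 1-second stills

print(average_frame_rate(steady, 600))          # 30
print(float(average_frame_rate(uneven, 600)))   # 8.25
```

The second track never actually plays at 8.25 frames per second: it runs at 30 for one second and at 1 for the next three.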

Time

As noted in the discussion of tracks, a movie has a time scale, as does the media for each track. A time scale specifies some number of units per second. For a media, the time scale is usually the sample rate. For a track or a movie, the time scale can be any convenient number (the track time scale is the same as the movie time scale).

Note: The default time scale for a movie is 600 units per second. You can specify a time scale when you create a movie or add a blank media to a new track.

The relationship among the movie’s time scale and the time scales of the various media defines the movie’s time-coordinate system. QuickTime uses the movie’s time-coordinate system to synchronize all the tracks and media to the movie timeline.

A movie always has a current time, which designates what parts of the movie should be presented immediately. The current time is expressed in movie time-scale units. For example, if the movie time-scale is 600, and the movie has been playing for half a second, the current time is 300.

The current time can range from 0 to the movie’s duration. Current time changes as the movie plays. Dragging the playhead in a movie controller changes the current time in the movie.

A movie also has a rate, which is 0 when the movie is stopped and 1 when the movie is playing at its normal speed, which is defined by the movie time-scale. For example, a movie with a time-scale of 600 plays at 600 units per second when the rate is 1. Negative rates cause the movie to play backward. Rates greater or less than 1 cause the movie to play faster or slower than normal. For example, a movie with a time-scale of 600 plays at 300 units per second when the rate is 0.5, and at 1200 units per second when the rate is 2.
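The effect of the rate on the current time can be sketched as follows. This is a simplified model, not QuickTime's implementation: the function names are hypothetical, and the clamping reflects the statement above that the current time ranges from 0 to the movie's duration.

```python
def units_per_second(time_scale, rate):
    """How fast the current time advances, in time scale units per real second."""
    return time_scale * rate


def advance(current_time, time_scale, rate, elapsed_seconds, duration):
    """Current time after elapsed_seconds of real time, kept within the movie."""
    new_time = current_time + units_per_second(time_scale, rate) * elapsed_seconds
    return max(0, min(duration, new_time))


print(units_per_second(600, 0.5))       # half speed
print(units_per_second(600, 2))         # double speed
print(advance(0, 600, 1, 0.5, 6000))    # half a second of normal playback
print(advance(300, 600, -1, 1, 6000))   # playing backward stops at the start
```

A rate of 0 leaves the current time unchanged no matter how much real time passes, which is what it means for the movie to be stopped.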

QuickTime establishes a playback time base when a movie is run. The time base consists of the movie’s time-coordinate system, a rate, a current time, and a reference to a clock component that provides QuickTime with measurements of real time. This allows QuickTime to play a movie at the correct number of time-scale units per second for the current rate in real time.

This also allows QuickTime to “drop frames” appropriately if the data rate of the movie exceeds the capability of the playback device, so tracks remain synchronized with each other and with real time specifications (for example, a one-minute movie plays in exactly one minute, even if the playback device cannot decompress all of the movie’s video frames in that length of time).
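The following toy simulation illustrates the idea of dropping frames to stay on schedule. It is a deliberately simplified sketch, not QuickTime's scheduler: it assumes each displayed frame costs a fixed amount of time to prepare, measures everything in movie time units, and uses a hypothetical function name.

```python
from bisect import bisect_right


def frames_shown(frame_starts, duration, decode_units):
    """Return the frames displayed when each one takes decode_units to prepare."""
    shown = []
    now = 0  # current movie time, driven by the clock rather than by the frame count
    while now < duration:
        index = bisect_right(frame_starts, now) - 1   # the frame that is due right now
        if index >= 0 and (not shown or shown[-1] != index):
            shown.append(index)                       # frames skipped over are dropped
        now += decode_units
    return shown


# One second of 30-fps video at a time scale of 600: a frame every 20 units.
starts = list(range(0, 600, 20))

fast = frames_shown(starts, 600, 20)   # keeps up: all 30 frames
slow = frames_shown(starts, 600, 50)   # falls behind: shows only some frames
print(len(fast), len(slow))            # 30 12
print(slow)
```

In both cases playback covers the same 600 units of movie time; the slower device simply displays fewer frames along the way, so the movie still ends on time.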

Linear and Nonlinear Media and Movies

Linear media, such as a series of consecutive video frames, are tied to the movie timeline; they change in a fixed manner as the current time changes, varying in tempo and direction with the movie’s rate.

QuickTime also supports nonlinear media, such as a bouncing sprite, whose actions can be specified with respect to the passage of real time, or with respect to user actions such as mouse clicks. These actions can continue even when the movie is paused (has a rate of 0). This makes it possible to embed customized controls in a movie that respond to user interaction.

Movies that normally play at a fixed rate are called linear movies and typically feature a controller with a play/pause button and a time-slider. Movies that are nonlinear may feature a different type of controller or no controller at all.

For example, a VR panorama is usually controlled by a special VR controller that changes the image in response to the keyboard and mouse. A VR movie normally has a rate of 0, because it consists of a still image that the user can interact with. The VR image is nonlinear; it does not change in a fixed manner during the movie timeline, but in response to unpredictable user actions.

Nonlinear movies can use the movie timeline to separate distinct behaviors. For example, if a panorama has multiple nodes, each node is located at a different point on the movie timeline to keep them from displaying simultaneously; jumping to a new node involves changing the current time, typically while leaving the rate at 0.

It is possible to mix linear and nonlinear media in the same movie. To add sound to a VR panorama, for example, the duration of the VR image is extended to match the duration of the sound track. When the rate is nonzero, the sound plays. The display of the panorama remains nonlinear, however; it changes when the user interacts with it, without regard to the movie’s current time or rate (as long as the current movie time is within the VR image’s duration). If the movie is paused, for example, the sound stops playing but the VR image remains interactive.

When mixing linear and nonlinear media, it is sometimes necessary to create custom movie controls. For example, the VR controller has no play/pause button to start and stop a sound track. You can control the movie rate programmatically from your application or add an interactive sprite to the movie, such as a play/pause button, to provide user control.





Last updated: 2005-08-11





Copyright © 2007 Apple Inc.