Apple Developer Connection

Movies

Contents

The Movie

Movie Time - Time Coordinate System

Time Scale

Media Time Scale

Playback Time Base

Movie Structure

Tracks and Media

Media References

Data References

Graphic Transformations

Clipping

Matrix

Layering, Track Enabling and Quality

Graphics Modes

Sound Tracks

Volume

Balance

References


The Movie

QuickTime uses the metaphor of a movie to describe time-based data. Any time-based data can be organized as a movie (audio, video or both). Movies are containers that hold all of the information needed to organize data in time, but not the data itself.

The term container is not a specific or special term in QuickTime, but is used in the general sense suggesting a place where data is stored and accessed.

Movies are made up of data streams called tracks. Each track references and organizes a sequence of data of the same type in a time-ordered way. Tracks contain media structures that reference the actual data. Media is organized within a track, and chunks of media data are called media samples.

A movie file typically contains a movie structure and its media bundled together so you can download or transport everything together. Movies can also contain references to media not stored locally, for example a URL. You can combine these together, having some data references to local media and other references to media stored elsewhere.

Movie Time - Time Coordinate System

QuickTime movies organize media along the time dimension. To manage this time dimension QuickTime defines a time coordinate system. This coordinate system locks movies and media data structures to a common measurement, the second. Each time coordinate system establishes a time scale and this time scale establishes the translation between real time and movie time. Time scales are marked in time units of so many units per second.

The time coordinate system also defines duration. This is the length of a movie or media structure in terms of time units. Therefore, a particular time in a movie can be identified by the number of time units elapsed to that point.

Each track in a movie contains a time offset and duration. These attributes determine when a track begins playing and for how long. Each media structure also has its own time scale, which determines the default time units for data samples of that media type.

Time Scale

The time scale is the master ruler for a movie. Every event within a movie is measured and located by the movie's time scale, which is expressed in units per second. The duration of a movie element is the number of time scale units from its beginning to its end. Each track in a movie also has an offset that is specified in time scale units. This offset is the beginning point for the track. A track that begins when the movie begins has an offset of 0.

Every movie has a time coordinate system, but these systems may differ from movie to movie.

The time scale in a movie's time coordinate system should be a convenient number of fractions of a second. The scale number should be one that makes it easy to translate movie times into other time scales.

A movie time scale of 600 (the default time scale for a newly created movie) divides evenly by a playback rate of 24 frames per second: each frame is displayed for 25 time scale units (600 / 24 = 25). In the same way, 600 works well for 30, 50 and 60 frames per second. Media with high sample rates (such as digital audio, which can have a sample rate of 44,100 samples per second) are specified as so many samples per time scale unit.
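The arithmetic above can be sketched in plain C. This is only an illustration, not QuickTime API code; the helper name units_per_frame is hypothetical and chosen for this example.

```c
/* Hypothetical helper (not a QuickTime call): returns how many movie
   time scale units each frame occupies, or 0 when the frame rate does
   not divide the time scale evenly. */
long units_per_frame(long time_scale, long frames_per_second)
{
    if (frames_per_second <= 0 || time_scale % frames_per_second != 0)
        return 0;
    return time_scale / frames_per_second;
}
```

With a time scale of 600 this yields 25 units per frame at 24 frames per second, 20 at 30, 12 at 50 and 10 at 60. It returns 0 for 44,100, which is why audio is described in samples per time scale unit rather than units per sample.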

A movie's duration is therefore the total number of its time scale units from end to end. A track in a movie can have a shorter duration if it doesn't extend to the end of the movie, and the media chunks that tracks reference usually have even shorter durations.

Media Time Scale

Each chunk of media that is referenced by a track has its own time scale determined by its sample rate. QuickTime will automatically translate between a movie's time scale and the time scales of its various media; an application doesn't need to do this.

For example, consider a movie containing a single video track, a single audio track and a text track. The movie has a duration of 2 seconds and a default time scale of 600. The video track starts at the beginning of the movie (offset 0) and goes to the end (duration 1200), at 30 frames per second (a media time scale of 30). The audio track starts at the beginning and goes to the end (offset 0, duration 1200), at 44,100 samples per second (a media time scale of 44100). The text track contains a single sample, the title, which appears 0.25 seconds into the movie and lasts for 1 second (offset 150, duration 600, media time scale 1).

If an edit action (for example, a copy operation) 0.5 seconds in duration (300 in the movie time scale) is performed in the middle of the movie, QuickTime correctly copies each track using the appropriate media time scale: 15 video frames, 22,050 audio samples and 1 text sample.
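The conversion QuickTime performs internally can be approximated as follows. This is a minimal sketch under the assumption of simple truncating integer arithmetic; movie_to_media_units is a hypothetical name, and QuickTime's own rounding behavior is not specified here.

```c
#include <stdint.h>

/* Hypothetical helper: convert a span measured in movie time scale
   units into units of a media time scale. A 64-bit intermediate keeps
   the multiplication from overflowing; the result truncates toward
   zero. Returns 0 if the movie time scale is not positive. */
int64_t movie_to_media_units(int64_t movie_units,
                             int32_t movie_scale,
                             int32_t media_scale)
{
    if (movie_scale <= 0)
        return 0;
    return movie_units * (int64_t)media_scale / movie_scale;
}
```

For the 300-unit selection in the example, this gives 15 for the video media (time scale 30) and 22,050 for the audio media (time scale 44100). The single text sample is presumably copied because it overlaps the selection, not because 300 movie units convert to a whole unit of its media time scale.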

Playback Time Base

QuickTime establishes the playback time base when a movie is run. It consists of the movie's time coordinate system, a rate, a concept of current time, and a reference to a clock component that provides QuickTime with measurements of real time.

Rate determines how many time scale units actually run per real-time unit. The rate also determines which way a movie plays: forward or backward. If the rate value is negative, the movie runs backward. If a movie with a time scale of 600 has a rate set to 1, QuickTime will process 600 scale units of the movie each second going forward, and the movie will run at normal speed. A rate of 0.5 results in 300 scale units of the movie being processed every second, so the movie plays at half speed. A rate of -1.0 means the movie will play backward at normal speed.

Current time is simply the location, expressed in the movie's time scale units, where the movie is currently playing. The value of current time can range from 0 to the movie's duration. Current time changes as the movie runs.

Consider a movie with a time scale of 600 running at a rate of 0.5. The movie's duration is 1200, and it contains a transition between two tracks (Graphic A and Graphic B) at time 600.

Because the movie's duration is 1200 with a time scale of 600, it would normally take 2 seconds to play (rate = 1). However, the rate in the example's playback time base is 0.5, so the movie runs at half speed and plays for 4 seconds.

When the current time is 600, the transition from Graphic A to Graphic B is in progress. Because the playback rate is 0.5, it has taken 2 real-time seconds to reach this current time. The viewer has seen 2 seconds of A and will now see 2 seconds of B.
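The relationship between duration, time scale and rate in this example can be written out as a short C sketch. The function name playback_seconds is hypothetical, and the code models only the arithmetic, not an actual time base or clock component.

```c
/* Hypothetical helper: real-time seconds needed to play a span of
   duration_units at the given rate. The sign of the rate only selects
   the direction, so its magnitude is used. Returns -1.0 when the span
   can never finish (rate of 0) or the time scale is invalid. */
double playback_seconds(long duration_units, long time_scale, double rate)
{
    double speed = rate < 0.0 ? -rate : rate;
    if (time_scale <= 0 || speed == 0.0)
        return -1.0;
    return (double)duration_units / (double)time_scale / speed;
}
```

For a duration of 1200 and a time scale of 600, this gives 2 seconds at a rate of 1 and 4 seconds at a rate of 0.5. Reaching time 600 at a rate of 0.5 takes 2 seconds, matching the example.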

Movie Structure

Movies act as QuickTime's central data structure and as this concept evolved, the movie structure acquired a number of basic elements:

  • There is only one movie structure with one set of software tools to create, edit, and run movies.
  • The movie is a container and does not contain any media. Instead, movies contain references, which tell QuickTime how to find chunks of media elsewhere.
  • These references are organized according to a master timeline.
  • Movies may contain multiple tracks. Each track can access and present a specific kind of media. Therefore, movies can present several different media chunks at the same time.

A movie contains general information about handling its media (such as decompression parameters) and QuickTime uses this information as a basis for selecting the most effective processing technique.

Movies also contain fields that perform global housekeeping roles for the whole movie.

A movie's global information includes:

  • Creation date and time and last modification date and time.
  • The movie's time scale and duration.
  • The current time - This changes as the movie plays. When a movie is saved to a movie file the current time is also saved and the movie will start playing where it left off the next time it's viewed.
  • Selection information - This information is saved from the last time the movie was edited. Selection information consists of a start time and duration for the current selection and is expressed in movie time scale units.
  • Active movie segment - This is defined by a start time and a duration in movie time scale units. The default active segment is the entire movie, but an application can set it to a smaller region if the movie's playback needs to be limited. QuickTime will not process the movie outside the active segment unless instructed to do so.
  • The movie's preview start time and duration - You can set the preview to be any short segment of the movie.
  • The time that defines the movie's poster - This is a "frame shot" from the movie to be used as a poster. A movie's poster is used as its default preview if a preview has not been set.
  • The transformation matrix and clipping region for the movie - These parameters control how QuickTime displays the visual elements in a movie.
  • The preferred rate and preferred volume for movie playback - QuickTime will use these parameters unless overridden by an application.
  • A general area for user data - This data can be any mixture of text and binary information. You can use this field to store copyright, credits etc.

QuickTime movies and tracks are private data structures; you don't access their fields directly.

Tracks and Media

The audiovisual parts of a QuickTime movie are its tracks. Each enabled track tells QuickTime how to fetch and present chunks of digital media.

Tracks are a familiar concept from audio technology where tracks can be mixed and overlapped to create different sound sequences. These sound sequences can then create a single sound output. QuickTime uses tracks in movies to combine data of many different types into a single user experience.

A movie may contain many tracks, each track accessing one media type. The type of media determines the type of track, which can range over all the data types QuickTime handles.

All the tracks in a movie use the movie's time coordinate system. The media the tracks access may use various time scales. For example, a video track may access 30 frames a second while an audio track may access 22,050 samples per second. QuickTime takes care of the coordination between the media time scale and the movie time scale.

Each track begins at the beginning of the movie, but the track's data might not begin until some time other than 0. This is called the track offset. The initial blank time is represented in an audio track by silence and in a video track by no image.

Tracks have a duration and each track may have a different duration. Some of these durations may be shorter than the movie's duration. A movie's duration will always equal the duration of the longest track.

Tracks contain a list of references, which identify portions of the media the track uses. These references are called edit lists. This allows flexible access to media; a track can play the underlying media data in any order and any number of times.
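To make the idea of an edit list concrete, here is a minimal C sketch. The EditSegment structure and track_to_media_time function are hypothetical and do not reflect QuickTime's private data layout. For simplicity the sketch assumes track time and media time are expressed in the same units.

```c
#include <stddef.h>

/* Hypothetical edit-list entry: one contiguous piece of media that the
   track plays. This is not the QuickTime file layout. */
typedef struct {
    long media_start; /* where the segment begins in the media */
    long duration;    /* how long the segment plays in the track */
} EditSegment;

/* Map a time on the track's timeline to a time in the underlying media
   by walking the segments in playback order. Returns -1 when the track
   time lies outside the edited track. */
long track_to_media_time(const EditSegment *edits, size_t count,
                         long track_time)
{
    long segment_start = 0; /* track time at which the segment begins */
    size_t i;

    if (track_time < 0)
        return -1;
    for (i = 0; i < count; i++) {
        if (track_time < segment_start + edits[i].duration)
            return edits[i].media_start + (track_time - segment_start);
        segment_start += edits[i].duration;
    }
    return -1;
}
```

An edit list of { {600, 300}, {0, 300}, {600, 300} } plays the media range 600-899 first, then 0-299, then 600-899 again, showing how the same media can be used out of order and more than once.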

Tracks also store several fields of housekeeping information specific to the track itself.

This information includes:

  • Creation date and time and last modification date and time.
  • A permanently assigned Track ID - an application can use this to locate the track.
  • An optional alternate group ID - This can be used to construct a movie with alternate tracks of the same media type, for example multiple sound tracks, each in a different language. Giving these tracks different track IDs but the same alternate group ID allows QuickTime to easily relate the tracks to one another.
  • Visual presentation information if the track is for visual media - This consists of width, height, clipping region and transformation matrix.
  • Volume and stereo balance values if the track is an audio track.
  • User data, such as a user-readable name for the track.

Media References

As with movies and tracks, QuickTime media are referenced by private data structures and QuickTime provides access to functions which let you work with this information.

QuickTime stores the following information in each media reference:

  • Creation date and time and last modification date and time.
  • Media's time coordinate system, time scale and duration - this is usually different from the time coordinate system of the movie that plays the media. An example of this would be a movie with a time scale of 600 including video media with a time scale of 30 fps. QuickTime will automatically keep the two systems in sync.
  • Media handler specification. A reference to the QuickTime component that must be used to interpret and display the media content.
  • Media information - including a specification of where the content is stored, what type it is, how it is compressed.
  • Language and playback quality.
  • User data. A general area for user data such as copyright information and credits.

Media references are small because they only point to the media that will actually be played, and they are included in the movie structure. To keep everything manageable when a movie and all its media are packed together into a single movie file, the media is usually compressed.

QuickTime will not only find and interpret media chunks but will also decompress them on the fly.

Data References

Movies obtain the actual media they play from several sources outside the movie structure.

The most common of these sources are:

  • Macintosh or Windows files on disk or CD-ROM.
  • Network sources, accessed through URLs.

Movies access media through QuickTime data references. When a movie accesses a chunk of media, QuickTime uses a function such as GetDataHandler to return a data handler component that can interpret the chunk.

All data handlers identify their data with data references. Data references specify the location of the data and which data handler is able to interpret the data. A data reference consists of the following two components:

Handle dataRef

OSType dataHandlerSubType

The dataRef value specifies the actual data reference. This is a handle to the information that identifies the data to be used. The type of information stored in the handle depends on the data reference type (dataHandlerSubType).

For example, if your application is loading a movie from a handle, this data reference handle would contain a handle to the movie data.

The dataHandlerSubType value specifies the type of data reference and can be one of the Macintosh OSTypes through which media can be accessed. The type of information stored in the handle depends on this data reference type. For example, for an alias data reference you set this to rAliasType ('alis'), indicating that the reference is an alias.

Listed below are all the available data reference types:

'alis' (rAliasType) Data reference is a Macintosh alias handle. An alias handle contains information about the file. For more information refer to the references provided at the end of this section.

'hndl' (HandleDataHandlerSubType) Data reference is a Macintosh handle whose contents are the handle that holds the actual data. In addition to specifying movie data in memory, this handle may contain data reference extensions containing the file type, file name, MIME type etc. as described in Technote 1195. For more information refer to the reference provided at the end of this section.

'rsrc' (ResourceDataHandlerSubType) Data reference is a Macintosh alias handle. However, appended to the end of the alias handle is the resource type (stored as an OSType) and ID (stored as a 16-bit signed integer) that identifies the resource within the specified file. Both of these values must be in big-endian format.

'url ' (URLDataHandlerSubType) Data reference is a handle whose data is a C-string (null-terminated) specifying a URL. A 'url ' data reference can have data reference extensions too, so the actual size of a URL data reference may be larger than the length of the string. QuickTime supports the 'file', 'http' and 'ftp' URL types, as well as the 'rtsp' type in special cases when opening movies.
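The byte-level details in this list can be illustrated without any QuickTime headers. The following C sketch uses hypothetical helper names (four_char_code, write_resource_suffix, url_data_ref_size) and plain byte buffers rather than Macintosh handles; it shows only how the values described above would be laid out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack a four-character code such as "url " into a 32-bit value with
   the first character in the most significant byte. */
uint32_t four_char_code(const char code[4])
{
    return ((uint32_t)(unsigned char)code[0] << 24) |
           ((uint32_t)(unsigned char)code[1] << 16) |
           ((uint32_t)(unsigned char)code[2] << 8) |
            (uint32_t)(unsigned char)code[3];
}

/* Write the six bytes that follow the alias in an 'rsrc' data
   reference: the resource type, then the 16-bit signed resource ID,
   both big-endian regardless of host byte order. Returns the number
   of bytes written. */
size_t write_resource_suffix(uint8_t out[6], uint32_t resource_type,
                             int16_t resource_id)
{
    uint16_t id = (uint16_t)resource_id;
    out[0] = (uint8_t)(resource_type >> 24);
    out[1] = (uint8_t)(resource_type >> 16);
    out[2] = (uint8_t)(resource_type >> 8);
    out[3] = (uint8_t)resource_type;
    out[4] = (uint8_t)(id >> 8);
    out[5] = (uint8_t)(id & 0xFF);
    return 6;
}

/* Minimum size of a URL data reference with no extensions: the string
   plus its terminating null byte. */
size_t url_data_ref_size(const char *url)
{
    return strlen(url) + 1;
}
```

Writing the bytes one at a time, rather than copying a native integer, is one way to guarantee big-endian order on both big-endian and little-endian hosts.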

Graphic Transformations

When a movie containing visual data is played, the movie's data is gathered from the appropriate tracks and media structures. As this data passes through QuickTime on its way to the outside world, certain actions may be performed on these chunks of data before they are finally presented to the user.

Clipping

Clipping is the process of limiting the spatial boundaries of an image before it's presented to the user. QuickTime lets you clip each graphic track separately, then clip the movie as a whole.

An image in a video or graphics track has a track rectangle, defined by a track coordinate system anchored to the upper-left corner of the track's image. When you apply a track clipping region to the track rectangle, their intersection becomes the track boundary region. This is the portion of the track's image the viewer will see.

Movies as a whole can also be clipped. The movie boundary region is the union of all the clipped track regions. It is the total space needed to contain any track image. By applying a movie clipping region this space can be cut back to the clipped movie boundary region. These spaces are all defined in terms of the movie coordinate system.
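The intersection and union steps can be sketched in C with one significant simplification: real QuickTime clipping regions can be arbitrary shapes, while this illustration assumes every region is a plain rectangle and approximates a union by its bounding box. The RectSketch type and the function names are hypothetical.

```c
/* Simplified stand-in for a region: an axis-aligned rectangle. */
typedef struct {
    int left, top, right, bottom;
} RectSketch;

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

int rect_is_empty(RectSketch r)
{
    return r.right <= r.left || r.bottom <= r.top;
}

/* Track rectangle intersected with a clipping rectangle gives the
   visible part; an empty result is normalized to all zeros. */
RectSketch rect_intersect(RectSketch a, RectSketch b)
{
    RectSketch r;
    RectSketch empty = {0, 0, 0, 0};
    r.left = max_int(a.left, b.left);
    r.top = max_int(a.top, b.top);
    r.right = min_int(a.right, b.right);
    r.bottom = min_int(a.bottom, b.bottom);
    return rect_is_empty(r) ? empty : r;
}

/* Bounding box of two rectangles, used here as a rough stand-in for
   the union of clipped track regions. */
RectSketch rect_bounding_union(RectSketch a, RectSketch b)
{
    RectSketch r;
    if (rect_is_empty(a)) return b;
    if (rect_is_empty(b)) return a;
    r.left = min_int(a.left, b.left);
    r.top = min_int(a.top, b.top);
    r.right = max_int(a.right, b.right);
    r.bottom = max_int(a.bottom, b.bottom);
    return r;
}
```

Applying rect_intersect once more, with a movie clipping rectangle, would model cutting the movie boundary back to the clipped movie boundary region.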

Matrix

Matrix transformations allow you to slide images around, shrink or rotate them and translate their coordinates from one system to another. A transformation matrix defines how to map points from one coordinate system into another coordinate system.

QuickTime uses nine-element (3-by-3) matrices to translate among coordinate systems. The identity matrix, shown below, performs no transformation:

1  0  0
0  1  0
0  0  1

Each matrix defines how measurements in one system must be calculated to become measurements in another system. The movie matrix for example translates between the movie coordinate system and the display coordinate system. The display coordinate system is the same as that of the graphics world in which the movie is played.

Matrix operations also let you slide, stretch, shrink and rotate images. When matrix operations are combined with clipping operations they give you complete control over how graphics and video images are displayed by QuickTime movies.

Transformation matrices used by the Movie Toolbox contain the following data types:

[0][0] Fixed    [1][0] Fixed    [2][0] Fract
[0][1] Fixed    [1][1] Fixed    [2][1] Fract
[0][2] Fixed    [1][2] Fixed    [2][2] Fract

A common pitfall when first using some QuickTime APIs is to pass integers or floats to Fixed parameters, or use them in MatrixRecords. The Fixed format has 16 bits to the left and 16 bits to the right of the binary point. Fixed format numbers range from -32,768 to approximately 32,768. The value of 1.0 Fixed is represented by the integer value of 65536 [0x00010000].
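The 16.16 layout described above can be demonstrated with a small, self-contained C sketch. Fixed16_16 is a stand-in type and the conversion functions are hypothetical names for this example. The sketch also assumes the caller keeps values within the representable range.

```c
#include <stdint.h>

/* Stand-in for the Fixed type: 16 integer bits, 16 fraction bits. */
typedef int32_t Fixed16_16;

/* Convert a floating-point value to 16.16 fixed point, rounding to the
   nearest representable value. The caller must stay in range. */
Fixed16_16 double_to_fixed(double value)
{
    double scaled = value * 65536.0;
    return (Fixed16_16)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert a whole number to 16.16 fixed point. */
Fixed16_16 int_to_fixed(int value)
{
    return (Fixed16_16)((int64_t)value * 65536);
}

/* Convert a 16.16 fixed-point value back to floating point. */
double fixed_to_double(Fixed16_16 value)
{
    return (double)value / 65536.0;
}
```

Passing the plain integer 1 where a Fixed is expected would be read as 1/65536, which is the pitfall described above; int_to_fixed(1) yields 0x00010000 instead.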

For more information regarding transformations refer to the reference provided at the end of this section.

Layering, Track Enabling and Quality

Layering refers to how tracks can overlap one another, layering several images to produce what the user sees. Think of the tracks in a movie as if they were painted on overlapped layers of film. What is presented to the user is the sum of the contents in all the tracks that are enabled at any given instant. Track layers are numbered from -32,768 through 32,767. Lower-numbered layers are shown on top of higher-numbered layers. This layering order is important mainly when multiple video tracks are combined through graphics modes.
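The layer ordering rule can be sketched as a sort. The TrackSketch structure and function names below are hypothetical, and real QuickTime tracks are private structures accessed through API calls. The sketch assumes a simple painter's approach in which higher-numbered layers are drawn first.

```c
#include <stdlib.h>

/* Hypothetical summary of the track properties relevant to layering. */
typedef struct {
    int track_id;
    short layer;   /* -32,768 through 32,767; lower is closer to the viewer */
    int enabled;   /* nonzero if the track takes part in the presentation */
} TrackSketch;

/* Order tracks so that higher layer numbers (farther back) come first. */
static int compare_back_to_front(const void *a, const void *b)
{
    const TrackSketch *ta = (const TrackSketch *)a;
    const TrackSketch *tb = (const TrackSketch *)b;
    return (tb->layer > ta->layer) - (tb->layer < ta->layer);
}

void sort_back_to_front(TrackSketch *tracks, size_t count)
{
    qsort(tracks, count, sizeof tracks[0], compare_back_to_front);
}

/* Return the ID of the enabled track with the lowest layer number,
   that is, the one shown on top; -1 if no track is enabled. */
int topmost_enabled_track(const TrackSketch *tracks, size_t count)
{
    int best = -1;
    size_t i;
    for (i = 0; i < count; i++) {
        if (!tracks[i].enabled)
            continue;
        if (best < 0 || tracks[i].layer < tracks[best].layer)
            best = (int)i;
    }
    return best < 0 ? -1 : tracks[best].track_id;
}
```

Drawing the sorted array from first to last paints the farthest layer first, so lower-numbered layers end up on top. Disabled tracks are skipped when deciding what the viewer sees.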

QuickTime movies can contain many tracks or a series of grouped alternate tracks which may be enabled or disabled as required to present the user with the desired experience.

By enabling one track out of a group, you can present different levels of quality in a movie or select from multiple content. You could layer an 8-bit sound track with an English dialog track or enable a higher quality 16-bit audio track with a Spanish dialog track.

QuickTime lets you store quality information for media in an 8-bit quality setting that is part of the media's description structure. Bits 6 and 7 of the setting encode a relative quality number ranging from 0 to 3, where higher quality values indicate larger sample sizes. You could use this functionality if, for example, a movie contained two alternate sound tracks, one referencing 8-bit sound media and the other referencing 16-bit sound media. The 8-bit media would be assigned a quality value of mediaQualityNormal (0x0040) and the 16-bit media would be assigned a quality value of mediaQualityBetter (0x0080). QuickTime would play the 16-bit media only if the user's configuration could handle 16-bit sound; otherwise it would play the 8-bit media.

Additionally QuickTime uses the low-order bits 0 through 5 of the quality setting to indicate pixel depths at which the media should be played. The quality bits correspond to depths of 1, 2, 4, 8, 16, and 32 bits of data.
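The bit layout of the quality setting can be decoded as in the following C sketch. The two constants mirror the values quoted above for mediaQualityNormal and mediaQualityBetter but are given hypothetical names. The mapping of bit n to a depth of 2^n bits (bit 0 for 1-bit, bit 5 for 32-bit) is an assumption based on the order in which the depths are listed.

```c
#include <stdint.h>

/* Local copies of the values quoted in the text, under sketch names. */
enum {
    kSketchQualityNormal = 0x0040,
    kSketchQualityBetter = 0x0080
};

/* Extract the relative quality number (0 to 3) from bits 6 and 7. */
unsigned quality_level(uint8_t setting)
{
    return (setting >> 6) & 0x3u;
}

/* Report whether the setting marks the media as playable at the given
   pixel depth. Assumes bit n corresponds to a depth of 2^n bits.
   Returns 0 for depths other than 1, 2, 4, 8, 16 and 32. */
int plays_at_depth(uint8_t setting, unsigned depth_in_bits)
{
    unsigned bit;
    for (bit = 0; bit <= 5; bit++) {
        if ((1u << bit) == depth_in_bits)
            return (setting >> bit) & 1;
    }
    return 0;
}
```

A setting of 0x48, for instance, would decode as quality level 1 with only the 8-bit depth flagged under these assumptions.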

Graphics Modes

When images are overlapped, QuickTime must decide for each pair of pixels (one in the bottom image and one in the top) what pixel value the user should see. QuickTime bases its decision on the graphics mode. QuickTime supports all the Macintosh QuickDraw transfer modes plus four alpha channel modes of its own.

Commonly used modes:

  • copy
  • blend
  • dither
  • transparent
  • alpha
  • alpha blend
  • premultiplied white alpha
  • premultiplied black alpha

Copy Mode - The default mode for still images and video layers. In this mode QuickTime simply displays the top image and ignores the lower layers.

Blend Mode - This mode averages the numeric values of each color for each corresponding pixel in two overlapped images. You can move the average toward one layer or the other by applying weighting factors.

Dither Mode - You can use this mode to approximate more colors than are available on the screen. For example, you could dither a 32-bit image on an 8-bit display. Dithering can produce a grainy image and will increase rendering time.

Transparent Mode - This mode replaces a pixel in the lower layer with the overlapping pixel from the upper layer, but only if the upper-layer pixel's color is not equal to a specified background color. This mode can be used to perform matte or blue screen techniques.

Alpha Modes - An alpha value in the color description of an image defines the image's opacity. Opacity is the degree to which an image obscures images in lower layers. An alpha value of 1 (0xFF for 32-bit color) means the image covers the lower layers completely. A value of 0 means the image is invisible and the lower layer shows completely through. By changing the alpha value you can make the image fade in and fade out over a background without affecting its coloring.

Alpha values may also be used to blend two images. The alpha blending mode will use the alpha value of the first image to determine how much of it appears in the resulting image and will use 1 minus this alpha value to determine how much of the second image appears in the resulting image. The result of blending produces an image with a total opacity of 1, but it constitutes a blend of two semi-opaque images.
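The blending rule just described (alpha times the first image plus one minus alpha times the second) can be sketched for a single 8-bit color component. The function name alpha_blend_component is hypothetical, and the rounding shown is one reasonable choice rather than QuickTime's documented behavior.

```c
#include <stdint.h>

/* Blend one 8-bit color component of the top image over the bottom
   image. alpha is the top image's opacity, from 0 (invisible) to 255
   (fully opaque). The result is rounded to the nearest integer. */
uint8_t alpha_blend_component(uint8_t top, uint8_t bottom, uint8_t alpha)
{
    unsigned weighted = (unsigned)top * alpha +
                        (unsigned)bottom * (255u - alpha);
    return (uint8_t)((weighted + 127u) / 255u);
}
```

Calling this once per color component of each overlapping pixel pair gives the blended pixel. An alpha of 255 returns the top value unchanged and an alpha of 0 returns the bottom value.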

QuickTime can also perform premultiplication by either white or black backgrounds. Premultiplied with white means that the color components of each pixel have already been blended with a white pixel, based on their alpha channel value. In other words, the image has already been combined with a white background. To combine the image with a different background color, QuickTime must first remove the white from each pixel and then blend the image with the actual background pixels. Images are often premultiplied with white because this reduces the appearance of jagged edges around objects.

Premultiplied with black is the same as premultiplied with white, except that the background color the image has been blended with is black instead of white.
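The "remove the white, then blend with the real background" step can be expressed per color component. If a premultiplied value is P = a*C + (1 - a)*W, where a is alpha, C is the original color and W is the premultiplication color, then compositing over a background B gives P + (1 - a)*(B - W). The C sketch below applies that identity with W = 255 for white and W = 0 for black. The function names and the rounding are assumptions for illustration only.

```c
#include <stdint.h>

/* Composite a component premultiplied with white over a background.
   Removes the white contribution and adds the background's in one
   step: result = P - (1 - a) * (255 - B), clamped at 0. */
uint8_t composite_premul_white(uint8_t premultiplied, uint8_t background,
                               uint8_t alpha)
{
    unsigned inverse = 255u - alpha;
    unsigned removed = (inverse * (255u - background) + 127u) / 255u;
    return (uint8_t)(premultiplied > removed ? premultiplied - removed : 0u);
}

/* Composite a component premultiplied with black over a background:
   result = P + (1 - a) * B, clamped at 255. */
uint8_t composite_premul_black(uint8_t premultiplied, uint8_t background,
                               uint8_t alpha)
{
    unsigned inverse = 255u - alpha;
    unsigned added = (inverse * background + 127u) / 255u;
    unsigned result = premultiplied + added;
    return (uint8_t)(result > 255u ? 255u : result);
}
```

A fully opaque pixel (alpha 255) passes through unchanged in both cases. A fully transparent pixel premultiplied with white (value 255) takes on the background value.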

Although you pass these new alpha channel graphics modes to QuickTime in the same way as you would traditional QuickDraw transfer modes, these modes are not supported by QuickDraw and will cause unpredictable results if passed to QuickDraw routines.

Sound Tracks

QuickTime movies may have multiple sound tracks which can be layered and enabled just like video and graphics tracks. QuickTime will mix separate music and voice tracks to accompany video, or you can enable one track out of a set of alternate tracks recorded in different languages. As with graphics, sound media may be transformed by manipulating their track properties.

Volume

Every QuickTime movie stores a preferred volume value for the whole movie. Each audio track also stores a preferred volume value for the track. Track volume allows you to adjust the loudness of one track relative to another while movie volume allows you to specify the loudness of all the tracks mixed together. When a movie is loaded, QuickTime sets its current track and movie volumes to the preferred values of the movie.

Volumes in QuickTime are represented as a 16-bit fixed-point value. This value has a range of -1.0 to +1.0. The high-order 8 bits contain the integer portion of the value and the low-order 8 bits contain the fractional part. Positive values denote volume settings, with 1.0 corresponding to the maximum volume on the user's computer. Negative values are silent but retain their magnitude. By toggling the sign of the volume setting, you can easily perform a mute function: turn off the sound and then turn it back on at its previous level.
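The 8.8 fixed-point volume format and the sign-toggling mute trick can be sketched in C. Volume8_8 and the function names are hypothetical stand-ins; the code only models the number format and does not call any QuickTime routine.

```c
#include <stdint.h>

/* Stand-in for a 16-bit volume: 8 integer bits, 8 fraction bits,
   so 0x0100 represents 1.0 (full volume). */
typedef int16_t Volume8_8;

/* Convert an 8.8 fixed-point volume to floating point. */
double volume_to_double(Volume8_8 volume)
{
    return volume / 256.0;
}

/* Mute or unmute by flipping the sign; the magnitude is preserved so
   the previous level comes back on the next toggle. Assumes the value
   stays within the documented range of -1.0 to +1.0. */
Volume8_8 toggle_mute(Volume8_8 volume)
{
    return (Volume8_8)(-volume);
}

/* Only positive values produce sound. */
int volume_is_audible(Volume8_8 volume)
{
    return volume > 0;
}
```

Toggling 0x0080 (half volume) gives -0x0080, which is silent; toggling again restores 0x0080.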

Balance

Tracks and media also have their own balance setting. Balance controls the relative levels of sound between a computer's left and right speakers. When working with a monaural source, the balance setting controls the loudness of each speaker. With a stereo source, the balance setting governs the relative emphasis of the right and left channels. When a movie is saved, these settings are preserved and stored in the movie file.

The balance values are represented by a 16-bit fixed-point number ranging from -1.0 to +1.0. The high-order 8 bits contain the integer portion of the value and the low-order 8 bits contain the fractional part. Negative values move the balance control to the left while positive values move the balance to the right.

References

Movie Toolbox

Supported QuickTime Media Types

Transformation Matrix

Inside Macintosh: Files

Inside Macintosh: Memory

Technote 1195 : Tagged Handle Data References in QuickTime 4

Introduction to Sound
