
Video Media

Video media is used to store compressed and uncompressed image data in QuickTime movies. It has a media type of 'vide'.

In this section:

Video Sample Description
Video Sample Data


Video Sample Description

The video sample description contains information that defines how to interpret video media data. A video sample description begins with the four fields described in “General Structure of a Sample Description.”

The data format field of a video sample description indicates the type of compression that was used to compress the image data, or the color space representation of uncompressed video data. Table 3-1 shows some of the formats supported. The list is not exhaustive, and is subject to addition.

Table 3-1  Some image compression formats

Compression type    Description

'cvid'              Cinepak
'jpeg'              JPEG
'smc '              Graphics
'rle '              Animation
'rpza'              Apple video
'kpcd'              Kodak Photo CD
'png '              Portable Network Graphics
'mjpa'              Motion-JPEG (format A)
'mjpb'              Motion-JPEG (format B)
'SVQ1'              Sorenson video, version 1
'SVQ3'              Sorenson video 3
'mp4v'              MPEG-4 video
'dvc '              NTSC DV-25 video
'dvcp'              PAL DV-25 video
'gif '              CompuServe Graphics Interchange Format
'h263'              H.263 video
'tiff'              Tagged Image File Format
'raw '              Uncompressed RGB
'2vuY'              Uncompressed Y′CbCr, 8-bit-per-component 4:2:2
'yuv2'              Uncompressed Y′CbCr, 8-bit-per-component 4:2:2
'v308'              Uncompressed Y′CbCr, 8-bit-per-component 4:4:4
'v408'              Uncompressed Y′CbCr, 8-bit-per-component 4:4:4:4
'v216'              Uncompressed Y′CbCr, 10, 12, 14, or 16-bit-per-component 4:2:2
'v410'              Uncompressed Y′CbCr, 10-bit-per-component 4:4:4
'v210'              Uncompressed Y′CbCr, 10-bit-per-component 4:2:2

The video media sample description adds the following fields to the general sample description.

Version

A 16-bit integer indicating the version number of the compressed data. This is set to 0, unless a compressor has changed its data format.

Revision level

A 16-bit integer that must be set to 0.

Vendor

A 32-bit integer that specifies the developer of the compressor that generated the compressed data. Often this field contains 'appl' to indicate Apple Computer, Inc.

Temporal quality

A 32-bit integer containing a value from 0 to 1023 indicating the degree of temporal compression.

Spatial quality

A 32-bit integer containing a value from 0 to 1024 indicating the degree of spatial compression.

Width

A 16-bit integer that specifies the width of the source image in pixels.

Height

A 16-bit integer that specifies the height of the source image in pixels.

Horizontal resolution

A 32-bit fixed-point number containing the horizontal resolution of the image in pixels per inch.

Vertical resolution

A 32-bit fixed-point number containing the vertical resolution of the image in pixels per inch.

Data size

A 32-bit integer that must be set to 0.

Frame count

A 16-bit integer that indicates how many frames of compressed data are stored in each sample. Usually set to 1.

Compressor name

A 32-byte Pascal string containing the name of the compressor that created the image, such as "jpeg".

Depth

A 16-bit integer that indicates the pixel depth of the compressed image. Values of 1, 2, 4, 8, 16, 24, and 32 indicate the depth of color images. The value 32 should be used only if the image contains an alpha channel. Values of 34, 36, and 40 indicate 2-, 4-, and 8-bit grayscale images, respectively.

Color table ID

A 16-bit integer that identifies which color table to use. If this field is set to –1, the default color table should be used for the specified depth. For all depths below 16 bits per pixel, this indicates a standard Macintosh color table for the specified depth. Depths of 16, 24, and 32 have no color table.

If the color table ID is set to 0, a color table is contained within the sample description itself. The color table immediately follows the color table ID field in the sample description. See “Color Table Atoms” for a complete description of a color table.
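
The fixed-size fields listed above can be read with a single unpack call. The following is a minimal sketch, not a definitive parser: the function name, the dictionary keys, and the `offset` parameter (the position where the video-specific fields start, after the general sample description fields) are hypothetical, and it assumes big-endian storage and 16.16 fixed-point resolution values.

```python
import struct

# Video-specific fields in the order listed above; big-endian, 70 bytes in total.
VIDEO_FIELDS = struct.Struct(">HH4sIIHHIIIH32sHh")
KEYS = ("version", "revision", "vendor", "temporal_quality", "spatial_quality",
        "width", "height", "horiz_res", "vert_res", "data_size",
        "frame_count", "compressor_name", "depth", "color_table_id")

def parse_video_fields(buf, offset=0):
    """Return the video sample description fields as a dict (sketch only)."""
    info = dict(zip(KEYS, VIDEO_FIELDS.unpack_from(buf, offset)))
    # Assumption: resolutions are 16.16 fixed-point numbers.
    info["horiz_res"] /= 65536.0
    info["vert_res"] /= 65536.0
    # Pascal string: one length byte followed by up to 31 characters.
    raw = info["compressor_name"]
    info["compressor_name"] = raw[1:1 + min(raw[0], 31)].decode("mac_roman")
    return info
```

A color table ID of 0 means a color table follows these 70 bytes; any extension atoms come after that.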

Video Sample Description Extensions

Video sample descriptions can be extended by appending other atoms. These atoms are placed after the color table, if one is present. These extensions to the sample description may contain display hints for the decompressor or may simply carry additional information associated with the images. Table 3-2 lists the currently defined extensions to video sample descriptions.

Table 3-2  Video sample description extensions

Extension type

Description

'gama'

A 32-bit fixed-point number indicating the gamma level at which the image was captured. The decompressor can use this value to gamma-correct at display time.

'fiel'

Two 8-bit integers that define field handling. This information is used by applications to modify decompressed image data or by decompressor components to determine field display order. This extension is mandatory for all uncompressed Y′CbCr data formats.

The first byte specifies the field count, and may be set to 1 or 2. A value of 1 is used for progressive-scan images; a value of 2 indicates interlaced images.

When the field count is 2, the second byte specifies the field ordering: which field contains the topmost scan-line, which field should be displayed earliest, and which is stored first in each sample. Each sample consists of two distinct compressed images, each coding one field: the field with the topmost scan-line, T, and the other field, B. The following defines the permitted variants:

0 – There is only one field.
1 – T is displayed earliest, T is stored first in the file.
6 – B is displayed earliest, B is stored first in the file.
9 – B is displayed earliest, T is stored first in the file.
14 – T is displayed earliest, B is stored first in the file.

'mjqt'

The default quantization table for a Motion-JPEG data stream.

'mjht'

The default Huffman table for a Motion-JPEG data stream.

'esds'

An MPEG-4 elementary stream descriptor atom. This extension is required for MPEG-4 video. For details, see “MPEG-4 Elementary Stream Descriptor ('esds') Atom.”

'pasp'

Pixel aspect ratio. This extension is mandatory for video formats that use non-square pixels. For details, see “Pixel Aspect Ratio ('pasp').”

'colr'

Color parameters—an image description extension required for all uncompressed Y′CbCr video types. For details, see “Color Parameter Atoms ('colr').”

'clap'

Clean aperture—spatial relationship of Y′CbCr components relative to a canonical image center. This allows accurate alignment for compositing of video images captured using different systems. This is a mandatory extension for all uncompressed Y′CbCr data formats. For details, see “Clean Aperture ('clap').”
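
As an illustration of the 'fiel' entry in the table above, the two data bytes could be interpreted as follows. This is a sketch under stated assumptions: the function name and the returned keys are hypothetical, and `payload` is assumed to hold only the two data bytes that follow the atom's size and type.

```python
# Second-byte values from the 'fiel' description: (displayed earliest, stored first).
FIELD_ORDER = {
    1: ("T", "T"),
    6: ("B", "B"),
    9: ("B", "T"),
    14: ("T", "B"),
}

def decode_fiel(payload):
    """Interpret the field count and field ordering bytes of a 'fiel' extension."""
    if len(payload) < 2:
        raise ValueError("'fiel' payload needs two bytes")
    count, detail = payload[0], payload[1]
    if count == 1:
        return {"interlaced": False}          # progressive scan, one field
    if count != 2 or detail not in FIELD_ORDER:
        raise ValueError("unexpected 'fiel' values: %d, %d" % (count, detail))
    displayed_first, stored_first = FIELD_ORDER[detail]
    return {"interlaced": True,
            "displayed_first": displayed_first,
            "stored_first": stored_first}
```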

Pixel Aspect Ratio ('pasp')

This extension specifies the aspect ratio of the pixels in the video sample, expressed as the ratio of horizontal spacing (hSpacing) to vertical spacing (vSpacing). It is a required extension for MPEG-4 and uncompressed Y′CbCr video formats when non-square pixels are used. It is optional when square pixels are used.

Size

An unsigned 32-bit integer holding the size of the pixel aspect ratio atom.

Type

An unsigned 32-bit field containing the four-character code 'pasp'.

hSpacing

An unsigned 32-bit integer specifying the horizontal spacing of pixels, such as luma sampling instants for Y′CbCr or YUV video.

vSpacing

An unsigned 32-bit integer specifying the vertical spacing of pixels, such as video picture lines.

The units of measure for the hSpacing and vSpacing parameters are not specified, as only the ratio matters. The units of measure for height and width must be the same, however.

Table 3-3  Common pixel aspect ratios

Description                                 hSpacing    vSpacing

4:3 square pixels (composite NTSC or PAL)   1           1
4:3 non-square 525 (NTSC)                   10          11
4:3 non-square 625 (PAL)                    59          54
16:9 analog (composite NTSC or PAL)         4           3
16:9 digital 525 (NTSC)                     40          33
16:9 digital 625 (PAL)                      118         81
1920x1035 HDTV (per SMPTE 260M-1992)        113         118
1920x1035 HDTV (per SMPTE RP 187-1995)      1018        1062
1920x1080 HDTV or 1280x720 HDTV             1           1
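
Because only the ratio of hSpacing to vSpacing matters, an exact rational type is a convenient way to hold it. The sketch below reads a 'pasp' atom laid out as described above and scales a stored width for display on square pixels. The function names are hypothetical, big-endian storage is assumed, and scaling the width while leaving the height unchanged is one possible convention, not something this document prescribes.

```python
import struct
from fractions import Fraction

def parse_pasp(atom):
    """Return hSpacing / vSpacing from a 16-byte 'pasp' atom (sketch only)."""
    size, kind, h_spacing, v_spacing = struct.unpack_from(">I4sII", atom)
    if kind != b"pasp" or size < 16:
        raise ValueError("not a 'pasp' atom")
    if v_spacing == 0:
        raise ValueError("vSpacing must not be zero")
    return Fraction(h_spacing, v_spacing)

def display_width(stored_width, pixel_aspect):
    """Width in square pixels that preserves the picture's shape."""
    return stored_width * pixel_aspect
```

For example, 704 stored pixels at the 10:11 ratio listed for 4:3 non-square 525 (NTSC) correspond to 640 square pixels.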

MPEG-4 Elementary Stream Descriptor Atom ('esds')

This atom contains an MPEG-4 elementary stream descriptor. It is a required extension to the video sample description for MPEG-4 video, and it appears in video sample descriptions only when the codec type is 'mp4v'.

Note: The elementary stream descriptor which this atom contains is defined in the MPEG-4 specification ISO/IEC FDIS 14496-1.

Size

An unsigned 32-bit integer holding the size of the elementary stream descriptor atom.

Type

An unsigned 32-bit field containing the four-character code 'esds'.

Version

An unsigned 8-bit integer set to zero.

Flags

A 24-bit field reserved for flags, currently set to zero.

Elementary Stream Descriptor

An elementary stream descriptor for MPEG-4 video, as defined in the MPEG-4 specification ISO/IEC 14496-1 and subject to the restrictions for storage in MPEG-4 files specified in ISO/IEC 14496-14.
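
The fields above suggest a simple way to separate the atom header from the descriptor it carries. This is a minimal sketch with a hypothetical function name, assuming big-endian storage; it does not interpret the descriptor itself, whose layout is defined by ISO/IEC 14496-1.

```python
import struct

def split_esds(atom):
    """Return (version, flags, descriptor_bytes) from an 'esds' atom."""
    size, kind, version_and_flags = struct.unpack_from(">I4sI", atom)
    if kind != b"esds" or size < 12:
        raise ValueError("not an 'esds' atom")
    version = version_and_flags >> 24        # top 8 bits
    flags = version_and_flags & 0xFFFFFF     # low 24 bits
    return version, flags, atom[12:size]
```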

Color Parameter Atoms ('colr')

This atom is a required extension for uncompressed Y′CbCr data formats. The 'colr' extension is used to map the numerical values of pixels in the file to a common representation of color in which images can be correctly compared, combined, and displayed. The common representation is the CIE XYZ tristimulus values (defined in Publication CIE No. 15.2).

Use of a common representation also allows you to correctly map between Y′CbCr and RGB color spaces and to correctly compensate for gamma on different systems.

The 'colr' extension supersedes the previously defined 'gama' Image Description extension. Writers of QuickTime files should never write both into an Image Description, and readers of QuickTime files should ignore 'gama' if 'colr' is present.

The 'colr' extension is designed to work for multiple imaging applications such as video and print. Each application, driven by its own set of historical and economic realities, has its own set of parameters needed to map from pixel values to CIE XYZ.

The CIE XYZ representation is mapped to various stored Y′CbCr formats using a common set of transfer functions and matrixes. The transfer function coefficients and matrix values are stored as indexes into a table of canonical references. This provides support for multiple video systems while limiting the scope of possible values to a set of recognized standards.

The 'colr' atom contains four fields: a color parameter type and three indexes. The indexes are to a table of primaries, a table of transfer function coefficients, and a table of matrixes.


Figure 3-1  Color atom


The table of matrixes specifies the matrix used during the translation, as shown in Figure 3-2.

Color parameter type

A 32-bit field containing a four-character code for the color parameter type. The currently defined types are 'nclc' for video, and 'prof' for print. The color parameter type distinguishes between print and video mappings.

If the color parameter type is 'prof', then this field is followed by an ICC profile. This is the color model used by Apple’s ColorSync. The contents of this type are not defined in this document. Contact Apple Computer for more information on the 'prof' type 'colr' extension.

If the color parameter type is 'nclc' then this atom contains the following fields:

Primaries index

A 16-bit unsigned integer containing an index into a table specifying the CIE 1931 xy chromaticity coordinates of the white point and the red, green, and blue primaries. The table of primaries specifies the white point and the red, green, and blue primary color points for a video system.

Transfer function index

A 16-bit unsigned integer containing an index into a table specifying the nonlinear transfer function coefficients used to translate between RGB color space values and Y′CbCr values. The table of transfer function coefficients specifies the nonlinear function coefficients used to translate between the stored Y′CbCr values and a video capture or display system, as shown in Figure 3-2.

Matrix index

A 16-bit unsigned integer containing an index into a table specifying the transformation matrix coefficients used to translate between RGB color space values and Y′CbCr values. The table of matrixes specifies the matrix used during the translation, as shown in Figure 3-2.
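
A reader might dispatch on the color parameter type as sketched below. The function name and returned keys are hypothetical, big-endian storage is assumed, and for 'prof' the sketch simply hands back the remaining bytes without interpreting them, since the profile contents are not defined in this document.

```python
import struct

def parse_colr(atom):
    """Read a 'colr' atom: size, type, color parameter type, then type-specific data."""
    size, kind, param_type = struct.unpack_from(">I4s4s", atom)
    if kind != b"colr":
        raise ValueError("not a 'colr' atom")
    if param_type == b"nclc":
        primaries, transfer, matrix = struct.unpack_from(">HHH", atom, 12)
        return {"type": "nclc", "primaries": primaries,
                "transfer_function": transfer, "matrix": matrix}
    if param_type == b"prof":
        return {"type": "prof", "profile_data": atom[12:size]}
    raise ValueError("unknown color parameter type: %r" % (param_type,))
```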

The transfer function and matrix are used as shown in the following diagram.


Figure 3-2  Transfer between RGB and Y′CbCr color spaces


The Y′CbCr values stored in a file are normalized to a range of [0,1] for Y′ and [-0.5, +0.5] for Cb and Cr when performing these operations. The normalized values are then scaled to the proper bit depth for a particular Y′CbCr format before storage in the file.


Figure 3-3  The normalized values are shown using the symbol E with a subscript for Y′, Cb, or Cr:


Note: The symbols used for these values are not intended to correspond to the use of these same symbols in other standards. In particular, "E" should not be interpreted as voltage.

These normalized values can be mapped onto the stored integer values of a particular compression type's Y′, Cb, and Cr components using two different schemes, which we will call Scheme A and Scheme B.


Warning:  Other, slightly different encoding/mapping schemes exist in the video industry, and data encoded using these schemes must be converted to one of the QuickTime schemes defined here.

Scheme A uses "Wide-Range" mapping (full scale) with unsigned Y′ and twos-complement Cb and Cr values.


Figure 3-4  Equations for stored Y′CbCr values of bit-depth of n in scheme A


This maps normalized values to stored values so that, for example, 8-bit unsigned values for Y′ go from 0 to 255 as the normalized value goes from 0 to 1, and 8-bit signed values for Cb and Cr go from -127 to +127 as the normalized values go from -0.5 to +0.5.


Warning:  In specifications such as ITU-R BT.601-4, JFIF 1.02, and SPIFF (Rec. ITU-T T.84), the symbols Cb and Cr are used to describe offset binary integers, not twos-complement signed integers shown here.
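
The equations of Figure 3-4 are not reproduced in this text. The sketch below uses one rounding formula that is consistent with the 8-bit ranges quoted for scheme A (0 to 255 for Y′, -127 to +127 for Cb and Cr); the formula and the function names are assumptions, not a transcription of the figure.

```python
import math

def scheme_a_luma(e_y, n=8):
    """Unsigned Y' of bit depth n for a normalized value e_y in [0, 1]."""
    return math.floor(0.5 + (2 ** n - 1) * e_y)

def scheme_a_chroma(e_c, n=8):
    """Twos-complement Cb or Cr of bit depth n for a normalized value in [-0.5, +0.5]."""
    return math.floor(0.5 + (2 ** n - 2) * e_c)
```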

Scheme B uses "Video-Range" mapping with unsigned Y′ and offset binary Cb and Cr values.

Note: Scheme B comes from digital video industry specifications such as Rec. ITU-R BT.601-4. All standard digital video tape formats (e.g., SMPTE D-1, SMPTE D-5) and all standard digital video links (e.g., SMPTE 259M-1997 serial digital video) use this scheme. Professional video storage and processing equipment from vendors such as Abekas, Accom, and SGI also use this scheme. MPEG-2, DVC, and many other codecs specify source Y′CbCr pixels using this scheme.


Figure 3-5  Equations for stored Y′CbCr values of bit-depth n in scheme B


This maps the normalized values to stored values so that, for example, 8-bit unsigned values for Y′ go from 16 to 235 as the normalized value goes from 0 to 1, and 8-bit unsigned values for Cb and Cr go from 16 to 240 as the normalized values go from -0.5 to +0.5.

For 10-bit samples, Y′ has a range of 64 to 940 as the normalized value goes from 0 to 1, and Cb and Cr have a range of 64 to 960 as the normalized values go from -0.5 to +0.5.

Y′ is an unsigned integer. Cb and Cr are offset binary integers.

Certain Y′, Cb, and Cr component values v are reserved as synchronization signals and must not appear in a buffer. For n = 8 bits, these are values 0 and 255. For n = 10 bits, these are values 0, 1, 2, 3, 1020, 1021, 1022, and 1023. The writer of a QuickTime image is responsible for omitting these values. The reader of a QuickTime image may assume that they are not present.

The remaining component values that fall outside the mapping for scheme B (1-15 and 241-254 for n = 8 bits and 4–63 and 961–1019 for n = 10 bits) accommodate occasional filter undershoot and overshoot in image processing. In some applications, these values are used to carry other information (e.g., transparency). The writer of a QuickTime image may use these values and the reader of a QuickTime image must expect these values.
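
The equations of Figure 3-5 are not reproduced in this text. As a sketch, the following rounding formulas reproduce the 8-bit ranges quoted for scheme B (16 to 235 for Y′, 16 to 240 for Cb and Cr) and scale by a power of two for deeper samples; the formulas and function names are assumptions. The reserved synchronization values are taken directly from the text, and only the two bit depths it mentions are covered.

```python
import math

# Reserved synchronization values quoted in the text, keyed by bit depth n.
RESERVED = {8: {0, 255}, 10: {0, 1, 2, 3, 1020, 1021, 1022, 1023}}

def scheme_b_luma(e_y, n=8):
    """Unsigned Y' of bit depth n for a normalized value e_y in [0, 1]."""
    return math.floor(0.5 + 2 ** (n - 8) * (219 * e_y + 16))

def scheme_b_chroma(e_c, n=8):
    """Offset binary Cb or Cr of bit depth n for a normalized value in [-0.5, +0.5]."""
    return math.floor(0.5 + 2 ** (n - 8) * (224 * e_c + 128))

def is_reserved(value, n=8):
    """True if a writer must not store this component value."""
    return value in RESERVED[n]
```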

The following tables show the primary values, transfer functions, and matrixes indicated by the index entries in the 'colr' atom.

The R, G, and B values below are tristimulus values (such as candelas/meter^2), whose relationship to CIE XYZ values can be derived from the primaries and white point specified in the table, using the method described in SMPTE RP 177-1993. In this instance, the R, G, and B values are normalized to the range [0,1].

Table 3-4  Table of primaries, index and values

Index

Values

0

Reserved

1

Recommendation ITU-R BT.709-2, SMPTE 274M-1995, and SMPTE 296M-1997 white x = 0.3127 y = 0.3290 (CIE Illuminant D65) red x = 0.640 y = 0.330 green x = 0.300 y = 0.600 blue x = 0.150 y = 0.060

2

Primary values are unknown

3–4

Reserved

5

ITU-R BT.470-4 System B and G, EBU Tech 3213 (1981) white x = 0.3127 y = 0.3290 (CIE Illuminant D65) red x = 0.64 y = 0.33 green x = 0.29 y = 0.60 blue x = 0.15 y = 0.06

6

SMPTE C primaries: SMPTE RP 145-1993, SMPTE 170M-1994, 293M-1996, 240M-1995, and SMPTE 274M-1995 white x = 0.3127 y = 0.3290 (CIE Illuminant D65) red x = 0.630 y = 0.340 green x = 0.310 y = 0.595 blue x = 0.155 y = 0.070

7–65535

Reserved

The transfer functions below are used as shown in Figure 3-2.

Table 3-5  Table of transfer function index and values

Index

Video Standards

0

Reserved

1

Recommendation ITU-R BT.709-2, SMPTE 274M-1995, 296M-1997, 293M-1996, 170M-1994. See below for transfer function equations.

2

Coefficient values are unknown

3–6

Reserved

7

Recommendation SMPTE 240M-1995 and 274M-1995. See below for transfer function equations.

8–65535

Reserved

The MPEG-2 sequence display extension transfer_characteristics defines a code 6 whose transfer function is identical to that in code 1. QuickTime writers should map 6 to 1 when converting from transfer_characteristics to transferFunction.

Recommendation ITU-R BT.470-4 specified an "assumed gamma value of the receiver for which the primary signals are pre-corrected" as 2.2 for NTSC and 2.8 for PAL systems. This information is both incomplete and obsolete. Modern 525- and 625-line digital and NTSC/PAL systems use the transfer function with code 1 below.


Figure 3-6  Equations for index code 1



Figure 3-7  Equations for index code 7
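
The equations of Figures 3-6 and 3-7 are not reproduced in this text. As a sketch, the functions below implement the opto-electronic transfer characteristics commonly published for ITU-R BT.709 (index code 1) and SMPTE 240M (index code 7). That the figures show exactly these forms is an assumption, and the function names are hypothetical. The input `w` is a linear value in [0, 1]; the result is the nonlinear (primed) value.

```python
def transfer_code_1(w):
    """BT.709-style transfer function: linear segment near black, power law above."""
    if w < 0.018:
        return 4.5 * w
    return 1.099 * w ** 0.45 - 0.099

def transfer_code_7(w):
    """SMPTE 240M-style transfer function with its own breakpoint and constants."""
    if w < 0.0228:
        return 4.0 * w
    return 1.1115 * w ** 0.45 - 0.1115
```

Both functions map 0 to 0 and 1 to 1, and each is approximately continuous at its breakpoint.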


The matrix values are shown in Table 3-6 and in Figure 3-8, Figure 3-9, and Figure 3-10. These figures show a formula for obtaining the normalized value of Y′ in the range [0,1]. You can derive the formula for normalized values of Cb and Cr as follows:

If the equation for normalized Y′ has the form:

    E_{Y'} = K_G E'_G + K_B E'_B + K_R E'_R

Then the formulas for normalized Cb and Cr are:

    E_{Cb} = \frac{0.5}{1 - K_B} (E'_B - E_{Y'})

    E_{Cr} = \frac{0.5}{1 - K_R} (E'_R - E_{Y'})

Table 3-6  Table of matrix index and values

Index

Video Standard

0

Reserved

1

Recommendation ITU-R BT.709-2 (1125/60/2:1 only), SMPTE 274M-1995, 296M-1997. See below for matrix values.

2

Coefficient values are unknown

3–5

Reserved

6

Recommendation ITU-R BT.601-4 and BT.470-4 System B and G, SMPTE 170M-1994, 293M-1996. See below for matrix values.

7

SMPTE 240M-1995, 274M-1995. See below for matrix values.

8–65535

Reserved


Figure 3-8  Matrix values for index code 1



Figure 3-9  Matrix values for index code 6



Figure 3-10  Matrix values for index code 7
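
The matrix figures are not reproduced in this text. The sketch below uses the luma coefficients commonly published for the standards named in Table 3-6 (BT.709 for index 1, BT.601 for index 6, SMPTE 240M for index 7). That these match the figures is an assumption, as are the names used here. Cb and Cr are derived by scaling the blue and red color differences so that each spans [-0.5, +0.5].

```python
# Assumed luma coefficients (K_R, K_G, K_B) per matrix index; each row sums to 1.
LUMA_COEFFS = {
    1: (0.2126, 0.7152, 0.0722),
    6: (0.299, 0.587, 0.114),
    7: (0.212, 0.701, 0.087),
}

def rgb_to_normalized_ycbcr(r, g, b, matrix_index):
    """Map nonlinear R'G'B' in [0, 1] to normalized (Y', Cb, Cr)."""
    k_r, k_g, k_b = LUMA_COEFFS[matrix_index]
    y = k_r * r + k_g * g + k_b * b
    cb = 0.5 * (b - y) / (1.0 - k_b)   # reaches +0.5 for pure blue
    cr = 0.5 * (r - y) / (1.0 - k_r)   # reaches +0.5 for pure red
    return y, cb, cr
```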


Clean Aperture ('clap')

The clean aperture extension defines the relationship between the pixels in a stored image and a canonical rectangular region of a video system from which it was captured or to which it will be displayed. This can be used to correlate pixel locations in two or more images—possibly recorded using different systems—for accurate compositing. This is necessary because different video digitizer devices can digitize different regions of the incoming video signal, causing pixel misalignment between images. In particular, a stored image may contain “edge” data outside the canonical display area for a given system.

The clean aperture is either coincident with the stored image or a subset of the stored image; if it is a subset, it may be centered on the stored image, or it may be offset positively or negatively from the stored image center.

The clean aperture extension contains a width in pixels, a height in picture lines, and a horizontal and vertical offset between the stored image center and a canonical image center for the given video system. The width is typically the width of the canonical clean aperture for a video system divided by the pixel aspect ratio of the stored data. The offsets also take into account any “overscan” in the stored image. The height and width must be positive values, but the offsets may be positive, negative, or zero.

These values are given as ratios of two 32-bit numbers, so that applications can calculate precise values with minimum roundoff error. For whole values, the value should be stored in the numerator field while the denominator field is set to 1.

Size

A 32-bit unsigned integer containing the size of the 'clap' atom.

Type

A 32-bit unsigned integer containing the four-character code 'clap'.

apertureWidth_N (numerator)

A 32-bit signed integer containing either the width of the clean aperture in pixels or the numerator portion of a fractional width.

apertureWidth_D (denominator)

A 32-bit signed integer containing either the denominator portion of a fractional width or the number 1.

apertureHeight_N (numerator)

A 32-bit signed integer containing either the height of the clean aperture in picture lines or the numerator portion of a fractional height.

apertureHeight_D (denominator)

A 32-bit signed integer containing either the denominator portion of a fractional height or the number 1.

horizOff_N (numerator)

A 32-bit signed integer containing either the horizontal offset of the clean aperture center minus (width–1)/2 or the numerator portion of a fractional offset. This value is typically zero.

horizOff_D (denominator)

A 32-bit signed integer containing either the denominator portion of the horizontal offset or the number 1.

vertOff_N (numerator)

A 32-bit signed integer containing either the vertical offset of the clean aperture center minus (height–1)/2 or the numerator portion of a fractional offset. This value is typically zero.

vertOff_D (denominator)

A 32-bit signed integer containing either the denominator portion of the vertical offset or the number 1.
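
Since every 'clap' value is a numerator and denominator pair, exact rational arithmetic avoids the roundoff the text warns about. This is a minimal sketch with hypothetical names, assuming big-endian storage and the field order listed above. A zero denominator is not handled specially and raises `ZeroDivisionError`.

```python
import struct
from fractions import Fraction

def parse_clap(atom):
    """Return the clean aperture width, height, and offsets as exact fractions."""
    size, kind, *values = struct.unpack_from(">I4s8i", atom)
    if kind != b"clap" or size < 40:
        raise ValueError("not a 'clap' atom")
    width, height, horiz_off, vert_off = (
        Fraction(values[i], values[i + 1]) for i in (0, 2, 4, 6)
    )
    if width <= 0 or height <= 0:
        raise ValueError("aperture width and height must be positive")
    return {"width": width, "height": height,
            "horiz_off": horiz_off, "vert_off": vert_off}
```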

Video Sample Data

The format of the data stored in video samples is completely dependent on the type of the compression used, as indicated in the video sample description. The following sections discuss some of the video encoding schemes supported by QuickTime.

Uncompressed RGB

Uncompressed RGB data is stored in a variety of different formats. The format used depends on the depth field of the video sample description. For all depths, the image data is padded on each scan line to ensure that each scan line begins on an even byte boundary.

RGB data can be stored in composite or planar format. Composite format stores the RGB data for each pixel contiguously, while planar format stores the R, G, and B data separately, so the RGB information for a given pixel is found using the same offset into multiple tables. For example, the data for two pixels could be represented in composite format as RGB-RGB or in planar format as RR-GG-BB.
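
Two small helpers illustrate the points above: scan-line padding to an even number of bytes, and the difference between composite and planar ordering. Both are sketches with hypothetical names. The first assumes `depth` is a color depth in bits per pixel, and the second assumes 8-bit components with no padding.

```python
def padded_row_bytes(width, depth):
    """Bytes per scan line, padded so the next line starts on an even byte boundary."""
    raw = (width * depth + 7) // 8     # bytes needed for the pixels alone
    return raw + (raw % 2)             # add one pad byte if the count is odd

def composite_to_planar(pixels):
    """Split RGBRGB... bytes into separate R, G, and B planes."""
    return pixels[0::3], pixels[1::3], pixels[2::3]
```

For example, five 24-bit pixels occupy 15 bytes and are padded to 16.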

Uncompressed Y′CbCr (including yuv2)

The Y′CbCr color space is widely used for digital video. In this data format, luminance is stored as a single value (Y), and chrominance information is stored as two color-difference components (Cb and Cr). Cb is the difference between the blue component and a reference value; Cr is the difference between the red component and a reference value.

This is commonly referred to as “YUV” format, with “U” standing in for Cb and “V” standing in for Cr. This usage is not strictly correct, as YUV, YIQ, and Y′CbCr are distinct color models for PAL, NTSC, and digital video, respectively, but most Y′CbCr data formats and codecs are described or even named as some variant of “YUV.”

The values of Y, Cb, and Cr can be represented using a variety of bit depths, trading off accuracy for file size. Similarly, the chrominance values can be sub-sampled, recording only one pixel’s color value out of two, for example, or averaging the color value of adjacent pixels. This sub-sampling is a form of compression, but if no additional lossy compression is performed on the sampled video, it is still referred to as “uncompressed” Y′CbCr video. In addition, a fourth component can be added to Y′CbCr video to record an alpha channel.

The number of components (Y′CbCr with or without alpha) and any sub-sampling are denoted using ratios of three or four numbers, such as 4:2:2 to indicate 4 samples of Y for every 2 samples each of Cb and Cr (chroma sub-sampling), or 4:4:4 for equal storage of Y, Cb, and Cr (no sub-sampling), or 4:4:4:4 for Y′CbCr plus alpha with no sub-sampling. The ratios describe relative sample counts; they do not typically denote actual bit depths.

Uncompressed Y′CbCr video data is typically stored as follows:

The yuv2 stream, for example, is encoded in a series of 4-byte packets. Each packet represents two adjacent pixels on the same scan line. The bytes within each packet are ordered as follows:

    y0 u y1 v

y0 is the luminance value for the left pixel; y1 the luminance for the right pixel. u and v are chromatic values that are shared by both pixels.

Accurate conversion between RGB and Y′CbCr color spaces requires a computation for each component of each pixel. An example conversion from yuv2 into RGB is represented by the following equations:

r = 1.402 * v + y + .5

g = y - .7143 * v - .3437 * u + .5

b = 1.77 * u + y + .5

The r, g, and b values range from 0 to 255.

The coefficients in these equations are derived from matrix operations and depend on the reference values used for the primary colors and for white. QuickTime uses canonical values for these reference coefficients based on published standards. The sample description extension for Y′CbCr formats includes a 'colr' atom, which contains indexes into a table of canonical references. This provides support for multiple video standards without opening the door to data entry errors for stored coefficient values. Refer to the published standards for the formulas and methods used to derive conversion coefficients from the table entries.
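
The packet layout and example equations above can be combined into a small decoder. This is a sketch only: the function names are hypothetical, u and v are assumed to be signed 8-bit values centered on zero (which is what the equations imply), and results are clamped to the 0 to 255 range stated above.

```python
import struct

def _clamp(x):
    """Truncate to an integer and limit to 0..255."""
    return max(0, min(255, int(x)))

def yuv2_packet_to_rgb(packet):
    """Convert one 4-byte packet (y0 u y1 v) into two (r, g, b) pixels."""
    y0, u, y1, v = struct.unpack(">BbBb", packet)   # assumed: signed u and v

    def convert(y):
        r = 1.402 * v + y + 0.5
        g = y - 0.7143 * v - 0.3437 * u + 0.5
        b = 1.77 * u + y + 0.5
        return _clamp(r), _clamp(g), _clamp(b)

    return convert(y0), convert(y1)   # left pixel, right pixel
```

Both pixels share the same u and v, so a packet with zero chroma decodes to two gray pixels.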

JPEG

QuickTime stores JPEG images according to the rules described in the ISO JPEG specification, document number DIS 10918-1.

MPEG-4 Video

MPEG-4 video uses the 'mp4v' data format. The sample description requires the elementary stream descriptor ('esds') extension to the standard video sample description. If non-square pixels are used, the pixel aspect ratio ('pasp') extension is also required. For details on these extensions, see “Pixel Aspect Ratio ('pasp')” and “MPEG-4 Elementary Stream Descriptor Atom ('esds').”

MPEG-4 video conforms to ISO/IEC documents 14496-1/2000(E) and 14496-2:1999/Amd.1:2000(E).

Motion-JPEG

Motion-JPEG (M-JPEG) is a variant of the ISO JPEG specification for use with digital video streams. Instead of compressing an entire image into a single bitstream, Motion-JPEG compresses each video field separately, returning the resulting JPEG bitstreams consecutively in a single frame.

There are two flavors of Motion-JPEG currently in use. These two formats differ based on their use of markers. Motion-JPEG format A supports markers; Motion-JPEG format B does not. The following paragraphs describe how QuickTime stores Motion-JPEG sample data. Figure 3-11 shows an example of Motion-JPEG A dual-field sample data. Figure 3-12 shows an example of Motion-JPEG B dual-field sample data.


Figure 3-11  Motion-JPEG A dual-field sample data


Each field of Motion-JPEG format A fully complies with the ISO JPEG specification, and therefore supports application markers. QuickTime uses the APP1 marker to store control information, as follows (all of the fields are 32-bit integers):

Reserved

Unpredictable; should be set to 0.

Tag

Identifies the data type; this field must be set to 'mjpg'.

Field size

The actual size of the image data for this field, in bytes.

Padded field size

Contains the size of the image data, including pad bytes. Some video hardware may append pad bytes to the image data; this field, along with the field size field, allows you to compute how many pad bytes were added.

Offset to next field

The offset, in bytes, from the start of the field data to the start of the next field in the bitstream. This field should be set to 0 in the last field’s marker data.

Quantization table offset

The offset, in bytes, from the start of the field data to the quantization table marker. If this field is set to 0, check the image description for a default quantization table.

Huffman table offset

The offset, in bytes, from the start of the field data to the Huffman table marker. If this field is set to 0, check the image description for a default Huffman table.

Start of frame offset

The offset, in bytes, from the start of the field data to the start of frame marker. This field should never be set to 0.

Start of scan offset

The offset, in bytes, from the start of the field data to the start of the scan marker. This field should never be set to 0.

Start of data offset

The offset, in bytes, from the start of the field data to the start of the data stream. Typically, this immediately follows the start of scan data.

Note: The last two fields have been added since the original Motion-JPEG specification, and so they may be missing from some Motion-JPEG A files. You should check the length of the APP1 marker before using the start of scan offset and start of data offset fields.
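
The length check recommended in the note can be folded into the parsing step. In this sketch, `payload` is assumed to start at the Reserved field, that is, after the APP1 marker code and its length bytes. The function and key names are hypothetical, and big-endian storage is assumed.

```python
import struct

FIELD_NAMES = (
    "reserved", "tag", "field_size", "padded_field_size",
    "offset_to_next_field", "quant_table_offset", "huffman_table_offset",
    "start_of_frame_offset", "start_of_scan_offset", "start_of_data_offset",
)

def parse_mjpeg_a_control(payload):
    """Read the Motion-JPEG A control fields; the last two may be absent."""
    count = min(len(payload) // 4, len(FIELD_NAMES))
    if count < 8:
        raise ValueError("control data too short")
    values = struct.unpack_from(">I4s%dI" % (count - 2), payload)
    info = dict(zip(FIELD_NAMES, values))   # omits fields that are not present
    if info["tag"] != b"mjpg":
        raise ValueError("tag is not 'mjpg'")
    info["pad_bytes"] = info["padded_field_size"] - info["field_size"]
    return info
```

Callers can then test for `"start_of_scan_offset" in info` before relying on the two newer fields.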

Motion-JPEG format B does not support markers. In place of the marker, therefore, QuickTime inserts a header at the beginning of the bitstream. Again, all of the fields are 32-bit integers.


Figure 3-12  Motion-JPEG B dual-field sample data


Reserved

Unpredictable; should be set to 0.

Tag

The data type; this field must be set to 'mjpg'.

Field size

The actual size of the image data for this field, in bytes.

Padded field size

The size of the image data, including pad bytes. Some video hardware may append pad bytes to the image data; this field, along with the field size field, allows you to compute how many pad bytes were added.

Offset to next field

The offset, in bytes, from the start of the field data to the start of the next field in the bitstream. This field should be set to 0 in the second field’s header data.

Quantization table offset

The offset, in bytes, from the start of the field data to the quantization table. If this field is set to 0, check the image description for a default quantization table.

Huffman table offset

The offset, in bytes, from the start of the field data to the Huffman table. If this field is set to 0, check the image description for a default Huffman table.

Start of frame offset

The offset, in bytes, from the start of the field data to the field’s image data. This field should never be set to 0.

Start of scan offset

The offset, in bytes, from the start of the field data to the start of scan data.

Start of data offset

The offset, in bytes, from the start of the field data to the start of the data stream. Typically, this immediately follows the start of scan data.

Note: The last two fields were “reserved, must be set to zero” in the original Motion-JPEG specification.

The Motion-JPEG format B header must be a multiple of 16 in size. When you add pad bytes to the header, set them to 0.

Because Motion-JPEG format B does not support markers, the JPEG bitstream does not have null bytes (0x00) inserted after data bytes that are set to 0xFF.
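
The header padding rule can be sketched as follows. The function name and parameter names are hypothetical, and big-endian storage is assumed. The ten 32-bit fields occupy 40 bytes, so eight zero bytes are appended to reach the next multiple of 16.

```python
import struct

def build_mjpeg_b_header(field_size, padded_field_size, offset_to_next_field,
                         quant_offset, huffman_offset, frame_offset,
                         scan_offset, data_offset):
    """Pack the Motion-JPEG B header fields and zero-pad to a multiple of 16 bytes."""
    header = struct.pack(">I4s8I", 0, b"mjpg", field_size, padded_field_size,
                         offset_to_next_field, quant_offset, huffman_offset,
                         frame_offset, scan_offset, data_offset)
    padding = -len(header) % 16        # bytes needed to reach the next multiple of 16
    return header + b"\x00" * padding
```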





Last updated: 2007-09-04





Copyright © 2007 Apple Inc.
All rights reserved.