QuickTime VR Movie Structure

This chapter describes the format of the tracks that make up a QuickTime VR movie file. The information in this chapter, combined with the information in Chapter 7, QTVR Atom Containers, and the overview from Chapter 2, QuickTime VR Panoramas and Object Movies, will enable you to add to your application the ability to create QuickTime VR movies.

The term QuickTime VR file format is used in various sections throughout this chapter, even though it is a bit of a misnomer because the chapter does not describe the details of exactly where data is stored in a file nor does it describe the format of the standard QuickTime movie atoms. That information is available in the QuickTime API Reference, and in the QuickTime File Format (see bibliography). However, you don’t need to know that level of detail in order to create QuickTime VR movies, since QuickTime provides several routines for creating movies.

The chapter describes the file format supported by the QuickTime VR Manager. It consists of one major section, Elements of a QuickTime VR Movie, which covers the topics you need to understand the basic structure of a QuickTime VR movie.

Elements of a QuickTime VR Movie

A QuickTime VR movie is stored on disk in a format known as the QuickTime VR file format. Beginning in QuickTime VR 2.0, a QuickTime VR movie could contain one or more nodes. Each node is either a panorama or an object. In addition, a QuickTime VR movie could contain various types of hot spots, including links between any two types of nodes.

All QuickTime VR movies contain a single QTVR track, a special type of QuickTime track that maintains a list of the nodes in the movie. Each individual sample in a QTVR track contains general information and hot spot information for a particular node.

If a QuickTime VR movie contains any panoramic nodes, that movie also contains a single panorama track, and if it contains any object nodes, it also contains a single object track. The panorama and object tracks contain information specific to the panoramas or objects in the movie. The actual image data for both panoramas and objects is usually stored in standard QuickTime video tracks, hereafter referred to as image tracks. (An image track can also be any type of track that is capable of displaying an image, such as a QuickTime 3D track.) The individual frames in the image track for a panorama make up the diced frames of the original single panoramic image. The frames for the image track of an object represent the many different views of the object. Hot spot image data is stored in parallel video tracks for both panoramas and objects.

Single-Node Panoramic Movies

Figure 5-1 illustrates the basic structure of a single-node panoramic movie. As you can see, every panoramic movie contains at least three tracks: a QTVR track, a panorama track, and a panorama image track.

Figure 5-1  The structure of a single-node panoramic movie file

For a single-node panoramic movie, the QTVR track contains just one sample. There is a corresponding sample in the panorama track, whose time and duration are the same as the time and duration of the sample in the QTVR track. The time base of the movie is used to locate the proper video samples in the panorama image track. For a panoramic movie, the video sample for the first diced frame of a node’s panoramic image is located at the same time as the corresponding QTVR and panorama track samples. The total duration of all the video samples is the same as the duration of the corresponding QTVR sample and the panorama sample.

A panoramic movie can contain an optional hot spot image track and any number of standard QuickTime tracks. A panoramic movie can also contain panoramic image tracks with a lower resolution. The video samples in these low-resolution image tracks must be located at the same time and must have the same total duration as the QTVR track. Likewise, the video samples for a hot spot image track, if one exists, must be located at the same time and must have the same total duration as the QTVR track.
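The timing rule above can be checked mechanically. The following is a minimal sketch, assuming a simplified, hypothetical SampleSpan record (start time and duration in the movie's time scale) in place of the real QuickTime media structures; it is not part of the QuickTime API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one media sample: its start time and
 * duration in the movie's time scale. Not a QuickTime API type. */
typedef struct {
    int64_t start;
    int64_t duration;
} SampleSpan;

/* Returns true if the run of image (or hot spot) samples for one node starts
 * at the same time as the node's QTVR sample and has the same total duration. */
static bool image_samples_match_node(const SampleSpan *node,
                                     const SampleSpan *frames, size_t count)
{
    assert(node != NULL && frames != NULL);
    if (count == 0 || frames[0].start != node->start)
        return false;

    int64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += frames[i].duration;
    return total == node->duration;
}
```

The same check applies unchanged to low-resolution image tracks and hot spot image tracks, since they are subject to the same timing rule.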

Single-Node Object Movies

Figure 5-2 illustrates the basic structure of a single-node object movie. As you can see, every object movie contains at least three tracks: a QTVR track, an object track, and an object image track.

Figure 5-2  The structure of a single-node object movie file

For a single-node object movie, the QTVR track contains just one sample. There is a corresponding sample in the object track, whose time and duration are the same as the time and duration of the sample in the QTVR track. The time base of the movie is used to locate the proper video samples in the object image track.

For an object movie, the frame corresponding to the first row and column in the object image array is located at the same time as the corresponding QTVR and object track samples. The total duration of all the video samples is the same as the duration of the corresponding QTVR sample and the object sample.

In addition to these three required tracks, an object movie can also contain a hot spot image track and any number of standard QuickTime tracks (such as video, sound, and text tracks). A hot spot image track for an object is a QuickTime video track that contains images of colored regions delineating the hot spots; an image in the hot spot image track must be synchronized to match the appropriate image in the object image track. A hot spot image track should be 8 bits deep and can be compressed with any lossless compressor (including temporal compressors). This is also true of panoramas.

To play a time-based track with the object movie, you must synchronize the sample data of that track to the start and stop times of a view in the object image track. For example, to play a different sound with each view of an object, you might store a sound track in the movie file with each set of sound samples synchronized to play at the same time as the corresponding object’s view image. (This technique also works for video samples.) Another way to add sound or video is simply to play a sound or video track during the object’s view animation; to do this, you need to add an active track to the object that is equal in duration to the object’s row duration.

Multinode Movies

A multinode QuickTime VR movie can contain any number of object and panoramic nodes. Figure 5-3 illustrates the structure of a QuickTime VR movie that contains five nodes (in this case, three panoramic nodes and two object nodes).

Figure 5-3  The structure of a multinode movie file

QTVR Track

A QTVR track is a special type of QuickTime track that maintains a list of all the nodes in a movie. The media type for a QTVR track is 'qtvr'. All the media samples in a QTVR track share a common sample description. This sample description contains the VR world atom container. The track contains one media sample for each node in the movie. Each QuickTime VR media sample contains a node information atom container.

QuickTime VR Sample Description Structure

Whereas the QuickTime VR media sample is simply the node information itself, all sample descriptions are required by QuickTime to have a certain structure for the first several bytes. The structure for the QuickTime VR sample description is as follows:

typedef struct QTVRSampleDescription {
    UInt32                              size;
    UInt32                              type;
    UInt32                              reserved1;
    UInt16                              reserved2;
    UInt16                              dataRefIndex;
    UInt32                              data;
} QTVRSampleDescription, *QTVRSampleDescriptionPtr,
 **QTVRSampleDescriptionHandle;
size

The size, in bytes, of the sample description header structure, including the VR world atom container contained in the data field.

type

The sample description type. For QuickTime VR movies, this type should be 'qtvr'.

reserved1

Reserved. This field must be 0.

reserved2

Reserved. This field must be 0.

dataRefIndex

Reserved. This field must be 0.

data

The VR world atom container. The sample description structure is extended to hold this atom container.
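As an illustration of how the size field relates to the extended structure, here is a minimal sketch that writes the 16-byte header followed by the VR world atom container into a byte buffer. The function name build_qtvr_sample_description is hypothetical, and the sketch assumes the usual big-endian byte order of QuickTime files; in a real application the sample description is normally a handle passed to QuickTime rather than a raw buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes before the data field: size, type, reserved1, reserved2, dataRefIndex. */
enum { kQTVRSampleDescHeaderSize = 16 };

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: writes the header and a copy of the VR world atom
 * container into out. Returns the total size, or 0 if out is too small. */
static size_t build_qtvr_sample_description(uint8_t *out, size_t outCapacity,
                                            const uint8_t *vrWorld,
                                            size_t vrWorldSize)
{
    assert(out != NULL && vrWorld != NULL);
    if (vrWorldSize > UINT32_MAX - kQTVRSampleDescHeaderSize)
        return 0;
    size_t total = kQTVRSampleDescHeaderSize + vrWorldSize;
    if (outCapacity < total)
        return 0;

    put_be32(out + 0, (uint32_t)total);  /* size includes the atom container */
    put_be32(out + 4, 0x71747672u);      /* type: 'qtvr' */
    put_be32(out + 8, 0);                /* reserved1 */
    put_be16(out + 12, 0);               /* reserved2 */
    put_be16(out + 14, 0);               /* dataRefIndex, 0 as described above */
    memcpy(out + kQTVRSampleDescHeaderSize, vrWorld, vrWorldSize);
    return total;
}
```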

Panorama Tracks

A movie’s panorama track is a track that contains information about the panoramic nodes in a scene. The media type of the panorama track is 'pano'. Each sample in a panorama track corresponds to a single panoramic node. This sample parallels the corresponding sample in the QTVR track. Panorama tracks do not have a sample description (although QuickTime requires that you specify a dummy sample description when you call AddMediaSample to add a sample to a panorama track). The sample itself contains an atom container that includes a panorama sample atom and other optional atoms.

Panorama Sample Atom Structure

A panorama sample atom has an atom type of kQTVRPanoSampleDataAtomType ('pdat'). It describes a single panorama, including track reference indexes of the scene and hot spot tracks and information about the default viewing angles and the source panoramic image.

The structure of a panorama sample atom is defined by the QTVRPanoSampleAtom data type:

typedef struct QTVRPanoSampleAtom {
    UInt16                              majorVersion;
    UInt16                              minorVersion;
    UInt32                              imageRefTrackIndex;
    UInt32                              hotSpotRefTrackIndex;
    Float32                             minPan;
    Float32                             maxPan;
    Float32                             minTilt;
    Float32                             maxTilt;
    Float32                             minFieldOfView;
    Float32                             maxFieldOfView;
    Float32                             defaultPan;
    Float32                             defaultTilt;
    Float32                             defaultFieldOfView;
    UInt32                              imageSizeX;
    UInt32                              imageSizeY;
    UInt16                              imageNumFramesX;
    UInt16                              imageNumFramesY;
    UInt32                              hotSpotSizeX;
    UInt32                              hotSpotSizeY;
    UInt16                              hotSpotNumFramesX;
    UInt16                              hotSpotNumFramesY;
    UInt32                              flags;
    OSType                              panoType;
    UInt32                              reserved2;
} QTVRPanoSampleAtom, *QTVRPanoSampleAtomPtr;
majorVersion

The major version number of the file format.

minorVersion

The minor version number of the file format.

imageRefTrackIndex

The index of the image track reference. This is the index returned by the AddTrackReference function when the image track is added as a reference to the panorama track. There can be more than one image track for a given panorama track and hence multiple references. (A panorama track might have multiple image tracks if the panoramas have different characteristics, which could occur if the panoramas were shot with different size camera lenses.) The value in this field is 0 if there is no corresponding image track.

hotSpotRefTrackIndex

The index of the hot spot track reference.

minPan

The minimum pan angle, in degrees. For a full panorama, the value of this field is usually 0.0.

maxPan

The maximum pan angle, in degrees. For a full panorama, the value of this field is 360.0.

minTilt

The minimum tilt angle, in degrees. For a high-FOV cylindrical panorama, a typical value for this field is –42.5.

maxTilt

The maximum tilt angle, in degrees. For a high-FOV cylindrical panorama, a typical value for this field is +42.5.

minFieldOfView

The minimum vertical field of view, in degrees. For a high-resolution panorama, a typical value for this field is 5.0. The value in this field is 0 for the default minimum field of view, which is 5 percent of the maximum field of view.

maxFieldOfView

The maximum vertical field of view, in degrees. For a high-resolution panorama, a typical value for this field is 85.0. The value in this field is 0 for the default maximum field of view, which is maxTilt - minTilt.

defaultPan

The default pan angle, in degrees.

defaultTilt

The default tilt angle, in degrees.

defaultFieldOfView

The default vertical field of view, in degrees.

imageSizeX

The width, in pixels, of the panorama stored in the highest resolution image track.

imageSizeY

The height, in pixels, of the panorama stored in the highest resolution image track.

imageNumFramesX

The number of frames into which the panoramic image is diced horizontally. The width of each frame (which is imageSizeX/imageNumFramesX) should be divisible by 4.

imageNumFramesY

The number of frames into which the panoramic image is diced vertically. The height of each frame (which is imageSizeY/imageNumFramesY) should be divisible by 4.

hotSpotSizeX

The width, in pixels, of the panorama stored in the highest resolution hot spot image track.

hotSpotSizeY

The height, in pixels, of the panorama stored in the highest resolution hot spot image track.

hotSpotNumFramesX

The number of frames into which the panoramic image is diced horizontally for the hot spot image track.

hotSpotNumFramesY

The number of frames into which the panoramic image is diced vertically for the hot spot image track.

flags

A set of panorama flags. kQTVRPanoFlagHorizontal has been superseded by the panoType field. It is only used when the panoType field is nil to indicate a horizontally-oriented cylindrical panorama.

panoType

An OSType describing the type of panorama. Types supported are

  • kQTVRHorizontalCylinder

  • kQTVRVerticalCylinder

  • kQTVRCube

reserved2

Reserved. This field must be 0.

The minimum and maximum values in the panorama sample atom describe the physical limits of the panoramic image. QuickTime VR allows you to set further constraints on what portion of the image a user can see by calling the QTVRSetConstraints routine. You can also preset image constraints by adding constraint atoms to the panorama sample atom container. The three constraint atom types are kQTVRPanConstraintAtomType, kQTVRTiltConstraintAtomType, and kQTVRFOVConstraintAtomType. Each of these atom types shares a common structure defined by the QTVRAngleRangeAtom data type:

typedef struct QTVRAngleRangeAtom {
    Float32                             minimumAngle;
    Float32                             maximumAngle;
} QTVRAngleRangeAtom, *QTVRAngleRangeAtomPtr;
minimumAngle

The minimum angle in the range, in degrees.

maximumAngle

The maximum angle in the range, in degrees.

Panorama Image Track

The actual panoramic image for a panoramic node is contained in a panorama image track, which is a standard QuickTime video track. The track reference to this track is stored in the imageRefTrackIndex field of the panorama sample atom.

Previous versions of QuickTime VR required the original panoramic image to be rotated 90 degrees counterclockwise. This orientation was changed in QuickTime 5 to allow either rotated (the previous requirement) or non-rotated tiles (the preferred orientation).

The rotated image is diced into smaller frames, and each diced frame is then compressed and added to the video track as a video sample, as shown in Figure 5-4. Frames can be compressed using any spatial compressor; however, temporal compression is not allowed for panoramic image tracks.

Figure 5-4  Creating an image track for a panorama

QuickTime 5 does not require the original panoramic image to be rotated 90 degrees counterclockwise, as was the case in previous versions of QuickTime VR. The unrotated image is diced into smaller frames in the same way, and each diced frame is then compressed and added to the video track as a video sample, as shown in Figure 5-5.

Figure 5-5  Creating an image track for a panorama, with the image track oriented horizontally

In QuickTime 5, a panorama sample atom (which contains information about a single panorama) contains the panoType field, which indicates whether the diced panoramic image is oriented horizontally or vertically.

Cylindrical Panoramas

The primary change to cylindrical panoramas in QuickTime VR is that the panorama, as stored in the image track of the movie, can be oriented horizontally. This means that the panorama does not need to be rotated 90 degrees counterclockwise, as required previously.

To indicate a horizontal orientation, the field in the QTVRPanoSampleAtom data structure formerly called reserved1 has been renamed panoType. Its type is OSType. The panoType for a horizontally oriented cylinder is kQTVRHorizontalCylinder ('hcyl'), while for a vertical cylinder it is kQTVRVerticalCylinder ('vcyl'). For compatibility with older QuickTime VR files, when the panoType field is nil, a cylinder is assumed, and the low-order bit of the flags field indicates its orientation: 1 if the cylinder is horizontal and 0 if it is vertical.

One consequence of reorienting the panorama horizontally is that, when the panorama is divided into separate tiles, the order of the samples in the file is now the reverse of what it was for vertical cylinders. Since vertical cylinders were rotated 90 degrees counterclockwise, the first tile added to the image track was the right-most tile in the panorama. For unrotated horizontal cylinders, the first tile added to the image track is the left-most tile in the panorama.
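The compatibility rule and the tile ordering described above can be sketched as follows. The constants are defined locally from the four-character codes given in the text, the function names are hypothetical, and the tile mapping assumes a cylinder diced into a single row of tiles.

```c
#include <assert.h>
#include <stdint.h>

/* Four-character codes from the text, defined locally for this sketch. */
#define PANO_TYPE_HCYL 0x6863796Cu  /* 'hcyl', kQTVRHorizontalCylinder */
#define PANO_TYPE_VCYL 0x7663796Cu  /* 'vcyl', kQTVRVerticalCylinder */
#define PANO_TYPE_CUBE 0x63756265u  /* 'cube', kQTVRCube */

/* Resolves the effective panorama type. When panoType is nil (0), the
 * low-order bit of flags selects a horizontal (1) or vertical (0) cylinder. */
static uint32_t resolve_pano_type(uint32_t panoType, uint32_t flags)
{
    if (panoType != 0)
        return panoType;
    return (flags & 1u) ? PANO_TYPE_HCYL : PANO_TYPE_VCYL;
}

/* Maps a sample index in the image track to a tile column in the panorama,
 * where column 0 is the left-most tile. Vertical cylinders store the
 * right-most tile first; horizontal cylinders store the left-most first. */
static uint32_t tile_column_for_sample(uint32_t type, uint32_t sampleIndex,
                                       uint32_t numTiles)
{
    assert(numTiles > 0 && sampleIndex < numTiles);
    if (type == PANO_TYPE_VCYL)
        return numTiles - 1u - sampleIndex;
    return sampleIndex;
}
```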

Cubic Panoramas

A new type of panorama was introduced in QuickTime 5: the cubic panorama. This panorama in its simplest form is represented by six faces of a cube, thus enabling the viewer to see all the way up and all the way down. The file format and the cubic rendering engine actually allow for more complicated representations, such as special types of cubes with elongated sides or cube faces made up of separate tiles. Atoms that describe the orientation of each face allow for these nonstandard representations. If these atoms are not present, then the simplest representation is assumed. The following describes this simplest representation: a cube with six square sides.

Tracks in a cubic movie are laid out as they are for cylindrical panoramas. This includes a QTVR track, a panorama track, and an image track. Optionally, there may also be a hot spot track and a fast-start preview track. The image, hot spot, and preview tracks are all standard QuickTime video tracks.

Image Tracks in Cubic Nodes

For a cubic node the image track contains six samples that correspond to the six square faces of the cube. The same applies to hot spot and preview tracks. The following diagram shows how the order of samples in the track corresponds to the orientation of the cube faces.

[Figure: the order of samples in the image track and the corresponding orientation of the cube faces]

Note that by default the frames are oriented horizontally. However, arbitrary orientations (90 degrees clockwise, 90 degrees counterclockwise, upside down, and diamond shaped) can be used if specified with the 'cufa' atom. Still, the greatest rendering speed is achieved with horizontally oriented tiles.

Panorama Tracks in Cubic Nodes

The media sample for a panorama track contains the pano sample atom container. For cubes, some of the fields in the pano sample data atom have special values, which provide compatibility back to QuickTime VR 2.2. The cubic projection engine ignores these fields. They allow one to view cubic movies in older versions of QuickTime VR using the cylindrical engine, although the view will be somewhat incorrect, and the top and bottom faces will not be visible. The special values are shown in Table 5-1.

Table 5-1  Fields and their special values as represented in the pano sample data atom, providing backward compatibility to earlier versions of QuickTime VR

Field               Value

imageNumFramesX     4
imageNumFramesY     1
imageSizeX          frame width * 4
imageSizeY          frame height
minPan              0.0
maxPan              360.0
minTilt             -45.0
maxTilt             45.0
minFieldOfView      5.0
maxFieldOfView      90.0
flags               1

A 1 value in the flags field tells QuickTime VR that the frames are not rotated. QuickTime VR treats this as a four-frame horizontal cylinder. The panoType field (formerly reserved1) must be set to kQTVRCube ('cube') so that QuickTime can recognize this panorama as a cube.

Since certain viewing fields in the pano sample data atom are being used for backward compatibility, a new atom must be added to indicate the proper viewing parameters for the cubic image. This atom is the cubic view atom (atom type 'cuvw'). The data structure of the cubic view atom is as follows:

struct QTVRCubicViewAtom {
    Float32         minPan;
    Float32         maxPan;
    Float32         minTilt;
    Float32         maxTilt;
    Float32         minFieldOfView;
    Float32         maxFieldOfView;
 
    Float32         defaultPan;
    Float32         defaultTilt;
    Float32         defaultFieldOfView;
};
typedef struct QTVRCubicViewAtom    QTVRCubicViewAtom;

The fields are filled in as desired for the cubic image. This atom is ignored by older versions of QuickTime VR. Typical values for the min and max fields are shown in Table 5-2.

Table 5-2  Values for min and max fields

Field               Value

minPan              0.0
maxPan              360.0
minTilt             -90.0
maxTilt             90.0
minFieldOfView      5.0
maxFieldOfView      120.0

You add the cubic view atom to the pano sample atom container (after adding the pano sample data atom). Then use AddMediaSample to add the atom container to the panorama track.

Nonstandard Cubes

Although the default representation for a cubic panorama is that of six square faces of a cube, it is possible to depart from this standard representation. When doing so, a new atom must be added to the pano sample atom container. The atom type is 'cufa'. The atom is an array of data structures of type QTVRCubicFaceData. Each entry in the array describes one face of whatever polyhedron is being defined. QTVRCubicFaceData is defined as follows:

struct QTVRCubicFaceData {
    float   orientation[4];
    float   center[2];
    float   aspect; // set to 1
    float   skew; // set to 0
};
typedef struct QTVRCubicFaceData    QTVRCubicFaceData;

The following section discusses the mathematical explanation of these data structures.

Quaternions

Quaternions provide a representation for rotation in three dimensions that has well-behaved computational properties. This allows for the implementation of smooth and continuous interpolation of rotation.

A quaternion is defined using four floating point values [ w x y z ]. These are calculated from the combination of the three coordinates of the rotation axis and the rotation angle.

There are four different components to quaternions, which can be represented as

\[ q = w + x\,i + y\,j + z\,k \]

Those components are typically ordered in one of two ways: [w x y z] or [x y z w]. Apple follows the convention of the [w x y z] ordering.

A quaternion has four components which can be further separated into two subcomponents: scalar, which is the w part, and vector, which is the x, y, z part.

\[ q = [\, w \;\; \mathbf{v} \,], \qquad \mathbf{v} = [\, x \;\; y \;\; z \,] \]

The vector part represents an axis of rotation in 3D.

Apple uses a right-handed coordinate system which has x pointing to the right, y pointing up, and z coming out of the page, as shown in Figure 5-6.

Figure 5-6  A reference coordinate system

Rotations follow the right-hand rule: if you point the thumb of your right hand in the direction of the axis, the rotation goes around in the direction in which your fingers curl. Thus a rotation about the y axis comes around toward you, a rotation about the x axis rotates downward, and a rotation about the z axis rotates counterclockwise.

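In order to use a quaternion, you can specify it with an axis of rotation and an angle. For a unit axis \mathbf{a} and a rotation angle \theta, that is encoded into a quaternion in this way:

\[ q = [\, \cos(\theta/2) \;\;\; \sin(\theta/2)\,\mathbf{a} \,] \]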

The cosine of the half-angle is the w component, and the sine of the half-angle of rotation scales the axis of rotation. This yields [w x y z].

In most cases, a “normalized” quaternion is used, where the squares of the components add up to one:

\[ w^2 + x^2 + y^2 + z^2 = 1 \]

Quaternions are used to specify the location of each face in 3-space.

If you want one face to be straight in front of you, it would be in the standard position, that is, not rotated at all:

\[ q = [\, 1 \;\; 0 \;\; 0 \;\; 0 \,] \]

If you have an image in standard position, it is not rotated at all. The angle of rotation is 0. Half of that angle is still 0, and the cosine would be 1. That produces the w coordinate. Since the sine of 0 is 0, it doesn’t really matter what the axis of rotation is, because it is not being rotated. So you have [1 0 0 0] for the standard position.

Now if you rotate it 180 degrees about the y axis, this would yield the following:

Half of 180 degrees is 90 degrees, and the cosine of 90 degrees is 0, so the w component is 0. The y axis is specified as [0 1 0]. The sine of 90 degrees is 1, so you multiply [0 1 0] by 1:

\[ q = [\, \cos 90^\circ \;\;\; \sin 90^\circ\,[\, 0 \;\; 1 \;\; 0 \,] \,] = [\, 0 \;\; 0 \;\; 1 \;\; 0 \,] \]

If you just rotate it by 90 degrees to the right about the y axis, you get the following:

\[ q = [\, \sqrt{1/2} \;\;\; 0 \;\;\; -\sqrt{1/2} \;\;\; 0 \,] \]

You are actually rotating by a negative 90 degrees, because the y axis would normally rotate in a positive direction.

To rotate to the right, you can rotate about the positive y axis by -90 degrees, or rotate about the negative y axis by +90 degrees. These are equivalent. It all depends on what your orientation is.

\[ [\, \cos(-45^\circ) \;\;\; \sin(-45^\circ)\,[\, 0 \;\; 1 \;\; 0 \,] \,] = [\, \cos 45^\circ \;\;\; \sin 45^\circ\,[\, 0 \;\; {-1} \;\; 0 \,] \,] \]

Note that you can multiply a quaternion by -1 and it would represent exactly the same orientation. This implies that quaternions are a redundant representation for 3D orientation. Similar kinds of redundant representations are used in 2D and 3D graphics, for example, 3 x 3 matrices in QuickTime and 4 x 4 matrices in 3D. Both of these matrices can be scaled by arbitrary nonzero numbers (not just -1), yet they still represent the same projective transformation.

For the top, you want to rotate 90 degrees about the x axis. The cosine of half of 90 degrees is equal to the square root of one-half. Now take the x axis which is [1 0 0] and multiply it all out and you get the following:

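\[ q = [\, \sqrt{1/2} \;\;\; \sqrt{1/2}\,[\, 1 \;\; 0 \;\; 0 \,] \,] = [\, \sqrt{1/2} \;\;\; \sqrt{1/2} \;\;\; 0 \;\;\; 0 \,] \]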

Similarly, the bottom is then

\[ q = [\, \sqrt{1/2} \;\;\; -\sqrt{1/2} \;\;\; 0 \;\;\; 0 \,] \]

When you subdivide a face, you scale its quaternion from the normalized quaternion by a factor that is related to the amount of subdivision.

Think of a quaternion as a point on a sphere. A unit (normalized) quaternion is like a point on a unit sphere. A larger magnitude quaternion then represents a point on a larger sphere.

The radius of the sphere is the focal length of the tile (sub-face), normalized to its resolution. When you subdivide a face 2 x 2, it is like you are placing the tiles on a sphere of radius 2, that is, the images are twice as far away as they would be with 1 x 1 tiling.

You don’t merely scale the quaternion by the normalized focal length, because of the rule for transforming a vector by a quaternion.

\[ v' = q \, v \, q^{*} \]

where q* is the conjugate of the quaternion q. Since the magnitude of the quaternion is applied twice in this transformation, you need to scale a unit quaternion by

\[ \sqrt{S} \]

if you want it to scale a vector by S when applying the above transformation. Thus, the quaternion scale factor is the square root of the normalized focal length.
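The effect of applying the magnitude twice can be seen in a small sketch of the transformation rule. The Quat and Vec3 types and the function names are hypothetical. The example uses the quaternion [1 0 -1 0], whose squared magnitude is 2, and assumes that a forward-pointing vector is (0, 0, -1) in the right-handed coordinate system described earlier.

```c
#include <assert.h>

/* Hypothetical types: quaternion in [w x y z] order and a 3D vector. */
typedef struct { float w, x, y, z; } Quat;
typedef struct { float x, y, z; } Vec3;

/* Hamilton product a * b. */
static Quat quat_mul(Quat a, Quat b)
{
    Quat r = {
        a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
        a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
        a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
        a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w
    };
    return r;
}

/* Transforms v by q as q v q*, treating v as a quaternion with w = 0.
 * If q has squared magnitude S, the result is the rotated v scaled by S. */
static Vec3 quat_transform(Quat q, Vec3 v)
{
    Quat p    = { 0.0f, v.x, v.y, v.z };
    Quat conj = { q.w, -q.x, -q.y, -q.z };
    Quat r    = quat_mul(quat_mul(q, p), conj);
    Vec3 out  = { r.x, r.y, r.z };
    return out;
}

/* Usage example: a unit rotation of -90 degrees about y, scaled by the
 * square root of 2, turns the forward vector to the right and doubles it. */
static void example_scaled_right_rotation(void)
{
    Quat q = { 1.0f, 0.0f, -1.0f, 0.0f };  /* squared magnitude is 2 */
    Vec3 forward = { 0.0f, 0.0f, -1.0f };
    Vec3 r = quat_transform(q, forward);
    assert(r.x == 2.0f && r.y == 0.0f && r.z == 0.0f);
}
```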

You normalize the focal length to the height of the tile, specifically

\[ \hat{f} = \frac{f}{(H - 1)/2} \]

where f and H are measured in units of pixels. The normalized focal length \hat{f} is thus specified in units of image half-height. You do this in order to make the specification resolution-independent.

The normalized focal length is related to the vertical field of view in a simple way:

\[ \hat{f} = \cot\!\left( \frac{\mathrm{VFOV}}{2} \right) \]

where VFOV is the vertical field of view, and cot ( . ) is the cotangent.

This yields remarkably simple values for the usual tiling schemes:

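\[ \hat{f} = 1 \ \text{(1 x 1 tiling)}, \qquad \hat{f} = 2 \ \text{(2 x 2 tiling)}, \qquad \hat{f} = 3 \ \text{(3 x 3 tiling)} \]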

In summary, sub-tile orientation is specified with scaled quaternions, q, given by

\[ q = \sqrt{\hat{f}} \; \hat{q} \]

where \hat{f} is the normalized focal length and \hat{q} is the normalized quaternion.

Examples for Common Cases

The following tables, Table 5-3, Table 5-4, and Table 5-5, illustrate some examples for common cases, showing values used to represent six square sides in 1 x 1, 2 x 2, and 3 x 3 matrices.

Table 5-3 shows what values QuickTime VR uses for the default representation of six square sides.

Table 5-3  Values used for representing six square sides in a 1 x 1 matrix

Orientation (quaternion)              Center      Aspect   Skew
w        x        y        z          x     y

+1       0        0        0          0     0     1        0       # front
+√.5     0        -√.5     0          0     0     1        0       # right
0        0        1        0          0     0     1        0       # back
+√.5     0        +√.5     0          0     0     1        0       # left
+√.5     +√.5     0        0          0     0     1        0       # top
+√.5     -√.5     0        0          0     0     1        0       # bottom

Table 5-4  Values used for representing six square sides in a 2 x 2 matrix

Orientation (quaternion)              Center      Aspect   Skew
w        x        y        z          x     y

√2       0        0        0          x2    y2    1        0       # front
1        0        -1       0          x2    y2    1        0       # right
0        0        √2       0          x2    y2    1        0       # back
1        0        1        0          x2    y2    1        0       # left
1        1        0        0          x2    y2    1        0       # top
1        -1       0        0          x2    y2    1        0       # bottom

where {x2, y2} come from the set:

{ [ -1, -1 ], [ +1, -1 ], [ -1, +1 ], [ +1, +1 ] }

Table 5-5  Values used for representing six square sides in a 3 x 3 matrix

Orientation (quaternion)              Center      Aspect   Skew
w        x        y        z          x     y

√3       0        0        0          x3    y3    1        0       # front
√(3/2)   0        -√(3/2)  0          x3    y3    1        0       # right
0        0        √3       0          x3    y3    1        0       # back
√(3/2)   0        √(3/2)   0          x3    y3    1        0       # left
√(3/2)   √(3/2)   0        0          x3    y3    1        0       # top
√(3/2)   -√(3/2)  0        0          x3    y3    1        0       # bottom

where {x3, y3} come from the set:

{ [ -2, -2 ], [ 0, -2 ], [ +2, -2 ], [ -2, 0 ], [ 0, 0 ], [ +2, 0 ], [ -2, +2 ], [ 0, +2 ], [ +2, +2 ] }

Figure 5-7 clarifies the center values for 1 x 1, 2 x 2, and 3 x 3 subtiling schemes. These values are represented in a resolution-independent format. In particular, the coordinates for the center are in units of one-half of the image height, specifically (height - 1)/2, just as with the normalized focal length.

Figure 5-7  Normalized center coordinates for subtiles
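The center values listed for the 1 x 1, 2 x 2, and 3 x 3 cases follow a regular pattern: for an n x n subdivision, the coordinates are 2i - (n - 1) for i from 0 to n - 1. The sketch below generates them; the TileCenter type and function names are hypothetical, and the output order (x varying fastest) simply mirrors the order in which the sets are listed above, which is an assumption about how the entries are arranged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair of normalized center coordinates for one subtile. */
typedef struct { float x, y; } TileCenter;

/* Center coordinate of subtile `index` (0-based) along one axis of an
 * n x n subdivision, in units of the tile's half-height. */
static float subtile_center_coord(int n, int index)
{
    assert(n > 0 && index >= 0 && index < n);
    return (float)(2 * index - (n - 1));
}

/* Fills out[0 .. n*n-1] with the centers, x varying fastest.
 * Returns the number of entries written. */
static size_t subtile_centers(int n, TileCenter *out)
{
    assert(n > 0 && out != NULL);
    size_t count = 0;
    for (int row = 0; row < n; row++) {
        for (int col = 0; col < n; col++) {
            out[count].x = subtile_center_coord(n, col);
            out[count].y = subtile_center_coord(n, row);
            count++;
        }
    }
    return count;
}
```

For n = 2 this yields the coordinates -1 and +1, and for n = 3 it yields -2, 0, and +2, matching the sets given for Table 5-4 and Table 5-5.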

Hot Spot Image Tracks

When a panorama contains hot spots, the movie file contains a hot spot image track, a video track that contains a parallel panorama, with the hot spots designated by colored regions. Each diced frame of the hot spot panoramic image must be compressed with a lossless compressor (such as QuickTime’s graphics compressor). The dimensions of the hot spot panoramic image are usually the same as those of the image track’s panoramic image, but this is not required. The dimensions must, however, have the same aspect ratio as the image track’s panoramic image. A hot spot image track should be 8 bits deep.

Low-Resolution Image Tracks

It’s possible to store one or more low-resolution versions of a panoramic image in a movie file; those versions are called low-resolution image tracks. If there is not enough memory at runtime to use the normal image track, QuickTime VR uses a lower resolution image track if one is available. A low-resolution image track contains diced frames just like the higher resolution track.

Track Reference Entry Structure

Since there are no fields in the pano sample data atom to indicate the presence of low-resolution image tracks, a separate sibling atom must be added to the panorama sample atom container. The track reference array atom contains an array of track reference entry structures that specify information about any low-resolution image tracks contained in a movie. Its atom type is kQTVRTrackRefArrayAtomType ('tref').

A track reference entry structure is defined by the QTVRTrackRefEntry data type:

typedef struct QTVRTrackRefEntry {
    UInt32                              trackRefType;
    UInt16                              trackResolution;
    UInt32                              trackRefIndex;
} QTVRTrackRefEntry;
trackRefType

The track reference type.

trackResolution

The track resolution.

trackRefIndex

The index of the track reference.

The number of entries in the track reference array atom is determined by dividing the size of the atom by sizeof (QTVRTrackRefEntry).
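
The following C sketch shows one way to count and decode the entries from the raw atom data. It rests on two assumptions that you should verify against your own headers and files: that each stored entry occupies 10 bytes (4 + 2 + 4, with no padding), and that the fields are stored big-endian like other QuickTime file data. A compiler that uses natural alignment may report a larger sizeof for an equivalent structure, so the sketch uses a fixed entry size instead. The names TrackRefEntryHost, TrackRefEntryCount, and ReadTrackRefEntry are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each stored entry is 4 + 2 + 4 = 10 bytes with no padding. */
enum { kEntryDiskSize = 10 };

/* Local mirror of the QTVRTrackRefEntry fields, in host byte order. */
typedef struct {
    uint32_t trackRefType;
    uint16_t trackResolution;
    uint32_t trackRefIndex;
} TrackRefEntryHost;

static uint16_t ReadBE16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t ReadBE32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Number of whole entries in atom data of dataSize bytes. */
static size_t TrackRefEntryCount(size_t dataSize)
{
    return dataSize / kEntryDiskSize;
}

/* Decodes the entry at the given 0-based index from the atom data. */
static TrackRefEntryHost ReadTrackRefEntry(const uint8_t *data, size_t index)
{
    const uint8_t *p = data + index * kEntryDiskSize;
    TrackRefEntryHost entry;

    entry.trackRefType    = ReadBE32(p);
    entry.trackResolution = ReadBE16(p + 4);
    entry.trackRefIndex   = ReadBE32(p + 6);
    return entry;
}
```

Decoding field by field avoids depending on the compiler's structure layout or on the byte order of the host.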

kQTVRPreviewTrackRes is a special value for the trackResolution field in the QTVRTrackRefEntry structure. This is used to indicate the presence of a special preview image track.

Object Tracks

A movie’s object track is a track that contains information about the object nodes in a scene. The media type of the object track is 'obje'. Each sample in an object track corresponds to a single object node in the scene. The samples of the object track contain information describing the object images stored in the object image track.

These object information samples parallel the corresponding node samples in the QTVR track and are equal in time and duration to a particular object node’s image samples in the object’s image track as well as the object node’s hot spot samples in the object’s hot spot track.

Object tracks do not have a sample description (although QuickTime requires that you specify a dummy sample description when you call AddMediaSample to add a sample to an object track). The sample itself is an atom container that contains a single object sample atom and other optional atoms.

Object Sample Atom Structure

An object sample atom describes a single object, including information about the default viewing angles and the view settings. The structure of an object sample atom is defined by the QTVRObjectSampleAtom data type:

typedef struct QTVRObjectSampleAtom {
    UInt16                              majorVersion;
    UInt16                              minorVersion;
    UInt16                              movieType;
    UInt16                              viewStateCount;
    UInt16                              defaultViewState;
    UInt16                              mouseDownViewState;
    UInt32                              viewDuration;
    UInt32                              columns;
    UInt32                              rows;
    Float32                             mouseMotionScale;
    Float32                             minPan;
    Float32                             maxPan;
    Float32                             defaultPan;
    Float32                             minTilt;
    Float32                             maxTilt;
    Float32                             defaultTilt;
    Float32                             minFieldOfView;
    Float32                             fieldOfView;
    Float32                             defaultFieldOfView;
    Float32                             defaultViewCenterH;
    Float32                             defaultViewCenterV;
    Float32                             viewRate;
    Float32                             frameRate;
    UInt32                              animationSettings;
    UInt32                              controlSettings;
} QTVRObjectSampleAtom, *QTVRObjectSampleAtomPtr;
majorVersion

The major version number of the file format.

minorVersion

The minor version number of the file format.

movieType

The movie controller type.

viewStateCount

The number of view states of the object. A view state selects an alternate set of images for an object’s views. The value of this field must be positive.

defaultViewState

The 1-based index of the default view state. The default view state image for a given view is displayed when the mouse button is not down.

mouseDownViewState

The 1-based index of the mouse-down view state. The mouse-down view state image for a given view is displayed while the user holds the mouse button down and the cursor is over an object movie.

viewDuration

The total movie duration of all image frames contained in an object’s view. In an object that uses a single frame to represent a view, the duration is the image track’s sample duration time.

columns

The number of columns in the object image array (that is, the number of horizontal positions or increments in the range defined by the minimum and maximum pan values). The value of this field must be positive.

rows

The number of rows in the object image array (that is, the number of vertical positions or increments in the range defined by the minimum and maximum tilt values). The value of this field must be positive.

mouseMotionScale

The mouse motion scale factor (that is, the number of degrees that an object is panned or tilted when the cursor is dragged the entire width of the VR movie image). The default value is 180.0.

minPan

The minimum pan angle, in degrees. The value of this field must be less than the value of the maxPan field.

maxPan

The maximum pan angle, in degrees. The value of this field must be greater than the value of the minPan field.

defaultPan

The default pan angle, in degrees. This is the pan angle used when the object is first displayed. The value of this field must be greater than or equal to the value of the minPan field and less than or equal to the value of the maxPan field.

minTilt

The minimum tilt angle, in degrees. The default value is –90.0. The value of this field must be less than the value of the maxTilt field.

maxTilt

The maximum tilt angle, in degrees. The default value is +90.0. The value of this field must be greater than the value of the minTilt field.

defaultTilt

The default tilt angle, in degrees. This is the tilt angle used when the object is first displayed. The value of this field must be greater than or equal to the value of the minTilt field and less than or equal to the value of the maxTilt field.

minFieldOfView

The minimum field of view to which the object can zoom. The valid range for this field is from 1 to the value of the fieldOfView field. The value of this field must be positive.

fieldOfView

The image field of view, in degrees, for the entire object. The value in this field must be greater than or equal to the value of the minFieldOfView field.

defaultFieldOfView

The default field of view for the object. This is the field of view used when the object is first displayed. The value in this field must be greater than or equal to the value of the minFieldOfView field and less than or equal to the value of the fieldOfView field.

defaultViewCenterH

The default horizontal view center.

defaultViewCenterV

The default vertical view center.

viewRate

The view rate (that is, the positive or negative rate at which the view animation in the object plays, if view animation is enabled). The value of this field must be from –100.0 through +100.0, inclusive.

frameRate

The frame rate (that is, the positive or negative rate at which the frame animation in a view plays, if frame animation is enabled). The value of this field must be from –100.0 through +100.0, inclusive.

animationSettings

A set of 32-bit flags that encode information about the animation settings of the object.

controlSettings

A set of 32-bit flags that encode information about the control settings of the object.
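
The range constraints in the field descriptions above can be collected into a single check before a sample is written. The C sketch below is a minimal example, not part of the QuickTime VR API: ObjectSampleFields is a hypothetical subset of QTVRObjectSampleAtom held in host byte order, and the upper bound on the view state indexes is inferred from their being 1-based indexes rather than stated explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of QTVRObjectSampleAtom, already in host byte order. */
typedef struct {
    uint16_t viewStateCount, defaultViewState, mouseDownViewState;
    uint32_t columns, rows;
    float minPan, maxPan, defaultPan;
    float minTilt, maxTilt, defaultTilt;
    float minFieldOfView, fieldOfView, defaultFieldOfView;
    float viewRate, frameRate;
} ObjectSampleFields;

static bool InRange(float value, float lo, float hi)
{
    return value >= lo && value <= hi;
}

/* Returns true if every constraint checked here holds. */
static bool ObjectSampleIsValid(const ObjectSampleFields *s)
{
    /* Counts must be positive. */
    if (s->viewStateCount == 0 || s->columns == 0 || s->rows == 0)
        return false;

    /* Inferred: 1-based view state indexes should not exceed the count. */
    if (s->defaultViewState < 1 || s->defaultViewState > s->viewStateCount)
        return false;
    if (s->mouseDownViewState < 1 || s->mouseDownViewState > s->viewStateCount)
        return false;

    /* Minimum below maximum, default within the range. */
    if (!(s->minPan < s->maxPan) || !InRange(s->defaultPan, s->minPan, s->maxPan))
        return false;
    if (!(s->minTilt < s->maxTilt) || !InRange(s->defaultTilt, s->minTilt, s->maxTilt))
        return false;

    /* minFieldOfView runs from 1 to fieldOfView; the default lies between them. */
    if (!InRange(s->minFieldOfView, 1.0f, s->fieldOfView))
        return false;
    if (!InRange(s->defaultFieldOfView, s->minFieldOfView, s->fieldOfView))
        return false;

    /* Both rates must be from -100.0 through +100.0, inclusive. */
    return InRange(s->viewRate, -100.0f, 100.0f) &&
           InRange(s->frameRate, -100.0f, 100.0f);
}
```

A check like this is most useful just before the sample data is converted to the stored byte order and added to the object track.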

The movieType field of the object sample atom structure specifies an object controller type, that is, the user interface to be used to manipulate the object.

QuickTime VR supports the following controller types:

enum ObjectUITypes {
    kGrabberScrollerUI                          = 1,
    kOldJoyStickUI                              = 2,
    kJoystickUI                                 = 3,
    kGrabberUI                                  = 4,
    kAbsoluteUI                                 = 5
};
kGrabberScrollerUI

The default controller, which displays a hand for dragging and rotation arrows when the cursor is along the edges of the object window.

kOldJoyStickUI

A joystick controller, which displays a joystick-like interface for spinning the object. With this controller, the direction of panning is reversed from the direction of the grabber.

kJoystickUI

A joystick controller, which displays a joystick-like interface for spinning the object. With this controller, the direction of panning is consistent with the direction of the grabber.

kGrabberUI

A grabber-only interface, which displays a hand for dragging but does not display rotation arrows when the cursor is along the edges of the object window.

kAbsoluteUI

An absolute controller, which displays a finger for pointing. The absolute controller switches views based on a row-and-column grid mapped into the object window.

The animationSettings field of the object sample atom is a long integer that specifies a set of animation settings for an object node. Animation settings specify characteristics of the movie while it is playing. Use these constants to specify animation settings:

enum QTVRAnimationSettings {
    kQTVRObjectAnimateViewFramesOn              = (1 << 0),
    kQTVRObjectPalindromeViewFramesOn           = (1 << 1),
    kQTVRObjectStartFirstViewFrameOn            = (1 << 2),
    kQTVRObjectAnimateViewsOn                   = (1 << 3),
    kQTVRObjectPalindromeViewsOn                = (1 << 4),
    kQTVRObjectSyncViewToFrameRate              = (1 << 5),
    kQTVRObjectDontLoopViewFramesOn             = (1 << 6),
    kQTVRObjectPlayEveryViewFrameOn             = (1 << 7)
};
kQTVRObjectAnimateViewFramesOn

If this bit is set, play all frames in the current view state.

kQTVRObjectPalindromeViewFramesOn

If this bit is set, play a back-and-forth animation of the frames of the current view state.

kQTVRObjectStartFirstViewFrameOn

If this bit is set, play the frame animation starting with the first frame in the view (that is, at the view start time).

kQTVRObjectAnimateViewsOn

If this bit is set, play all views of the current object in the default row of views.

kQTVRObjectPalindromeViewsOn

If this bit is set, play a back-and-forth animation of all views of the current object in the default row of views.

kQTVRObjectSyncViewToFrameRate

If this bit is set, synchronize the view animation to the frame animation and use the same options as for frame animation.

kQTVRObjectDontLoopViewFramesOn

If this bit is set, stop playing the frame animation in the current view at the end.

kQTVRObjectPlayEveryViewFrameOn

If this bit is set, play every view frame regardless of the play rate. The play rate adjusts how long each frame appears, but no frames are skipped, so the resulting rate is not exact.
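
Because the animation settings are independent bit flags, they can be combined and tested with ordinary bitwise operations. The following C sketch illustrates this; HasSetting and SetSetting are hypothetical helper names, and only three of the constants are repeated so that the example is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the QTVRAnimationSettings enumeration above. */
enum {
    kQTVRObjectAnimateViewFramesOn    = (1 << 0),
    kQTVRObjectPalindromeViewFramesOn = (1 << 1),
    kQTVRObjectDontLoopViewFramesOn   = (1 << 6)
};

/* Hypothetical helper: true if every bit in flag is set in settings. */
static bool HasSetting(uint32_t settings, uint32_t flag)
{
    return (settings & flag) == flag;
}

/* Hypothetical helper: returns settings with flag set or cleared. */
static uint32_t SetSetting(uint32_t settings, uint32_t flag, bool on)
{
    return on ? (settings | flag) : (settings & ~flag);
}
```

For example, setting kQTVRObjectAnimateViewFramesOn and kQTVRObjectDontLoopViewFramesOn on an empty value requests a frame animation that plays once and stops at the end of the view.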

The controlSettings field of the object sample atom is a long integer that specifies a set of control settings for an object node. Control settings specify whether the object can wrap during panning and tilting, as well as other features of the node. The control settings are specified using these bit flags:

enum QTVRControlSettings {
    kQTVRObjectWrapPanOn                        = (1 << 0),
    kQTVRObjectWrapTiltOn                       = (1 << 1),
    kQTVRObjectCanZoomOn                        = (1 << 2),
    kQTVRObjectReverseHControlOn                = (1 << 3),
    kQTVRObjectReverseVControlOn                = (1 << 4),
    kQTVRObjectSwapHVControlOn                  = (1 << 5),
    kQTVRObjectTranslationOn                    = (1 << 6)
};
kQTVRObjectWrapPanOn

If this bit is set, enable wrapping during panning. When this control setting is enabled, the user can wrap around from the current pan constraint maximum value to the pan constraint minimum value (or vice versa) using the mouse or arrow keys.

kQTVRObjectWrapTiltOn

If this bit is set, enable wrapping during tilting. When this control setting is enabled, the user can wrap around from the current tilt constraint maximum value to the tilt constraint minimum value (or vice versa) using the mouse or arrow keys.

kQTVRObjectCanZoomOn

If this bit is set, enable zooming. When this control setting is enabled, the user can change the current field of view using the zoom-in and zoom-out keys on the keyboard (or using the VR controller buttons).

kQTVRObjectReverseHControlOn

If this bit is set, reverse the direction of the horizontal control.

kQTVRObjectReverseVControlOn

If this bit is set, reverse the direction of the vertical control.

kQTVRObjectSwapHVControlOn

If this bit is set, exchange the horizontal and vertical controls.

kQTVRObjectTranslationOn

If this bit is set, enable translation. When this setting is enabled, the user can translate using the mouse when either the translate key is held down or the controller translation mode button is toggled on.
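
As an illustration of how the mouseMotionScale field and the kQTVRObjectWrapPanOn flag might interact, the C sketch below computes a new pan angle after a horizontal drag. It is not the QuickTime VR implementation: PanAfterDrag is a hypothetical helper, and the wrap and clamp behavior shown is an assumption based on the descriptions above. The function expects minPan to be less than maxPan.

```c
#include <stdint.h>

enum { kQTVRObjectWrapPanOn = (1 << 0) };  /* from QTVRControlSettings */

/*
 * Hypothetical helper: returns the pan angle after a horizontal drag of
 * dragPixels across a window that is windowWidth pixels wide.
 */
static float PanAfterDrag(float pan, float dragPixels, float windowWidth,
                          float mouseMotionScale, float minPan, float maxPan,
                          uint32_t controlSettings)
{
    float range = maxPan - minPan;

    /* A drag across the full width pans by mouseMotionScale degrees. */
    pan += (dragPixels / windowWidth) * mouseMotionScale;

    if (controlSettings & kQTVRObjectWrapPanOn) {
        /* Assumed wrap behavior: continue from the opposite constraint. */
        while (pan > maxPan) pan -= range;
        while (pan < minPan) pan += range;
    } else {
        /* Without wrapping, stop at the pan constraints. */
        if (pan > maxPan) pan = maxPan;
        if (pan < minPan) pan = minPan;
    }
    return pan;
}
```

The same approach would apply to tilting, using the tilt constraints and kQTVRObjectWrapTiltOn.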

Track References for Object Tracks

The track references to an object’s image and hot spot tracks are not handled the same way as track references to panoramas. The track reference types are the same (kQTVRImageTrackRefType and kQTVRHotSpotTrackRefType), but the location of the reference indexes is different. There is no entry in the object sample atom for the track reference indexes. Instead, separate atoms using the QTVRTrackRefEntry structure are stored as siblings to the object sample atom. The types of these atoms are kQTVRImageTrackRefAtomType and kQTVRHotSpotTrackRefAtomType. If either of these atoms is not present, then the reference index to the corresponding track is assumed to be 1.
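
The default rule for a missing atom can be expressed as a small helper. In this C sketch, TrackRefEntryHost is a hypothetical host-order copy of the track reference entry fields, and a NULL pointer stands for an atom that was not found in the sample's atom container.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the track reference entry fields, in host byte order. */
typedef struct {
    uint32_t trackRefType;
    uint16_t trackResolution;
    uint32_t trackRefIndex;
} TrackRefEntryHost;

/*
 * entry points to the decoded sibling atom, or is NULL if the atom is absent.
 * An absent atom means the reference index is assumed to be 1.
 */
static uint32_t ObjectTrackRefIndex(const TrackRefEntryHost *entry)
{
    return entry != NULL ? entry->trackRefIndex : 1;
}
```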

The actual views of an object for an object node are contained in an object image track, which is usually a standard QuickTime video track. (An object image track can also be any type of track that is capable of displaying an image, such as a QuickTime 3D track.)

As described in Chapter 2, QuickTime VR Panoramas and Object Movies, these views are often captured by moving a camera around the object in a defined pattern of pan and tilt angles. The views must then be ordered into an object image array, which is stored as a one-dimensional sequence of frames in the movie’s video track (see Figure 5-8).

Figure 5-8  The structure of an image track for an object

For object movies containing frame animation, each animated view in the object image array consists of the frames that make up its animation. The views need not all contain the same number of frames, but the view duration of every view in the object movie must be the same.

For object movies containing alternate view states, each alternate view state is stored as a separate object image array that immediately follows the preceding view state in the object image track. The view states need not contain the same number of frames. However, the total movie time of each view state in an object node must be the same.
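
Because every view has the same duration and every view state has the same total time, the start time of a view can be computed from its position alone. The C sketch below shows one way to do so. It assumes that the views within each object image array are stored row by row, which is not stated in the text, and that times are measured from the start of the object node's image samples. ViewStartTime is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: time at which a view starts, relative to the start
 * of the object node's image samples.
 * viewState is 1-based, as in the object sample atom; row and column are
 * 0-based. Assumption: views are stored row by row within each array.
 */
static uint32_t ViewStartTime(uint32_t viewState, uint32_t row, uint32_t column,
                              uint32_t rows, uint32_t columns,
                              uint32_t viewDuration)
{
    uint32_t viewsPerState = rows * columns;
    uint32_t viewIndex = (viewState - 1) * viewsPerState + row * columns + column;

    return viewIndex * viewDuration;
}
```

Addressing views by time rather than by frame number keeps the calculation valid when views contain different numbers of frames.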