Time and Media Representations

Time-based audiovisual data, such as a movie file or a video stream, is represented in the AV Foundation framework by AVAsset. Its structure dictates much of how the framework works. Several low-level data structures that AV Foundation uses to represent time and media, such as sample buffers, come from the Core Media framework.

Representation of Assets

AVAsset is the core class in the AV Foundation framework. It provides a format-independent abstraction of time-based audiovisual data, such as a movie file or a video stream. In many cases, you work with one of its subclasses: you use the composition subclasses when you create new assets (see “Editing”), and you use AVURLAsset to create a new asset instance from media at a given URL (including assets from the MPMedia framework or the Assets Library framework; see “Using Assets”).

image: ../Art/avassetHierarchy.jpg

An asset contains a collection of tracks that are intended to be presented or processed together, each of a uniform media type, including (but not limited to) audio, video, text, closed captions, and subtitles. The asset object provides information about the whole resource, such as its duration or title, as well as hints for presentation, such as its natural size. Assets may also have metadata, represented by instances of AVMetadataItem.

A track is represented by an instance of AVAssetTrack. In a typical simple case, one track represents the audio component and another represents the video component; in a complex composition, there may be multiple overlapping tracks of audio and video.

image: ../Art/avassetAndTracks.jpg

A track has a number of properties, such as its type (video or audio), visual and/or audible characteristics (as appropriate), metadata, and timeline (expressed in terms of its parent asset). A track also has an array of format descriptions. The array contains CMFormatDescriptions (see CMFormatDescriptionRef), each of which describes the format of media samples referenced by the track. A track that contains uniform media (for example, all encoded using the same settings) provides an array with a count of 1.

A track may itself be divided into segments, represented by instances of AVAssetTrackSegment. A segment is a time mapping from the source to the asset track timeline.

Representations of Time

Time in AV Foundation is represented by primitive structures from the Core Media framework.

CMTime Represents a Length of Time

CMTime is a C structure that represents time as a rational number, with a numerator (an int64_t value) and a denominator (an int32_t timescale). Conceptually, the timescale specifies the fraction of a second each unit in the numerator occupies. Thus if the timescale is 4, each unit represents a quarter of a second; if the timescale is 10, each unit represents a tenth of a second, and so on. You frequently use a timescale of 600, because it is a common multiple of several commonly used frame rates: 24 frames per second (fps) for film, 30 fps for NTSC (used for TV in North America and Japan), and 25 fps for PAL (used for TV in Europe). Using a timescale of 600, you can exactly represent any number of frames in these systems.
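The arithmetic behind the choice of 600 can be sketched in plain C. This is only an illustration of the rational representation described above: RationalTime and frame_start are hypothetical names introduced here, not Core Media types or functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a value/timescale pair; not the Core Media CMTime type. */
typedef struct {
    int64_t value;      /* numerator: number of timescale units */
    int32_t timescale;  /* denominator: units per second */
} RationalTime;

/* Returns the time at which frame `frameIndex` starts. If the frame rate
   does not divide the timescale evenly, frame boundaries cannot be
   represented exactly, and a time with timescale 0 is returned instead. */
static RationalTime frame_start(int64_t frameIndex, int32_t fps, int32_t timescale) {
    RationalTime t = { 0, 0 };
    if (fps <= 0 || timescale <= 0 || timescale % fps != 0) {
        return t;
    }
    t.value = frameIndex * (timescale / fps);
    t.timescale = timescale;
    return t;
}
```

With a timescale of 600, one frame occupies 25 units at 24 fps, 24 units at 25 fps, and 20 units at 30 fps, so every frame boundary falls on a whole number of units.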

In addition to a simple time value, a CMTime can represent non-numeric values: +infinity, -infinity, and indefinite. It can also indicate whether the time has been rounded at some point, and it maintains an epoch number.

Using CMTime

You create a time using CMTimeMake, or one of the related functions such as CMTimeMakeWithSeconds (which allows you to create a time using a float value and specify a preferred timescale). There are several functions for performing time-based arithmetic and for comparing times, as illustrated in the following example.

CMTime time1 = CMTimeMake(200, 2); // 200 half-seconds
CMTime time2 = CMTimeMake(400, 4); // 400 quarter-seconds
 
// time1 and time2 both represent 100 seconds, but using different timescales.
if (CMTimeCompare(time1, time2) == 0) {
    NSLog(@"time1 and time2 are the same");
}
 
Float64 float64Seconds = 200.0 / 3;
CMTime time3 = CMTimeMakeWithSeconds(float64Seconds, 3); // 66.66... seconds, using a timescale of 3 (thirds of a second)
time3 = CMTimeMultiply(time3, 3);
// time3 now represents 200 seconds; next subtract time1 (100 seconds).
time3 = CMTimeSubtract(time3, time1);
CMTimeShow(time3);
 
if (CMTIME_COMPARE_INLINE(time2, ==, time3)) {
    NSLog(@"time2 and time3 are the same");
}
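
To see why two times with different timescales can compare as equal, it may help to look at how rational numbers are compared in general. The following plain C sketch uses cross-multiplication; RationalTime and rational_compare are hypothetical names, and this is not a claim about how CMTimeCompare is implemented. It assumes positive timescales and products that fit in an int64_t.

```c
#include <stdint.h>

/* Hypothetical stand-in for a value/timescale pair; not the Core Media CMTime type. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

/* Returns -1, 0, or 1 when a is less than, equal to, or greater than b.
   Compares a.value / a.timescale with b.value / b.timescale without
   dividing, by cross-multiplying. Assumes both timescales are positive
   and that the products do not overflow. */
static int rational_compare(RationalTime a, RationalTime b) {
    int64_t lhs = a.value * (int64_t)b.timescale;
    int64_t rhs = b.value * (int64_t)a.timescale;
    return (lhs > rhs) - (lhs < rhs);
}
```

For 200/2 and 400/4, both cross-products are 800, so the two times are equal even though their fields differ.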

For a list of all the available functions, see CMTime Reference.

Special Values of CMTime

Core Media provides constants for special values: kCMTimeZero, kCMTimeInvalid, kCMTimePositiveInfinity, and kCMTimeNegativeInfinity. There are many ways, though, in which a CMTime can, for example, represent a time that is invalid. To test whether a CMTime is valid, or is a non-numeric value, use an appropriate macro, such as CMTIME_IS_INVALID, CMTIME_IS_POSITIVE_INFINITY, or CMTIME_IS_INDEFINITE.

CMTime myTime = <#Get a CMTime#>;
if (CMTIME_IS_INVALID(myTime)) {
    // Perhaps treat this as an error; display a suitable alert to the user.
}

You should not compare the value of an arbitrary CMTime with kCMTimeInvalid.
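
The reason is easier to see with a simplified model. The sketch below assumes that validity is carried by a flag bit, so that any combination of the other fields without that bit counts as invalid. FlaggedTime, TIME_FLAG_VALID, and both functions are hypothetical names used only for illustration; they are not Core Media declarations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity bit, for illustration only. */
enum { TIME_FLAG_VALID = 1 << 0 };

/* Hypothetical simplified time structure; not the Core Media CMTime type. */
typedef struct {
    int64_t value;
    int32_t timescale;
    uint32_t flags;
} FlaggedTime;

/* A time is invalid whenever the validity bit is clear,
   whatever the other fields contain. */
static bool flagged_time_is_invalid(FlaggedTime t) {
    return (t.flags & TIME_FLAG_VALID) == 0;
}

/* Field-by-field equality: the kind of comparison against a single
   "invalid" constant that the text advises against. */
static bool flagged_time_fields_equal(FlaggedTime a, FlaggedTime b) {
    return a.value == b.value && a.timescale == b.timescale && a.flags == b.flags;
}
```

In this model, { 0, 0, 0 } and { 42, 600, 0 } are both invalid, yet they are not field-by-field equal, so an equality check against one canonical invalid constant would miss the second.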

Representing a CMTime as an Object

If you need to use CMTimes in annotations or Core Foundation containers, you can convert a CMTime to and from a CFDictionary (see CFDictionaryRef) using CMTimeCopyAsDictionary and CMTimeMakeFromDictionary respectively. You can also get a string representation of a CMTime using CMTimeCopyDescription.

Epochs

The epoch number of a CMTime is usually set to 0, but you can use it to distinguish unrelated timelines. For example, the epoch could be incremented each cycle through a presentation loop, to differentiate between time N in loop 0 and time N in loop 1.

CMTimeRange Represents a Time Range

CMTimeRange is a C structure that has a start time and duration, both expressed as CMTimes. A time range does not include the time that is the start time plus the duration.

You create a time range using CMTimeRangeMake or CMTimeRangeFromTimeToTime. There are constraints on the value of the CMTimes’ epochs:

  • CMTimeRanges cannot span different epochs.

  • The epoch in a CMTime that represents a timestamp may be non-zero, but you can only perform range operations (such as CMTimeRangeGetUnion) on ranges whose start fields have the same epoch.

  • The epoch in a CMTime that represents a duration should always be 0, and the value must be non-negative.

Working with Time Ranges

Core Media provides functions you can use to determine whether a time range contains a given time or another time range (CMTimeRangeContainsTime and CMTimeRangeContainsTimeRange), to determine whether two time ranges are equal (CMTimeRangeEqual), and to calculate unions and intersections of time ranges (for example, CMTimeRangeGetUnion).
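
As a rough picture of what union and intersection mean for ranges, the following plain C sketch models a range as a start and a duration counted in units of one shared timescale. TickRange and its functions are hypothetical names for illustration; the sketch makes its own choice for non-overlapping inputs (a zero-duration result) and does not describe the exact return values of the Core Media functions.

```c
#include <stdint.h>

/* Hypothetical simplified range: start and duration in units of one shared timescale. */
typedef struct {
    int64_t start;
    int64_t duration;  /* assumed non-negative */
} TickRange;

static int64_t min64(int64_t a, int64_t b) { return a < b ? a : b; }
static int64_t max64(int64_t a, int64_t b) { return a > b ? a : b; }

/* Smallest single range that covers both inputs, including any gap between them. */
static TickRange tick_range_union(TickRange a, TickRange b) {
    int64_t start = min64(a.start, b.start);
    int64_t end = max64(a.start + a.duration, b.start + b.duration);
    TickRange r = { start, end - start };
    return r;
}

/* Part of the timeline covered by both inputs; the duration is 0 when they do not overlap. */
static TickRange tick_range_intersection(TickRange a, TickRange b) {
    int64_t start = max64(a.start, b.start);
    int64_t end = min64(a.start + a.duration, b.start + b.duration);
    TickRange r = { start, end > start ? end - start : 0 };
    return r;
}
```

For the ranges starting at 0 and 5, each with a duration of 10, the union starts at 0 with a duration of 15 and the intersection starts at 5 with a duration of 5.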

Given that a time range does not include the time that is the start time plus the duration, the following expression always evaluates to false:

CMTimeRangeContainsTime(range, CMTimeRangeGetEnd(range))
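
The same half-open behavior can be sketched in plain C with a simplified model in which times are whole counts of one shared timescale. TickRange, tick_range_end, and tick_range_contains are hypothetical names for illustration, not Core Media declarations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified range: start and duration in units of one shared timescale. */
typedef struct {
    int64_t start;
    int64_t duration;  /* assumed non-negative */
} TickRange;

/* First time that is no longer part of the range. */
static int64_t tick_range_end(TickRange r) {
    return r.start + r.duration;
}

/* Half-open containment: the start is included, the end is not. */
static bool tick_range_contains(TickRange r, int64_t t) {
    return t >= r.start && t < tick_range_end(r);
}
```

A consequence of this convention is that two back-to-back ranges never both contain the time at which they meet, and a zero-duration range contains no times at all.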

For a list of all the available functions, see CMTimeRange Reference.

Special Values of CMTimeRange

Core Media provides constants for a zero-length range and an invalid range, kCMTimeRangeZero and kCMTimeRangeInvalid respectively. There are many ways, though, in which a CMTimeRange can be invalid, zero, or indefinite (if one of the CMTimes is indefinite). To test whether a CMTimeRange is valid, zero, or indefinite, use an appropriate macro: CMTIMERANGE_IS_VALID, CMTIMERANGE_IS_INVALID, CMTIMERANGE_IS_EMPTY, or CMTIMERANGE_IS_INDEFINITE.

CMTimeRange myTimeRange = <#Get a CMTimeRange#>;
if (CMTIMERANGE_IS_EMPTY(myTimeRange)) {
    // The time range is empty (its duration is zero).
}

You should not compare the value of an arbitrary CMTimeRange with kCMTimeRangeInvalid.

Representing a CMTimeRange as an Object

If you need to use CMTimeRanges in annotations or Core Foundation containers, you can convert a CMTimeRange to and from a CFDictionary (see CFDictionaryRef) using CMTimeRangeCopyAsDictionary and CMTimeRangeMakeFromDictionary respectively. You can also get a string representation of a CMTimeRange using CMTimeRangeCopyDescription.

Representations of Media

Video data and its associated metadata are represented in AV Foundation by opaque objects from the Core Media framework. Core Media represents video data using CMSampleBuffer (see CMSampleBufferRef). CMSampleBuffer is a Core Foundation-style opaque type; an instance contains the sample buffer for a frame of video data as a Core Video pixel buffer (see CVPixelBufferRef). You access the pixel buffer from a sample buffer using CMSampleBufferGetImageBuffer:

CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(<#A CMSampleBuffer#>);

From the pixel buffer, you can access the actual video data. For an example, see “Converting a CMSampleBuffer to a UIImage.”

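In addition to the video data, you can retrieve a number of other aspects of the video frame, such as its timing information (using CMSampleBufferGetPresentationTimeStamp and CMSampleBufferGetDecodeTimeStamp) and its format description (using CMSampleBufferGetFormatDescription).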

Converting a CMSampleBuffer to a UIImage

The following function shows how you can convert a CMSampleBuffer to a UIImage object. You should consider your requirements carefully before using it. Performing the conversion is a comparatively expensive operation. It is appropriate to, for example, create a still image from a frame of video data taken every second or so. You should not use this as a means to manipulate every frame of video coming from a capture device in real time.

UIImage *imageFromSampleBuffer(CMSampleBufferRef sampleBuffer) {
 
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (imageBuffer == NULL) {
        // The sample buffer does not contain an image buffer.
        return nil;
    }
    // Lock the base address of the pixel buffer.
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
 
    // Get the number of bytes per row for the pixel buffer.
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    // Get the pixel buffer width and height.
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
 
    // Create a device-dependent RGB color space.
    static CGColorSpaceRef colorSpace = NULL;
    if (colorSpace == NULL) {
        colorSpace = CGColorSpaceCreateDeviceRGB();
        if (colorSpace == NULL) {
            // Handle the error appropriately.
            return nil;
        }
    }
 
    // Get the base address of the pixel buffer.
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    // Get the data size for contiguous planes of the pixel buffer.
    size_t bufferSize = CVPixelBufferGetDataSize(imageBuffer);
 
    // Create a Quartz direct-access data provider that uses data we supply.
    CGDataProviderRef dataProvider =
        CGDataProviderCreateWithData(NULL, baseAddress, bufferSize, NULL);
    // Create a bitmap image from data supplied by the data provider.
    // The bitmap layout below assumes a 32-bit BGRA pixel buffer.
    CGImageRef cgImage =
        CGImageCreate(width, height, 8, 32, bytesPerRow,
                        colorSpace, kCGImageAlphaNoneSkipFirst | kCGBitmapByteOrder32Little,
                        dataProvider, NULL, true, kCGRenderingIntentDefault);
    CGDataProviderRelease(dataProvider);
 
    // Create and return an image object to represent the Quartz image.
    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
 
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
 
    return image;
}
