CAF File Overview
This chapter provides background information important for understanding and using Apple’s Core Audio Format (CAF) files.
CAF File Advantages
Apple’s Core Audio Format is a flexible, state-of-the-art file format for storing and manipulating digital audio data. It is fully supported by Core Audio APIs in OS X v10.4 and later and in OS X v10.3 with QuickTime 7 or later. It is supported in iOS starting in iOS 2.0. CAF provides high performance and flexibility and is scalable to future ultra-high resolution audio recording, editing, and playback.
CAF files have several advantages over other standard audio file formats:
Unrestricted file size
Whereas AIFF, AIFF-C, and WAV files are limited in size to 4 gigabytes, which might represent as little as 15 minutes of audio, CAF files use 64-bit file offsets, eliminating practical limits. A standard CAF file can hold audio data with a playback duration of hundreds of years.
Safe and efficient recording
Applications writing AIFF and WAV files must either update the data header’s size field at the end of recording—which can result in an unusable file if recording is interrupted before the header is finalized—or they must update the size field after recording each packet of data, which is inefficient. With CAF files, in contrast, an application can append new audio data to the end of the file in a manner that allows it to determine the amount of data even if the size field in the header has not been finalized.
Support for many data formats
CAF files serve as wrappers for a wide variety of audio data formats. The flexibility of the CAF file structure and the many types of metadata that can be recorded enable CAF files to be used with practically any type of audio data. Furthermore, CAF files can store any number of audio channels.
Support for many types of auxiliary data
In addition to audio data, CAF files can store text annotations, markers, channel layouts, and many other types of information that can help in the interpretation, analysis, or editing of the audio.
Support for data dependencies
Certain metadata in CAF files is linked to the audio data by an edit count value. You can use this value to determine when metadata has a dependency on the audio data and, furthermore, when the audio data has changed since the metadata was written.
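To make the 4-gigabyte ceiling mentioned under “Unrestricted file size” concrete, the following sketch computes how long an uncompressed linear PCM recording can run before it reaches a given byte limit. The function name and the example format (six channels of 32-bit samples at 192 kHz) are assumptions chosen for illustration; they are not taken from the specification.

```python
def max_pcm_duration_seconds(limit_bytes, sample_rate, channels, bytes_per_sample):
    """Return the whole seconds of linear PCM audio that fit in limit_bytes."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return limit_bytes // bytes_per_second


# Hypothetical high-resolution surround format: 6 channels, 32-bit, 192 kHz.
seconds_32bit = max_pcm_duration_seconds(2**32, 192_000, 6, 4)
print(seconds_32bit)        # 932 seconds
print(seconds_32bit / 60)   # roughly 15.5 minutes under a 32-bit size limit
```

Lower-resolution formats last proportionally longer under the same limit, which is why the figure of 15 minutes is a worst case rather than a typical one.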
CAF File Structure
CAF files begin with a file header, which identifies the file type and the CAF version, followed by a series of chunks. A chunk consists of a header, which defines the type of the chunk and indicates the size of its data section, followed by the chunk data. The nature and format of the data is specific to each type of chunk.
The only two chunk types required for every CAF file are the Audio Data chunk (which, as you might have guessed, contains the audio data) and the Audio Description chunk, which specifies the audio data format.
The Audio Description chunk must be the first chunk following the file header. The Audio Data chunk can appear anywhere else in the file, unless the size of its data section has not been determined. In that case, the size field in the Audio Data chunk header is set to -1 and the Audio Data chunk must come last in the file so that the end of the Audio Data chunk is the same as the end of the file. This placement allows you to determine the data section size when that information is not available in the size field.
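The rule above can be sketched as a small calculation. This is a minimal illustration, not a reference implementation; the function and parameter names are hypothetical.

```python
def data_section_size(size_field, data_start_offset, file_size):
    """Return the size in bytes of the Audio Data chunk's data section.

    size_field:        value read from the chunk header (-1 means "not determined")
    data_start_offset: byte offset at which the data section begins
    file_size:         total size of the file in bytes
    """
    if size_field == -1:
        # The chunk is last in the file, so its data runs to the end of the file.
        return file_size - data_start_offset
    return size_field


print(data_section_size(-1, 4096, 1_000_000))    # 995904
print(data_section_size(2048, 4096, 1_000_000))  # 2048
```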
Audio is stored in the Audio Data chunk as a sequential series of packets. An audio packet in a CAF file contains one or more frames of audio data.
CAF supports a wide range of other chunk types, which can be placed in any order in the file except first (reserved for the Audio Description chunk) or last (when the Audio Data chunk size field is set to -1). Some chunk types can be used more than once in a file. Some refer to—or are referred to by—chunks of other types.
Every chunk consists of a chunk header followed by a data section. Chunk headers contain two fields:
A four-character code indicating the chunk’s type
A number indicating the size of the chunk’s data section in bytes
The format of the data in a chunk depends on the chunk type. It consists of a series of sections, typically called fields. The format of the audio data depends on the data type. All of the other fields in a CAF file are in big-endian (network) byte order.
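As an illustration of this layout, the following sketch walks the chunks of a CAF file held in memory. The field widths are assumptions based on the CAF specification as commonly documented: an 8-byte file header (the four-character code 'caff', a 16-bit version, and 16-bit flags) and a 12-byte chunk header (a four-character type code and a signed 64-bit size), all big-endian. The function name and the sample bytes are hypothetical.

```python
import struct

FILE_HEADER = struct.Struct(">4sHH")   # file type, version, flags (assumed layout)
CHUNK_HEADER = struct.Struct(">4sq")   # chunk type, signed 64-bit data section size


def list_chunks(blob):
    """Return (type, data_size) pairs for each chunk in a CAF byte string."""
    file_type, _version, _flags = FILE_HEADER.unpack_from(blob, 0)
    if file_type != b"caff":
        raise ValueError("not a CAF file")
    chunks = []
    offset = FILE_HEADER.size
    while offset + CHUNK_HEADER.size <= len(blob):
        chunk_type, size = CHUNK_HEADER.unpack_from(blob, offset)
        offset += CHUNK_HEADER.size
        if size == -1:
            # Size not determined: the data section extends to the end of the file.
            size = len(blob) - offset
        chunks.append((chunk_type.decode("ascii"), size))
        offset += size
    return chunks


# Build a tiny file in memory: header, a 'desc' chunk with 32 placeholder bytes,
# and a 'data' chunk whose size field was never finalized.
example = (
    FILE_HEADER.pack(b"caff", 1, 0)
    + CHUNK_HEADER.pack(b"desc", 32) + bytes(32)
    + CHUNK_HEADER.pack(b"data", -1) + bytes(100)
)
print(list_chunks(example))  # [('desc', 32), ('data', 100)]
```

The sketch reads only headers and skips over each data section, so it works for any chunk type without needing to know that chunk's internal format.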
Packets, Frames, and Samples
In order to understand this specification, it is important to understand the definitions of the following four terms:
Sample
One number for one channel of digitized audio data.
Frame
A set of samples representing one sample for each channel. The samples in a frame are intended to be played together (that is, simultaneously). Note that this definition might be different from the use of the term “frame” by codecs, video files, and audio or video processing applications.
Packet
The smallest, indivisible block of data. For linear PCM (pulse-code modulated) data, each packet contains exactly one frame. For compressed audio data formats, the number of frames in a packet depends on the encoding. For example, a packet of AAC represents 1024 frames of PCM. In some formats, the number of frames per packet varies.
Sample rate
The number of complete frames of samples per second of noncompressed or decompressed data.
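The relationships among these terms can be sketched as simple arithmetic. The function names are hypothetical, and the first function assumes a format in which every packet holds the same number of frames.

```python
def duration_seconds(num_packets, frames_per_packet, sample_rate):
    """Playback duration when every packet holds the same number of frames."""
    total_frames = num_packets * frames_per_packet
    return total_frames / sample_rate


def pcm_bytes_per_frame(channels, bits_per_sample):
    """A frame holds one sample per channel."""
    return channels * bits_per_sample // 8


# 441 AAC packets of 1024 frames each at a 44.1 kHz sample rate.
print(duration_seconds(441, 1024, 44_100))   # 10.24
# One frame of 16-bit stereo linear PCM.
print(pcm_bytes_per_frame(2, 16))            # 4
```

For linear PCM, where each packet contains exactly one frame, the first function reduces to the frame count divided by the sample rate.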
Types of Chunks
This section briefly introduces the types of chunks defined in the CAF specification. All CAF chunk types are fully described in “Core Audio Format Specification.”
Every CAF file must include the following chunks:
Audio Description chunk, which describes the audio data format for the file. This chunk must follow immediately after the CAF file header. See “Audio Description Chunk.”
Audio Data chunk, containing the audio data for the file. If the data chunk’s size isn’t known, it must be the final chunk in the file. If this chunk’s header specifies the size, the chunk can appear anywhere after the Audio Description chunk. See “Audio Data Chunk.”
If the audio packets vary in size, the file must have a Packet Table chunk, which records the size of each packet. See “Packet Table Chunk.”
There is one chunk that is required for all CAF files with more than two channels:
Channel Layout chunk, which describes the role of each channel in the file. This chunk is optional for one- and two-channel files. See “Channel Layout Chunk.”
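The requirements listed so far can be expressed as a short check over the ordered list of chunk types found in a file. This is a sketch under stated assumptions: the four-character codes ('desc', 'data', 'pakt', 'chan') are those commonly documented for these chunks, and the function name, parameters, and messages are hypothetical.

```python
def check_required_chunks(chunk_types, channels, variable_packet_size, data_size_known):
    """Return a list of rule violations for an ordered list of chunk type codes."""
    problems = []
    if not chunk_types or chunk_types[0] != "desc":
        problems.append("Audio Description chunk must come first")
    if "data" not in chunk_types:
        problems.append("Audio Data chunk is missing")
    elif not data_size_known and chunk_types[-1] != "data":
        problems.append("Audio Data chunk of unknown size must come last")
    if variable_packet_size and "pakt" not in chunk_types:
        problems.append("Packet Table chunk is required for variable-size packets")
    if channels > 2 and "chan" not in chunk_types:
        problems.append("Channel Layout chunk is required for more than two channels")
    return problems


# A stereo file with constant-size packets and a finalized data size passes.
print(check_required_chunks(["desc", "data"], 2, False, True))  # []

# A six-channel file with no Channel Layout chunk and an unfinalized
# Audio Data chunk that is not last reports two problems.
print(check_required_chunks(["desc", "data", "info"], 6, False, False))
```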
Some chunks refer to data in other, supporting chunks:
There are two chunks that you can use to place markers in the data file. These chunks share data types, described in “Marker Data Types”:
There are two chunk types that store musical information:
Support for Editors
Two chunks contain data for use by audio editors:
Overview chunks contain samples of the data useful for displaying the audio at a particular resolution. A CAF file can have any number of these, one for each resolution to be displayed. See “Overview Chunk.”
Peak chunks list the peak amplitude in each channel and specify the frame in which that amplitude occurs. See “Peak Chunk.”
There are two chunk types that hold annotations to the data:
One chunk type can be used to uniquely identify the data:
The optional Unique Material Identifier (UMID) chunk provides a unique identifier for the audio data in a CAF file. There can be at most one UMID chunk in a file. See “Unique Material Identifier Chunk.”
You can define your own chunk type to extend the CAF file specification. There is a chunk type defined for this purpose:
The User-Defined chunk provides a universally unique ID (UUID) for a new chunk type. See “User-Defined Chunk.”
Many chunk types allow you to specify a larger chunk size than is currently needed for data in order to reserve additional space. There is also a special chunk you can use to reserve extra space in the CAF file as a whole:
The Free chunk contains no data, but reserves space that you can use later. See “Free Chunk.”
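As a rough sketch of reserving space, the following function builds the bytes of a Free chunk. It assumes the four-character code 'free' and the 12-byte big-endian chunk header (type code followed by a signed 64-bit size); the function name is hypothetical, and the padding is written as zeros only for simplicity.

```python
import struct


def make_free_chunk(reserved_bytes):
    """Return a Free chunk that reserves reserved_bytes of space."""
    header = struct.pack(">4sq", b"free", reserved_bytes)
    return header + bytes(reserved_bytes)  # zero-filled padding


chunk = make_free_chunk(256)
print(len(chunk))   # 268: a 12-byte header plus 256 reserved bytes
print(chunk[:4])    # b'free'
```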