CAF File Overview

This chapter provides background information important for understanding and using Apple’s Core Audio Format (CAF) files.

CAF File Advantages

Apple’s Core Audio Format is a flexible, state-of-the-art file format for storing and manipulating digital audio data. It is fully supported by Core Audio APIs in OS X v10.4 and later and in OS X v10.3 with QuickTime 7 or later. It is supported in iOS starting in iOS 5.0. CAF provides high performance and flexibility and is scalable to future ultra-high resolution audio recording, editing, and playback.

CAF files have several advantages over other standard audio file formats:

CAF File Structure

CAF files begin with a file header, which identifies the file type and the CAF version, followed by a series of chunks. A chunk consists of a header, which defines the type of the chunk and indicates the size of its data section, followed by the chunk data. The nature and format of the data is specific to each type of chunk.
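
As an illustration of the file header described above, the following is a minimal Python sketch that reads it. The field layout used here (a four-character file type `caff`, followed by a 16-bit version and 16-bit flags, all big-endian) is an assumption taken from the full CAF specification rather than from this overview, and the function name is hypothetical.

```python
import struct

# Assumed layout: 4-byte file type, 16-bit version, 16-bit flags (big-endian).
CAF_FILE_HEADER = struct.Struct(">4sHH")


def parse_caf_file_header(data: bytes):
    """Return (version, flags) read from the first 8 bytes of a CAF file."""
    if len(data) < CAF_FILE_HEADER.size:
        raise ValueError("too short to contain a CAF file header")
    file_type, version, flags = CAF_FILE_HEADER.unpack_from(data, 0)
    if file_type != b"caff":
        raise ValueError(f"not a CAF file: {file_type!r}")
    return version, flags


# Usage with a hand-built header (version 1, no flags).
header = b"caff" + struct.pack(">HH", 1, 0)
print(parse_caf_file_header(header))  # prints (1, 0)
```

The first chunk header would start immediately after these 8 bytes.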

The only two chunk types required for every CAF file are the Audio Data chunk (which, as you might have guessed, contains the audio data) and the Audio Description chunk, which specifies the audio data format.

The Audio Description chunk must be the first chunk following the file header. The Audio Data chunk can appear anywhere else in the file, unless the size of its data section has not been determined. In that case, the size field in the Audio Data chunk header is set to -1 and the Audio Data chunk must come last in the file, so that the end of the Audio Data chunk coincides with the end of the file. This placement lets a reader compute the data section size from the file length when the size field does not supply it.
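
The size rule above amounts to a single subtraction. Here is a small sketch of it, assuming the caller already knows the byte offset at which the Audio Data chunk's data section begins; the function and parameter names are hypothetical.

```python
def resolve_data_section_size(size_field: int, data_offset: int, file_size: int) -> int:
    """Return the size in bytes of the Audio Data chunk's data section.

    size_field  -- value read from the chunk header (-1 means "not determined")
    data_offset -- byte offset where the data section starts
    file_size   -- total length of the file in bytes
    """
    if size_field != -1:
        return size_field
    if data_offset > file_size:
        raise ValueError("data section starts beyond the end of the file")
    # The chunk is last in the file, so its data runs to the end of the file.
    return file_size - data_offset


print(resolve_data_section_size(-1, 100, 4196))   # prints 4096
print(resolve_data_section_size(512, 100, 4196))  # prints 512
```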

Audio is stored in the Audio Data chunk as a sequential series of packets. An audio packet in a CAF file contains one or more frames of audio data.

CAF supports a wide range of other chunk types, which can be placed in any order in the file except first (reserved for the Audio Description chunk) or last (when the Audio Data chunk size field is set to -1). Some chunk types can be used more than once in a file. Some refer to—or are referred to by—chunks of other types.

Chunk Structure

Every chunk consists of a chunk header followed by a data section. Chunk headers contain two fields:

  • A four-character code indicating the chunk’s type

  • A number indicating the chunk size in bytes

The format of the data in a chunk depends on the chunk type. The data section consists of a series of sections, typically called fields. The format of the audio data itself, including its byte order, depends on the audio data format. All other fields in a CAF file are in big-endian (network) byte order.
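
To make the chunk header concrete, the sketch below walks the chunks in an in-memory buffer. It assumes the size field is a signed 64-bit integer (which is consistent with the -1 sentinel mentioned earlier, but the exact width comes from the full specification, not this overview) and that the file header occupies the first 8 bytes. The name `iter_chunks` is hypothetical.

```python
import struct

# Assumed chunk header: four-character code, then a signed 64-bit size (big-endian).
CHUNK_HEADER = struct.Struct(">4sq")


def iter_chunks(buf: bytes, offset: int = 8):
    """Yield (chunk_type, data_offset, data_size) for each chunk in buf.

    offset is where the first chunk header starts, assumed to be
    right after an 8-byte file header.
    """
    while offset + CHUNK_HEADER.size <= len(buf):
        chunk_type, size = CHUNK_HEADER.unpack_from(buf, offset)
        data_offset = offset + CHUNK_HEADER.size
        if size == -1:
            # Undetermined size: the data section runs to the end of the file.
            size = len(buf) - data_offset
        elif size < 0 or data_offset + size > len(buf):
            raise ValueError(f"bad size {size} in chunk {chunk_type!r}")
        yield chunk_type, data_offset, size
        offset = data_offset + size


# Usage: a synthetic file with a 32-byte first chunk and an open-ended data chunk.
sample = (
    b"caff" + struct.pack(">HH", 1, 0)
    + CHUNK_HEADER.pack(b"desc", 32) + bytes(32)
    + CHUNK_HEADER.pack(b"data", -1) + bytes(10)
)
for chunk in iter_chunks(sample):
    print(chunk)
```

Because every header states the size of its data section, a reader can skip chunk types it does not understand without interpreting their contents.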

Packets, Frames, and Samples

To understand this specification, you need the definitions of the following four terms:

  • Sample

    One number for one channel of digitized audio data.

  • Frame

    A set of samples representing one sample for each channel. The samples in a frame are intended to be played together (that is, simultaneously). Note that this definition might be different from the use of the term “frame” by codecs, video files, and audio or video processing applications.

  • Packet

    The smallest indivisible block of audio data. For linear PCM (pulse-code modulation) data, each packet contains exactly one frame. For compressed audio data formats, the number of frames in a packet depends on the encoding. For example, a packet of AAC represents 1024 frames of PCM. In some formats, the number of frames per packet varies.

  • Sample rate

    The number of complete frames of samples per second of noncompressed or decompressed data.
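
The relationships among these four terms can be checked with a little arithmetic. The sketch below uses hypothetical helper names and assumes a constant number of frames per packet; it ignores any priming or trailing frames that a codec might add.

```python
def total_frames(num_packets: int, frames_per_packet: int) -> int:
    """Frames represented by a run of packets with a fixed frames-per-packet count."""
    return num_packets * frames_per_packet


def total_samples(num_frames: int, channels: int) -> int:
    """Each frame holds one sample per channel."""
    return num_frames * channels


def duration_seconds(num_frames: int, sample_rate: float) -> float:
    """The sample rate is the number of frames per second."""
    return num_frames / sample_rate


# One second of stereo linear PCM at 44,100 Hz: one frame per packet.
pcm_frames = total_frames(44100, 1)
print(pcm_frames, total_samples(pcm_frames, 2), duration_seconds(pcm_frames, 44100))
# prints 44100 88200 1.0

# 375 AAC packets at 48,000 Hz: 1024 frames per packet.
aac_frames = total_frames(375, 1024)
print(aac_frames, duration_seconds(aac_frames, 48000))
# prints 384000 8.0
```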

Types of Chunks

This section briefly introduces the types of chunks defined in the CAF specification. All CAF chunk types are fully described in Core Audio Format Specification.

Required

Every CAF file must include the following chunks:

  • Audio Description chunk, which describes the audio data format for the file. This chunk must follow immediately after the CAF file header. See Audio Description Chunk.

  • Audio Data chunk, containing the audio data for the file. If the data chunk’s size isn’t known, it must be the final chunk in the file. If this chunk’s header specifies the size, the chunk can appear anywhere after the Audio Description chunk. See Audio Data Chunk.

  • Packet Table chunk, which records the size of each packet. This chunk is required only if the audio packets vary in size. See Packet Table Chunk.

Channel Layout

There is one chunk that is required for all CAF files with more than two channels:

  • Channel Layout chunk, which describes the role of each channel in the file. This chunk is optional for one- and two-channel files. See Channel Layout Chunk.
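
The placement and presence rules for the Audio Description, Audio Data, Packet Table, and Channel Layout chunks can be expressed as a simple check. The following is a sketch only: the four-character codes `desc`, `data`, `pakt`, and `chan` are assumptions taken from the full CAF specification, and the function name and parameters are hypothetical.

```python
def check_chunk_layout(chunk_types, data_size_known=True, channels=2,
                       variable_packet_size=False):
    """Return a list of rule violations for an ordered list of chunk type codes."""
    problems = []
    if not chunk_types or chunk_types[0] != "desc":
        problems.append("Audio Description chunk must be the first chunk")
    if "data" not in chunk_types:
        problems.append("Audio Data chunk is missing")
    elif not data_size_known and chunk_types[-1] != "data":
        problems.append("Audio Data chunk of unknown size must be the last chunk")
    if variable_packet_size and "pakt" not in chunk_types:
        problems.append("Packet Table chunk is required for variable-size packets")
    if channels > 2 and "chan" not in chunk_types:
        problems.append("Channel Layout chunk is required for more than two channels")
    return problems


print(check_chunk_layout(["desc", "chan", "data"], channels=6))  # prints []
print(check_chunk_layout(["desc", "data"], channels=6))
```

Returning every violation found, rather than stopping at the first, is a design choice that suits diagnostic tools; a strict reader might raise an error instead.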

Supplementary Data

Some chunks refer to data in other, supporting chunks:

  • Some compressed audio data formats require additional codec-specific data in order to decode the audio data. If the audio format requires this data, the file must have a Magic Cookie chunk. See Magic Cookie Chunk.

  • Some chunks refer to text strings held in the Strings chunk. See Strings Chunk.

Markers

There are two chunks that you can use to place markers in the data file. These chunks share data types, described in Marker Data Types:

  • Marker chunks hold individual markers. See Marker Chunk.

  • Region chunks delineate segments of the audio data. See Region Chunk.

Music Metadata

There are two chunk types that store musical information:

  • Instrument chunks describe aspects of the audio data needed when the audio is used by a sampler or played as an instrument. See Instrument Chunk.

  • MIDI chunks store all of the information in a standard MIDI file. See MIDI Chunk.

Support For Editors

Two chunks contain data for use by audio editors:

  • Overview chunks contain samples of the data useful for displaying the audio at a particular resolution. A CAF file can have any number of these, one for each resolution to be displayed. See Overview Chunk.

  • Peak chunks list the peak amplitude in each channel and specify the frame in which that amplitude occurs. See Peak Chunk.

Annotations

There are two chunk types that hold annotations to the data:

  • Edit Comments chunks hold time-stamped comments added when the data is edited. See Edit Comments Chunk.

  • The Information chunk contains text strings that provide information about the audio data, such as key signature, artist, and title. See Information Chunk.

Identifier

One chunk type can be used to uniquely identify the data:

  • The optional Unique Material Identifier (UMID) chunk provides a unique identifier for the audio data in a CAF file. There can be at most one UMID chunk in a file. See Unique Material Identifier Chunk.

Extending CAF

You can define your own chunk type to extend the CAF file specification. There is a chunk type defined for this purpose:

  • The User-Defined chunk provides a universally unique ID (UUID) for a new chunk type. See User-Defined Chunk.

Extra Space

Many chunk types allow you to specify a larger chunk size than is currently needed for data in order to reserve additional space. There is also a special chunk you can use to reserve extra space in the CAF file as a whole:

  • The Free chunk contains no data, but reserves space that you can use later. See Free Chunk.
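
As a rough illustration of reserving space, the sketch below builds the bytes of a Free chunk. It assumes the four-character code `free` and a signed 64-bit big-endian size field, both taken from the full CAF specification rather than this overview, and it zero-fills the reserved area as an arbitrary choice. The function name is hypothetical.

```python
import struct


def build_free_chunk(reserved_bytes: int) -> bytes:
    """Return a chunk header plus reserved_bytes of zero padding."""
    if reserved_bytes < 0:
        raise ValueError("reserved_bytes must not be negative")
    # Assumed header: type code 'free', then the size of the reserved area.
    header = struct.pack(">4sq", b"free", reserved_bytes)
    return header + bytes(reserved_bytes)


chunk = build_free_chunk(16)
print(len(chunk))  # prints 28 (12-byte header + 16 reserved bytes)
```

A writer could later overwrite part of this region with a real chunk, provided the sizes are adjusted so that the chunk headers still account for every byte.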