Mapping Files Into Memory

File mapping is the process of mapping the disk sectors of a file into the virtual memory space of a process. Once mapped, your app accesses the file as if it were entirely resident in memory. As you read data from the mapped file pointer, the kernel pages in the appropriate data and returns it to your app.

Although mapping files can offer tremendous performance advantages, it’s not appropriate in all cases. The following sections explain when file mapping can help you and how you go about doing it in your code.

Choosing Whether to Map Files

The goal is to reduce transfers between disk and memory. File mapping can help you in some cases, but not all. The more of a file you map into memory, the less useful file mapping becomes.

Before you map any files into memory, make sure you understand your typical file usage patterns. Use Instruments to identify where your application accesses files and how long those operations take.

When to Map Files

File mapping is effective when:

When randomly accessing a very large file, it’s often a better idea to map only a small portion of the file at a time. The problem with mapping large files is that the file consumes active memory. If the file is large enough, the system might be forced to page out other portions of memory to load your file. Mapping more than one file into memory just compounds this problem.
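Mapping only a portion of a file could look something like the following sketch. The FileWindow type and the MapFileWindow and UnmapFileWindow helpers are hypothetical names introduced for illustration; they are not part of Listing 1-1. The sketch assumes a POSIX system, where mmap requires the file offset to be a multiple of the page size, and it assumes the caller has already checked that the requested range lies within the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper type (an assumption, not from the original listing).
typedef struct {
    void *base;                 // page-aligned address returned by mmap
    size_t mapLength;           // length passed to mmap and later to munmap
    const unsigned char *data;  // address of the first requested byte
} FileWindow;

// Map `length` bytes of the open file `fd`, starting at byte `offset`.
// Returns zero on success or an errno value on failure.
// The caller must make sure offset + length does not exceed the file size;
// touching mapped pages that lie wholly past the end of the file can raise SIGBUS.
int MapFileWindow(int fd, off_t offset, size_t length, FileWindow *outWindow)
{
    long pageSize = sysconf(_SC_PAGESIZE);
    if (pageSize <= 0 || offset < 0 || length == 0)
        return EINVAL;

    // Round the offset down to a page boundary and remember how far
    // into that page the requested data begins.
    off_t alignedOffset = offset - (offset % pageSize);
    size_t delta = (size_t)(offset - alignedOffset);

    void *base = mmap(NULL, length + delta, PROT_READ, MAP_PRIVATE,
                      fd, alignedOffset);
    if (base == MAP_FAILED)
        return errno;

    outWindow->base = base;
    outWindow->mapLength = length + delta;
    outWindow->data = (const unsigned char *)base + delta;
    return 0;
}

// Release a window created by MapFileWindow.
void UnmapFileWindow(FileWindow *window)
{
    munmap(window->base, window->mapLength);
    window->base = NULL;
    window->mapLength = 0;
    window->data = NULL;
}
```

With this approach, an app that needs a different region of the file unmaps the current window and maps a new one, so only one window's worth of the file occupies memory at a time.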

When Not to Map Files

Don’t use file mapping when:

If you map files on a removable or network drive and that drive is unmounted, or disappears for another reason, accessing the mapped memory can cause a bus error and crash your program.
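For files on such volumes, one alternative is to read the contents into an ordinary heap buffer, so that a disappearing drive surfaces as an error return from read rather than as a signal. The following is a minimal sketch under that assumption; ReadFileIntoBuffer is a hypothetical helper name, not a system function, and the approach only makes sense for files small enough to hold in memory.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read an entire file into a malloc'ed buffer.
// Returns zero on success or an errno value on failure.
// On success the caller owns *outData and releases it with free().
int ReadFileIntoBuffer(const char *path, void **outData, size_t *outLength)
{
    *outData = NULL;
    *outLength = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    struct stat info;
    if (fstat(fd, &info) != 0) {
        int err = errno;
        close(fd);
        return err;
    }

    size_t size = (size_t)info.st_size;
    // Allocate at least one byte so an empty file still yields a valid pointer.
    char *buffer = malloc(size > 0 ? size : 1);
    if (buffer == NULL) {
        close(fd);
        return ENOMEM;
    }

    size_t total = 0;
    while (total < size) {
        ssize_t n = read(fd, buffer + total, size - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       // interrupted by a signal; retry
            int err = errno;    // for example, an I/O error from the volume
            free(buffer);
            close(fd);
            return err;
        }
        if (n == 0)
            break;              // file is shorter than fstat reported
        total += (size_t)n;
    }

    close(fd);
    *outData = buffer;
    *outLength = total;
    return 0;
}
```

The trade-off is that the whole file is copied into memory up front, which gives up the on-demand paging that makes mapping attractive in the first place.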

File Mapping Example

Listing 1-1 demonstrates memory mapping using the BSD-level mmap and munmap functions. The mapped file occupies a system-determined portion of the application’s virtual address space until munmap is used to unmap the file. For more information about these functions, see the mmap and munmap man pages.

Listing 1-1  Mapping a file into virtual memory

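#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
 
// Forward declaration; MapFile is defined after ProcessFile.
int MapFile( char * inPathName, void ** outDataPtr, size_t * outDataLength );
 
void ProcessFile( char * inPathName )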
{
    size_t dataLength;
    void * dataPtr;
 
    if( MapFile( inPathName, &dataPtr, &dataLength ) == 0 )
    {
        //
        // process the data and unmap the file
        //
 
        // . . .
 
        munmap( dataPtr, dataLength );
    }
}
 
 
// MapFile
// Return the contents of the specified file as a read-only pointer.
//
// Enter:   inPathName      a UNIX-style "/"-delimited pathname
//
// Exit:    outDataPtr      a pointer to the mapped memory region
//          outDataLength   size of the mapped memory region
//          return value    an errno value on error (see sys/errno.h)
//                          or zero for success
//
int MapFile( char * inPathName, void ** outDataPtr, size_t * outDataLength )
{
    int outError;
    int fileDescriptor;
    struct stat statInfo;
 
    // Return safe values on error.
    outError = 0;
    *outDataPtr = NULL;
    *outDataLength = 0;
 
    // Open the file.
    fileDescriptor = open( inPathName, O_RDONLY, 0 );
    if( fileDescriptor < 0 )
    {
       outError = errno;
    }
    else
    {
        // We now know the file exists. Retrieve the file size.
        if( fstat( fileDescriptor, &statInfo ) != 0 )
        {
            outError = errno;
        }
        else
        {
            // Map the file into a read-only memory region. POSIX requires
            // either MAP_SHARED or MAP_PRIVATE in the flags argument.
            *outDataPtr = mmap(NULL,
                                statInfo.st_size,
                                PROT_READ,
                                MAP_PRIVATE,
                                fileDescriptor,
                                0);
            if( *outDataPtr == MAP_FAILED )
            {
                // Save errno, then restore the safe value in place of MAP_FAILED.
                outError = errno;
                *outDataPtr = NULL;
            }
            else
            {
                // On success, return the size of the mapped file.
                *outDataLength = statInfo.st_size;
            }
        }
 
        // Now close the file. An established mapping stays valid after
        // the descriptor is closed, so it is no longer needed.
        close( fileDescriptor );
    }
 
    return outError;
}
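
Once a file is mapped, its contents can be scanned like any other byte array. The following self-contained sketch illustrates that idea by counting newline characters in a file; CountNewlines is a hypothetical function written for this example and is not part of Listing 1-1. It assumes a POSIX system and checks for an empty file first, because mmap rejects a zero-length request.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical example: count the '\n' bytes in a file by scanning its mapping.
// Returns zero on success or an errno value; the count is stored in *outCount.
int CountNewlines(const char *path, size_t *outCount)
{
    *outCount = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    struct stat info;
    if (fstat(fd, &info) != 0) {
        int err = errno;
        close(fd);
        return err;
    }

    // An empty file has nothing to map; report zero newlines.
    if (info.st_size == 0) {
        close(fd);
        return 0;
    }

    size_t length = (size_t)info.st_size;
    void *mapped = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    int err = (mapped == MAP_FAILED) ? errno : 0;

    // The descriptor is not needed once the mapping exists (or has failed).
    close(fd);
    if (err != 0)
        return err;

    // Treat the mapping as a plain byte array; pages load on first access.
    const char *p = mapped;
    const char *end = p + length;
    while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        (*outCount)++;
        p++;
    }

    munmap(mapped, length);
    return 0;
}
```

A caller would pass a path and read the result from the output parameter, for example CountNewlines(path, &count), treating any nonzero return value as an errno code.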