Best Practices for Working with Texture Data

Textures add realism to OpenGL objects. They help objects defined by vertex data take on the material properties of real-world objects, such as wood, brick, metal, and fur. Texture data can originate from many sources, including images.

Many of the same techniques your application uses on vertex data can also be used to improve texture performance.

Figure 11-1  Textures add realism to a scene
Textures add realism to a scene

Textures start as pixel data that flows through an OpenGL program, as shown in Figure 11-2.

Figure 11-2  Texture data path
Texture data path

The precise route that texture data takes from your application to its final destination can impact the performance of your application. The purpose of this chapter is to provide techniques you can use to ensure optimal processing of texture data in your application.

Using Extensions to Improve Texture Performance

Without any optimizations, texture data flows through an OpenGL program as shown in Figure 11-3. Data from your application first goes to the OpenGL framework, which may make a copy of the data before handing it to the driver. If your data is not in a native format for the hardware (see Optimal Data Formats and Types), the driver may also make a copy of the data to convert it to a hardware-specific format for uploading to video memory. Video memory, in turn, can keep a copy of the data. Theoretically, there could be four copies of your texture data throughout the system.

Figure 11-3  Data copies in an OpenGL program
Data copies in an OpenGL program

Data flows at different rates through the system, as shown by the size of the arrows in Figure 11-3. The fastest data transfer happens between VRAM and the GPU. The slowest transfer occurs between the OpenGL driver and VRAM. Data moves between the application and the OpenGL framework, and between the framework and the driver at the same "medium" rate. Eliminating any of the data transfers, but the slowest one in particular, will improve application performance.

There are several extensions you can use to eliminate one or more of these data copies and to control how texture data travels from your application to the GPU: pixel buffer objects, Apple client storage (APPLE_client_storage), and the Apple texture range extension (APPLE_texture_range) used together with rectangle textures (ARB_texture_rectangle).

The sections that follow describe these extensions and show how to use them.

Pixel Buffer Objects

Pixel buffer objects are a core feature of OpenGL 2.1 and also available through the GL_ARB_pixel_buffer_object extension. The procedure for setting up a pixel buffer object is almost identical to that of vertex buffer objects.

Using Pixel Buffer Objects to Efficiently Load Textures

  1. Call the function glGenBuffers to create a new name for a buffer object.

    void glGenBuffers(GLsizei n, GLuint *buffers);

    n is the number of buffers you wish to create identifiers for.

    buffers specifies a pointer to memory to store the buffer names.

  2. Call the function glBindBuffer to bind an unused name to a buffer object. After this call, the newly created buffer object is initialized with a memory buffer of size zero and a default state. (For the default setting, see the OpenGL specification for ARB_vertex_buffer_object.)

    void glBindBuffer(GLenum target, GLuint buffer);

    target should be set to GL_PIXEL_UNPACK_BUFFER to use the buffer as the source of pixel data.

    buffer specifies the unique name for the buffer object.

  3. Create and initialize the data store of the buffer object by calling the function glBufferData. Essentially, this call uploads your data to the GPU.

    void glBufferData(GLenum target, GLsizeiptr size,
                const GLvoid *data, GLenum usage);

    target must be set to GL_PIXEL_UNPACK_BUFFER.

    size specifies the size of the data store.

    *data points to the source data. If this is not NULL, the source data is copied to the data store of the buffer object. If NULL, the contents of the data store are undefined.

    usage is a constant that provides a hint as to how your application plans to use the data store. For more details on buffer hints, see Buffer Usage Hints.

  4. Whenever you call glDrawPixels, glTexSubImage2D, or similar functions that read pixel data from the application, those functions use the data in the bound pixel buffer object instead.

  5. To update the data in the buffer object, your application calls glMapBuffer. Mapping the buffer prevents the GPU from operating on the data, and gives your application a pointer to memory it can use to update the buffer.

    void *glMapBuffer(GLenum target, GLenum access);

    target must be set to GL_PIXEL_UNPACK_BUFFER.

    access indicates the operations you plan to perform on the data. You can supply GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE.

  6. Modify the texture data using the pointer provided by glMapBuffer.

  7. When you have finished modifying the texture, call the function glUnmapBuffer. You should supply GL_PIXEL_UNPACK_BUFFER. Once the buffer is unmapped, your application can no longer access the buffer’s data through the pointer, and the buffer’s contents are uploaded to the GPU.
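
The following sketch pulls these steps together for a single texture update. It assumes an existing GL_TEXTURE_2D texture (texName) whose storage was previously allocated with glTexImage2D and whose pixels are stored as GL_BGRA with GL_UNSIGNED_INT_8_8_8_8_REV; texName, width, height, and myNewTexels are placeholder names, not values defined elsewhere in this chapter.

GLuint pbo;
GLsizeiptr imageSize = width * height * 4;
 
glGenBuffers(1, &pbo);                          // step 1
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);      // step 2
glBufferData(GL_PIXEL_UNPACK_BUFFER, imageSize,
             NULL, GL_STREAM_DRAW);             // step 3, no initial data
 
// Steps 5 through 7: map the buffer, write the new texels, unmap.
void *dst = glMapBuffer(GL_PIXEL_UNPACK_BUFFER, GL_WRITE_ONLY);
if (dst) {
    memcpy(dst, myNewTexels, imageSize);        // your updated pixel data
    glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
}
 
// Step 4: while the unpack buffer is bound, the data parameter of
// glTexSubImage2D is an offset into the buffer, not a client pointer.
glBindTexture(GL_TEXTURE_2D, texName);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, (GLvoid *)0);
 
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);        // unbind when finished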

Using Pixel Buffer Objects for Asynchronous Pixel Transfers

glReadPixels normally blocks until previous commands have completed, which includes the potentially slow process of copying the pixel data back to the application. However, if a pixel buffer object is bound when you call glReadPixels, the function returns immediately; the blocking is deferred until you actually map the pixel buffer object to read its contents.

  1. Call the function glGenBuffers to create a new name for a buffer object.

    void glGenBuffers(GLsizei n, GLuint *buffers);

    n is the number of buffers you wish to create identifiers for.

    buffers specifies a pointer to memory to store the buffer names.

  2. Call the function glBindBuffer to bind an unused name to a buffer object. After this call, the newly created buffer object is initialized with a memory buffer of size zero and a default state. (For the default setting, see the OpenGL specification for ARB_vertex_buffer_object.)

    void glBindBuffer(GLenum target, GLuint buffer);

    target should be set to GL_PIXEL_PACK_BUFFER to use the buffer as the destination for pixel data.

    buffer specifies the unique name for the buffer object.

  3. Create and initialize the data store of the buffer object by calling the function glBufferData.

    void glBufferData(GLenum target, GLsizeiptr size,
                const GLvoid *data, GLenum usage);

    target must be set to GL_PIXEL_PACK_BUFFER.

    size specifies the size of the data store.

    *data points to the source data. If this is not NULL, the source data is copied to the data store of the buffer object. If NULL, the contents of the data store are undefined.

    usage is a constant that provides a hint as to how your application plans to use the data store. For more details on buffer hints, see Buffer Usage Hints.

  4. Call glReadPixels or a similar function. The function inserts a command to read the pixel data into the bound pixel buffer object and then returns.

  5. To take advantage of asynchronous pixel reads, your application should perform other work.

  6. To retrieve the data in the pixel buffer object, your application calls glMapBuffer. This call blocks until the previously queued glReadPixels command completes, then maps the buffer's data store and returns a pointer to your application.

    void *glMapBuffer(GLenum target, GLenum access);

    target must be set to GL_PIXEL_PACK_BUFFER.

    access indicates the operations you plan to perform on the data. You can supply GL_READ_ONLY, GL_WRITE_ONLY, or GL_READ_WRITE.

  7. Read the pixel data through the pointer provided by glMapBuffer.

  8. When you no longer need the pixel data, call the function glUnmapBuffer. You should supply GL_PIXEL_PACK_BUFFER. Once the buffer is unmapped, the data is no longer accessible to your application.
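
The following sketch pulls these steps together; width and height are placeholders for the dimensions of the region being read back.

GLuint pbo;
glGenBuffers(1, &pbo);                          // step 1
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);        // step 2
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4,
             NULL, GL_STREAM_READ);             // step 3
 
// Step 4: with a pack buffer bound, glReadPixels queues the transfer
// into the buffer object and returns immediately.
glReadPixels(0, 0, width, height,
             GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, (GLvoid *)0);
 
// Step 5: perform other CPU work here while the transfer completes.
 
// Steps 6 through 8: map the buffer (this blocks until the read has
// finished), use the pixel data, then unmap it.
const GLubyte *pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
if (pixels) {
    // Process the pixel data here.
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);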

Using Pixel Buffer Objects to Keep Data on the GPU

There is no difference between a vertex buffer object and a pixel buffer object except for the target to which each is bound. An application can fill a buffer with the results of one operation and then bind the same buffer as a different buffer type. For example, you could take the pixel results from a fragment shader and reinterpret them as vertex data in a future pass, without the data ever leaving the GPU:

  1. Set up your first pass and submit your drawing commands.

  2. Bind a pixel buffer object and call glReadPixels to fetch the intermediate results into a buffer.

  3. Bind the same buffer as a vertex buffer.

  4. Set up the second pass of your algorithm and submit your drawing commands.

Keeping your intermediate data inside the GPU when performing multiple passes can result in great performance increases.
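
A sketch of such a two-pass round trip appears below. The buffer size, the GL_DYNAMIC_COPY usage hint, and the decision to treat each RGBA texel as a four-component vertex are assumptions made for illustration; width and height are placeholders for your render target's dimensions.

GLuint buffer;
glGenBuffers(1, &buffer);
glBindBuffer(GL_PIXEL_PACK_BUFFER, buffer);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 16,
             NULL, GL_DYNAMIC_COPY);            // room for RGBA floats
 
// First pass: submit your drawing commands, then capture the results
// into the buffer object (the data never returns to the CPU).
glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, (GLvoid *)0);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
 
// Second pass: rebind the same buffer as a vertex buffer and draw,
// interpreting each RGBA texel as a four-component vertex position.
glBindBuffer(GL_ARRAY_BUFFER, buffer);
glVertexPointer(4, GL_FLOAT, 0, (GLvoid *)0);
glEnableClientState(GL_VERTEX_ARRAY);
glDrawArrays(GL_POINTS, 0, width * height);
glDisableClientState(GL_VERTEX_ARRAY);
glBindBuffer(GL_ARRAY_BUFFER, 0);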

Apple Client Storage

The Apple client storage extension (APPLE_client_storage) lets you provide OpenGL with a pointer to memory that your application allocates and maintains. OpenGL retains a pointer to your data but does not copy the data. Because OpenGL references your data, your application must retain its copy of the data until all referencing textures are deleted. By using this extension you can eliminate the OpenGL framework copy as shown in Figure 11-4. Note that a texture width must be a multiple of 32 bytes for OpenGL to bypass the copy operation from the application to the OpenGL framework.

Figure 11-4  The client storage extension eliminates a data copy
The client storage extension eliminates a data copy

The Apple client storage extension defines a pixel storage parameter, GL_UNPACK_CLIENT_STORAGE_APPLE, that you pass to the OpenGL function glPixelStorei to specify that your application retains storage for textures. The following code sets up client storage:

glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);

For detailed information, see the OpenGL specification for the Apple client storage extension.

Apple Texture Range and Rectangle Texture

The Apple texture range extension (APPLE_texture_range) lets you define a region of memory used for texture data. Typically you specify an address range that encompasses the storage for a set of textures. This allows the OpenGL driver to optimize memory usage by creating a single memory mapping for all of the textures. You can also provide a hint as to how the data should be stored: cached or shared. The cached hint tells OpenGL to cache the texture data in video memory; it is recommended when you have textures that you plan to use multiple times or that use linear filtering. The shared hint indicates that data should be mapped into a region of memory that enables the GPU to access the texture data directly (via DMA) without the need to copy it. This hint is best when you are using large images only once, perform nearest-neighbor filtering, or need to scale down the size of an image.

The texture range extension defines the following routine for making a single memory mapping for all of the textures used by your application:

void glTextureRangeAPPLE(GLenum target, GLsizei length, GLvoid *pointer);

target is a valid texture target, such as GL_TEXTURE_2D.

length specifies the number of bytes in the address space referred to by the pointer parameter.

*pointer points to the address space that your application provides for texture storage.

You provide the storage hint parameter and its value to the OpenGL function glTexParameteri. The possible values for the storage hint parameter (GL_TEXTURE_STORAGE_HINT_APPLE) are GL_STORAGE_CACHED_APPLE and GL_STORAGE_SHARED_APPLE.
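
For example, a minimal sketch that maps a block of application-owned memory for texture storage and requests the shared (DMA) path might look like the following; textureData and TEXTURE_POOL_SIZE are placeholder names for memory your application allocates and its size in bytes.

// Define a texture range over application-owned memory and request
// shared (DMA) storage for rectangle textures bound to this target.
glTextureRangeAPPLE(GL_TEXTURE_RECTANGLE_ARB,
                    TEXTURE_POOL_SIZE, textureData);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,
                GL_TEXTURE_STORAGE_HINT_APPLE,
                GL_STORAGE_SHARED_APPLE);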

Some hardware requires texture dimensions to be a power of two before the hardware can upload the data using DMA. The rectangle texture extension (ARB_texture_rectangle) was introduced to allow texture targets for textures of any dimensions, known as rectangle textures (GL_TEXTURE_RECTANGLE_ARB). You need to use the rectangle texture extension together with the Apple texture range extension to ensure that OpenGL uses DMA to access your texture data. These extensions allow you to bypass the OpenGL driver, as shown in Figure 11-5.

Note that OpenGL does not use DMA for a power-of-two texture target (GL_TEXTURE_2D). So, unlike a rectangular texture, a power-of-two texture incurs one additional copy, and performance won't be quite as fast. This typically isn't an issue, because the applications most likely to use power-of-two textures are games, which load textures at the start of a game or level rather than uploading them in real time. Applications that use rectangular textures, which usually play video or display images, upload textures far more frequently.

The next section has code examples that use the texture range and rectangle textures together with the Apple client storage extension.

Figure 11-5  The texture range extension eliminates a data copy
The texture range extension eliminates a data copy

For detailed information on these extensions, see the OpenGL specification for the Apple texture range extension and the OpenGL specification for the ARB texture rectangle extension.

Combining Client Storage with Texture Ranges

You can use the Apple client storage extension along with the Apple texture range extension to streamline the texture data path in your application. When used together, OpenGL moves texture data directly into video memory, as shown in Figure 11-6. The GPU directly accesses your data (via DMA). The setup is slightly different for rectangular and power-of-two textures. The code examples in this section upload textures to the GPU. You can also use these extensions to download textures; see Downloading Texture Data.

Figure 11-6  Combining extensions to eliminate data copies
Combining extensions to eliminate two data copies

Listing 11-1 shows how to use the extensions for a rectangular texture. After enabling the texture rectangle extension you need to bind the rectangular texture to a target. Next, set up the storage hint. Call glPixelStorei to set up the Apple client storage extension. Finally, call the function glTexImage2D with a rectangular texture target and a pointer to your texture data.

Listing 11-1  Using texture extensions for a rectangular texture

glEnable (GL_TEXTURE_RECTANGLE_ARB);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, id);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,
        GL_TEXTURE_STORAGE_HINT_APPLE,
        GL_STORAGE_CACHED_APPLE);
glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB,
            0, GL_RGBA, sizex, sizey, 0, GL_BGRA,
            GL_UNSIGNED_INT_8_8_8_8_REV,
            myImagePtr);

Setting up a power-of-two texture to use these extensions is similar to what's needed to set up a rectangular texture, as you can see by looking at Listing 11-2. The difference is that the GL_TEXTURE_2D texture target replaces the GL_TEXTURE_RECTANGLE_ARB texture target.

Listing 11-2  Using texture extensions for a power-of-two texture

glBindTexture(GL_TEXTURE_2D, myTextureName);
 
glTexParameteri(GL_TEXTURE_2D,
        GL_TEXTURE_STORAGE_HINT_APPLE,
        GL_STORAGE_CACHED_APPLE);
 
glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);
 
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA,
            sizex, sizey, 0, GL_BGRA,
            GL_UNSIGNED_INT_8_8_8_8_REV, myImagePtr);

Optimal Data Formats and Types

The best format and data type combinations to use for texture data are GL_BGRA paired with GL_UNSIGNED_INT_8_8_8_8_REV for 32-bit pixels (the combination used in the listings throughout this chapter), GL_BGRA paired with GL_UNSIGNED_SHORT_1_5_5_5_REV for 16-bit pixels, and GL_YCBCR_422_APPLE paired with GL_UNSIGNED_SHORT_8_8_REV_APPLE for video data.

The combination GL_RGBA and GL_UNSIGNED_BYTE needs to be swizzled by many cards when the data is loaded, so it's not recommended.

Working with Non–Power-of-Two Textures

OpenGL is often used to process video and images, which typically have dimensions that are not a power of two. Until OpenGL 2.0, the texture rectangle extension (ARB_texture_rectangle) provided the only option for a rectangular texture target. This extension, however, imposes the following restrictions on rectangular textures: they cannot use mipmap filtering or texture borders, their texture coordinates are not normalized (coordinates range from 0 to the width and height of the texture, as shown in Figure 11-7), and they support only the clamp wrap modes (GL_REPEAT is not allowed).

OpenGL 2.0 adds another option for a rectangular texture target through the ARB_texture_non_power_of_two extension, which supports these textures without the limitations of the ARB_texture_rectangle extension. Before using it, you must check to make sure the functionality is available; a sketch of one way to perform this check follows Figure 11-7. You'll also want to consult the OpenGL specification for the non-power-of-two extension.

Figure 11-7  Normalized and non-normalized coordinates
Normalized and non-normalized coordinates
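
As noted above, verify that the ARB_texture_non_power_of_two functionality is available before relying on it. One way to perform this run-time check, sketched here using the GLU helper gluCheckExtension, is to search the renderer's extensions string:

// Sketch: checking at run time for non-power-of-two texture support.
// gluCheckExtension is declared in OpenGL/glu.h on OS X.
const GLubyte *extensions = glGetString(GL_EXTENSIONS);
GLboolean hasNPOT = gluCheckExtension(
        (const GLubyte *)"GL_ARB_texture_non_power_of_two", extensions);
if (!hasNPOT) {
    // Fall back to GL_TEXTURE_RECTANGLE_ARB or to power-of-two tiles.
}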

If your code runs on a system that does not support either the ARB_texture_rectangle or ARB_texture_non_power_of_two extension, you have these options for working with rectangular images: scale the image so that its dimensions are a power of two (for example, with gluScaleImage), or segment the image into power-of-two tiles, as shown in Figure 11-8.

Figure 11-8  An image segmented into power-of-two tiles
An image segmented into power-of-two tiles

Creating Textures from Image Data

OpenGL on the Macintosh provides several options for creating high-quality textures from image data. OS X supports floating-point pixel values, multiple image file formats, and a variety of color spaces. You can import a floating-point image into a floating-point texture. Figure 11-9 shows an image used to texture a cube.

Figure 11-9  Using an image as a texture for a cube
Using an image as a texture for a cube

For Cocoa, you need to provide a bitmap representation; you can create an NSBitmapImageRep object from the contents of an NSView object. Alternatively, you can use the Image I/O framework (see CGImageSource Reference). This framework supports many different file formats, floating-point data, and a variety of color spaces. Furthermore, it is easy to use: you can import image data as a texture simply by supplying a CFURL object that specifies the location of the image. There is no need for you to convert the image to an intermediate integer RGB format.

Creating a Texture from a Cocoa View

You can use the NSView class or a subclass of it for texturing in OpenGL. The process is to first store the image data from an NSView object in an NSBitmapImageRep object so that the image data is in a format that can be readily used as texture data by OpenGL. Then, after setting up the texture target, you supply the bitmap data to the OpenGL function glTexImage2D. Note that you must have a valid, current OpenGL context set up.

Listing 11-3 shows a routine that uses this process to create a texture from the contents of an NSView object. A detailed explanation for each numbered line of code appears following the listing.

Listing 11-3  Building an OpenGL texture from an NSView object

-(void)myTextureFromView:(NSView*)theView
                textureName:(GLuint*)texName
{
    NSBitmapImageRep * bitmap =  [theView bitmapImageRepForCachingDisplayInRect:
            [theView visibleRect]]; // 1
    int samplesPerPixel = 0;
 
    [theView cacheDisplayInRect:[theView visibleRect] toBitmapImageRep:bitmap]; // 2
    samplesPerPixel = [bitmap samplesPerPixel]; // 3
    glPixelStorei(GL_UNPACK_ROW_LENGTH, [bitmap bytesPerRow]/samplesPerPixel); // 4
    glPixelStorei (GL_UNPACK_ALIGNMENT, 1); // 5
    if (*texName == 0) // 6
            glGenTextures (1, texName);
    glBindTexture (GL_TEXTURE_RECTANGLE_ARB, *texName); // 7
    glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,
                    GL_TEXTURE_MIN_FILTER, GL_LINEAR); // 8
 
   if(![bitmap isPlanar] &&
       (samplesPerPixel == 3 || samplesPerPixel == 4)) { // 9
        glTexImage2D(GL_TEXTURE_RECTANGLE_ARB,
                     0,
                     samplesPerPixel == 4 ? GL_RGBA8 : GL_RGB8,
                     [bitmap pixelsWide],
                     [bitmap pixelsHigh],
                     0,
                     samplesPerPixel == 4 ? GL_RGBA : GL_RGB,
                     GL_UNSIGNED_BYTE,
                    [bitmap bitmapData]);
    } else {
       // Your code to report unsupported bitmap data
    }
}

Here's what the code does:

  1. Allocates an NSBitmapImageRep object.

  2. Initializes the NSBitmapImageRep object with bitmap data from the current view.

  3. Gets the number of samples per pixel.

  4. Sets the appropriate unpacking row length for the bitmap.

  5. Sets the byte-aligned unpacking that's needed for bitmaps that are 3 bytes per pixel.

  6. If a texture object is not passed in, generates a new texture object.

  7. Binds the texture name to the texture target.

  8. Sets filtering so that it does not use a mipmap, which would be redundant for the texture rectangle extension.

  9. Checks to see if the bitmap is nonplanar and is either a 24-bit RGB bitmap or a 32-bit RGBA bitmap. If so, retrieves the pixel data using the bitmapData method, passing it along with other appropriate parameters to the OpenGL function for specifying a 2D texture image.

Creating a Texture from a Quartz Image Source

Quartz images (CGImageRef data type) are defined in the Core Graphics framework (ApplicationServices/CoreGraphics.framework/CGImage.h) while the image source data type for reading image data and creating Quartz images from an image source is declared in the Image I/O framework (ApplicationServices/ImageIO.framework/CGImageSource.h). Quartz provides routines that read a wide variety of image data.

To use a Quartz image as a texture source, follow these steps:

  1. Create a Quartz image source by supplying a CFURL object to the function CGImageSourceCreateWithURL.

  2. Create a Quartz image by extracting an image from the image source, using the function CGImageSourceCreateImageAtIndex.

  3. Extract the image dimensions using the function CGImageGetWidth and CGImageGetHeight. You'll need these to calculate the storage required for the texture.

  4. Allocate storage for the texture.

  5. Create a color space for the image data.

  6. Create a Quartz bitmap graphics context for drawing. Make sure to set up the context for pre-multiplied alpha.

  7. Draw the image to the bitmap context.

  8. Release the bitmap context.

  9. Set the pixel storage mode by calling the function glPixelStorei.

  10. Create and bind the texture.

  11. Set up the appropriate texture parameters.

  12. Call glTexImage2D, supplying the image data.

  13. Free the image data.

Listing 11-4 shows a code fragment that performs these steps. Note that you must have a valid, current OpenGL context.

Listing 11-4  Using a Quartz image as a texture source

CGImageSourceRef myImageSourceRef = CGImageSourceCreateWithURL(url, NULL);
CGImageRef myImageRef = CGImageSourceCreateImageAtIndex (myImageSourceRef, 0, NULL);
GLuint myTextureName;
size_t width = CGImageGetWidth(myImageRef);
size_t height = CGImageGetHeight(myImageRef);
CGRect rect = {{0, 0}, {width, height}};
void * myData = calloc(width * 4, height);
CGColorSpaceRef space = CGColorSpaceCreateDeviceRGB();
CGContextRef myBitmapContext = CGBitmapContextCreate (myData,
                        width, height, 8,
                        width*4, space,
                        kCGBitmapByteOrder32Host |
                          kCGImageAlphaPremultipliedFirst);
CGContextSetBlendMode(myBitmapContext, kCGBlendModeCopy);
CGContextDrawImage(myBitmapContext, rect, myImageRef);
CGContextRelease(myBitmapContext);
glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glGenTextures(1, &myTextureName);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, myTextureName);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB,
                    GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA8, width, height,
                    0, GL_BGRA_EXT, GL_UNSIGNED_INT_8_8_8_8_REV, myData);
// Release the Quartz objects and the temporary pixel buffer.
CGColorSpaceRelease(space);
CGImageRelease(myImageRef);
CFRelease(myImageSourceRef);
free(myData);

For more information on using Quartz, see Quartz 2D Programming Guide, CGImage Reference, and CGImageSource Reference.

Getting Decompressed Raw Pixel Data from a Source Image

You can use the Image I/O framework together with a Quartz data provider to obtain decompressed raw pixel data from a source image, as shown in Listing 11-5. You can then use the pixel data for your OpenGL texture. The data has the same format as the source image, so you need to make sure that you use a source image that has the layout you need.

Alpha is not premultiplied for the pixel data obtained in Listing 11-5, but alpha is premultiplied for the pixel data you get when using the code described in Creating a Texture from a Cocoa View and Creating a Texture from a Quartz Image Source.

Listing 11-5  Getting pixel data from a source image

CGImageSourceRef myImageSourceRef = CGImageSourceCreateWithURL(url, NULL);
CGImageRef myImageRef = CGImageSourceCreateImageAtIndex (myImageSourceRef, 0, NULL);
CFDataRef data = CGDataProviderCopyData(CGImageGetDataProvider(myImageRef));
const UInt8 *pixelData = CFDataGetBytePtr(data);
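
If, for example, the source image stores 32-bit BGRA pixels in host byte order, a sketch of uploading this data might look like the following; the format and type arguments here are assumptions and must be adjusted to match the layout of your actual source image.

// Sketch: uploading the raw pixel data from Listing 11-5, assuming a
// 32-bit-per-pixel BGRA source image in host byte order. Adjust the
// format and type arguments to match your actual image layout.
size_t width = CGImageGetWidth(myImageRef);
size_t height = CGImageGetHeight(myImageRef);
size_t bytesPerRow = CGImageGetBytesPerRow(myImageRef);
 
glPixelStorei(GL_UNPACK_ROW_LENGTH, (GLint)(bytesPerRow / 4));
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA8,
             (GLsizei)width, (GLsizei)height, 0,
             GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, pixelData);
CFRelease(data);    // release the copied pixel data when the upload is done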

Downloading Texture Data

A texture download operation uses the same data path as an upload operation except that the data path is reversed. Downloading transfers texture data, using direct memory access (DMA), from VRAM into a texture that can then be accessed directly by your application. You can use the Apple client storage, texture range, and texture rectangle extensions for downloading, just as you would for uploading.

Listing 11-6 shows a code fragment that uses the Apple client storage, texture range, and texture rectangle extensions to download a rectangular texture that uses shared memory. Your application processes data between the glCopyTexSubImage2D and glGetTexImage calls. How much processing? Enough so that your application does not need to wait for the GPU.

Listing 11-6  Code that downloads texture data

glBindTexture(GL_TEXTURE_RECTANGLE_ARB, myTextureName);
glTexParameteri(GL_TEXTURE_RECTANGLE_ARB, GL_TEXTURE_STORAGE_HINT_APPLE,
                GL_STORAGE_SHARED_APPLE);
glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, GL_RGBA,
                sizex, sizey, 0, GL_BGRA,
                GL_UNSIGNED_INT_8_8_8_8_REV, myImagePtr);
 
glCopyTexSubImage2D(GL_TEXTURE_RECTANGLE_ARB,
                0, 0, 0, 0, 0, image_width, image_height);
glFlush();
// Do other work processing here, using a double or triple buffer
 
glGetTexImage(GL_TEXTURE_RECTANGLE_ARB, 0, GL_BGRA,
                GL_UNSIGNED_INT_8_8_8_8_REV, pixels);

Double Buffering Texture Data

When you use any technique that allows the GPU to access your texture data directly, such as the texture range extension, it's possible for the GPU and CPU to access the data at the same time. To avoid such a collision, you must synchronize the GPU and the CPU. The simplest way is shown in Figure 11-10. Your application works on the data, flushes it to the GPU and waits until the GPU is finished before working on the data again.

One technique for ensuring that the GPU is finished executing commands before your application sends more data is to insert a token into the command stream and use that to determine when the CPU can touch the data again, as described in Use Fences for Finer-Grained Synchronization. Figure 11-10 uses the fence extension command glFinishObjectAPPLE to synchronize buffer updates for a stream of single-buffered texture data. Notice that when the CPU is processing texture data, the GPU is idle. Similarly, when the GPU is processing texture data, the CPU is idle. It's much more efficient for the GPU and CPU to work asynchronously than to work synchronously. Double buffering data is a technique that allows you to process data asynchronously, as shown in Figure 11-11.

Figure 11-10  Single-buffered data
Single-buffered data

To double buffer data, you must supply two sets of data to work on. Note in Figure 11-11 that while the GPU is rendering one frame of data, the CPU processes the next. After the initial startup, neither processing unit is idle. Using the glFinishObjectAPPLE function provided by the fence extension ensures that buffer updating is synchronized.

Figure 11-11  Double-buffered data
Double-buffered data
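
A sketch of the double-buffered pattern shown in Figure 11-11, using two textures and the glFinishObjectAPPLE call described above, might look like the following; fillTexture and drawFrame are placeholder functions for your application's CPU-side update and rendering work, and the two textures are assumed to have been created and backed by separate client-storage buffers as in Listing 11-1.

// Sketch: double-buffered texture streaming synchronized with the
// Apple fence extension. texName[0] and texName[1] are two textures
// backed by separate client-storage buffers.
GLuint texName[2];
int current = 0;
 
for (;;) {                                   // per-frame loop
    // Block until the GPU has finished with this texture's data
    // before the CPU writes into its backing buffer again.
    glFinishObjectAPPLE(GL_TEXTURE, (GLint)texName[current]);
 
    fillTexture(texName[current]);           // CPU updates this buffer
 
    glBindTexture(GL_TEXTURE_RECTANGLE_ARB, texName[current]);
    drawFrame();                             // GPU consumes this texture
    glFlush();
 
    current = 1 - current;                   // switch to the other buffer
}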