OpenGL ES Application Design Guidelines
OpenGL ES performs many complex operations on your behalf—transformations, lighting, clipping, texturing, environmental effects, and so on—on large data sets. The size of your data and the complexity of the calculations performed can impact performance, making your stellar 3D graphics shine less brightly than you'd like. Whether your application is a game using OpenGL ES to provide immersive real-time images to the user or an image processing application more concerned with image quality, use the information in this chapter to help you design your application’s graphics engine. This chapter introduces key concepts that later chapters expand on.
How to Visualize OpenGL ES
There are a few ways you can visualize OpenGL ES, and each provides a slightly different context in which to design and observe your application. The most common way to visualize OpenGL ES is as a graphics pipeline such as the one shown in Figure 6-1. Your application configures the graphics pipeline, and then executes one or more drawing commands. The drawing commands send vertex data down the pipeline, where it is processed, assembled into primitives, and rasterized into fragments. Each fragment calculates color and depth values which are then merged into the framebuffer. Using the pipeline as a mental model is essential for identifying exactly what work your application performs to generate a new frame. In a typical OpenGL ES 2.0 application, your design consists of writing customized shaders to handle the vertex and fragment stages of the pipeline. In an OpenGL ES 1.1 application, you modify the state machine that drives the fixed-function pipeline to perform the desired calculations.
Another benefit of the pipeline model is that individual stages can calculate their results independently and simultaneously. This is a key point. Your application might prepare new primitives while separate portions of the graphics hardware perform vertex and fragment calculations on previously submitted geometry. If any pipeline stage performs too much work or performs too slowly, other pipeline stages sit idle until the slowest stage completes its work. Your design needs to balance the work performed by each pipeline stage by matching calculations to the capabilities of the graphics hardware on the device.
Another way to visualize OpenGL ES is as a client-server architecture, as shown in Figure 6-2. OpenGL ES state changes, texture and vertex data, and rendering commands all have to travel from the application to the OpenGL ES client. The client transforms these data into a format that the graphics hardware understands, and forwards them to the GPU. Not only do these transformations add overhead, but the process of transferring the data to the graphics hardware takes time.
To achieve great performance, an application must reduce the frequency of calls it makes to OpenGL ES, minimize the transformation overhead, and carefully manage the flow of data between itself and OpenGL ES.
Designing a High-Performance OpenGL ES Application
To summarize, a well-designed OpenGL ES application needs to:
Exploit parallelism in the OpenGL ES pipeline.
Manage data flow between the application and the graphics hardware.
Figure 6-3 suggests a process flow for an application that uses OpenGL ES to perform animation to the display.
When the application launches, the first thing it does is initialize resources that it does not intend to change over the lifetime of the application. Ideally, the application encapsulates those resources into OpenGL ES objects. The goal is to create any object that can remain unchanged for the runtime of the application (or even a portion of the application’s lifetime, such as the duration of a level in a game), trading increased initialization time for better rendering performance. Complex commands or state changes should be replaced with OpenGL ES objects that can be used with a single function call. For example, configuring the fixed-function pipeline can take dozens of function calls. Instead, compile a graphics shader at initialization time, and switch to it at runtime with a single function call. OpenGL ES objects that are expensive to create or modify should almost always be created as static objects.
The rendering loop processes all of the items you intend to render to the OpenGL ES context, then presents the results to the display. In an animated scene, some data is updated for every frame. In the inner rendering loop shown in Figure 6-3, the application alternates between updating rendering resources (creating or modifying OpenGL ES objects in the process) and submitting drawing commands that use those resources. The goal of this inner loop is to balance the workload so that the CPU and GPU are working in parallel, preventing the application and OpenGL ES from accessing the same resources simultaneously. On iOS, modifying an OpenGL ES object can be expensive when the modification is not performed at the start or the end of a frame.
An important goal for this inner loop is to avoid copying data back from OpenGL ES to the application. Copying results from the GPU to the CPU can be very slow. If the copied data is also used later as part of the process of rendering the current frame, as shown in the middle rendering loop, your application blocks until all previously submitted drawing commands are completed.
After the application submits all drawing commands needed in the frame, it presents the results to the screen. A non-interactive application would copy the final image to application-owned memory for further processing.
Finally, when your application is ready to quit, or when it finishes with a major task, it frees OpenGL ES objects to make additional resources available, either for itself or for other applications.
To summarize the important characteristics of this design:
Create static resources whenever practical.
The inner rendering loop alternates between modifying dynamic resources and submitting rendering commands. Try to avoid modifying dynamic resources except at the beginning or the end of a frame.
Avoid reading intermediate rendering results back to your application.
The rest of this chapter provides useful OpenGL ES programming techniques to implement the features of this rendering loop. Later chapters demonstrate how to apply these general techniques to specific areas of OpenGL ES programming.
Avoid Synchronizing and Flushing Operations
OpenGL ES is not required to execute most commands immediately. Often, commands are queued to a command buffer and executed by the hardware at a later time. Usually, OpenGL ES waits until the application has queued up a significant number of commands before sending the buffer to the hardware, because allowing the graphics hardware to execute commands in batches is often more efficient. However, some OpenGL ES functions must flush the buffer immediately. Other functions not only flush the buffer, but also block until previously submitted commands have completed before returning control to the application. Your application should restrict the use of flushing and synchronizing commands to those cases where that behavior is necessary. Excessive use of flushing or synchronizing commands may cause your application to stall waiting for the hardware to finish rendering.
These situations require OpenGL ES to submit the command buffer to the hardware for execution.
glFlush sends the command buffer to the graphics hardware. It blocks until commands are submitted to the hardware but does not wait for the commands to finish executing.
glFinish waits for all previously submitted commands to finish executing on the graphics hardware.
Functions that retrieve OpenGL state (such as glGetError) also wait for submitted commands to complete.
The command buffer is full.
Using glFlush Effectively
Most of the time you don't need to call glFlush to move image data to the screen. There are only a few cases where calling the glFlush function is useful:
If your application submits rendering commands that use a particular OpenGL ES object, and it intends to modify that object in the near future (or vice versa). If you attempt to modify an OpenGL ES object that has pending drawing commands, your application may stall until those drawing commands are completed. In this situation, calling glFlush ensures that the hardware begins processing commands immediately. After flushing the command buffer, your application should perform work that can operate in parallel with the submitted commands.
When two contexts share an OpenGL ES object. After submitting any OpenGL ES commands that modify the object, call glFlush before switching to the other context.
If multiple threads are accessing the same context, only one thread should send commands to OpenGL ES at a time. After submitting commands it must call glFlush.
Avoid Querying OpenGL ES State
Calls that query OpenGL ES state, including glGetError(), may require OpenGL ES to execute previous commands before retrieving any state variables. This synchronization forces the graphics hardware to run in lockstep with the CPU, reducing opportunities for parallelism. To avoid this, maintain your own copy of any state you need to query, and access it directly, rather than calling OpenGL ES.
When errors occur, OpenGL ES sets an error flag that you can retrieve with the function glGetError. During development, it's crucial that your code contains error checking routines that call glGetError. If you are developing a performance-critical application, retrieve error information only while debugging your application. Calling glGetError excessively in a release build degrades performance.
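A common way to confine error checks to debug builds is a macro that compiles to the bare call when `NDEBUG` is defined. The sketch below is illustrative only. `GL_CHECK`, `query_error`, `errors_reported`, `good_call`, and `bad_call` are hypothetical names, and `query_error` is a stand-in for `glGetError` so that the example builds without the OpenGL ES headers; in a real project the macro would call `glGetError` directly.

```c
#include <stdio.h>
#include <assert.h>

/* Stand-in for glGetError: returns the pending error flag and clears it. */
static unsigned int fake_error_flag = 0;
static unsigned int query_error(void)
{
    unsigned int e = fake_error_flag;
    fake_error_flag = 0;
    return e;
}

static int errors_reported = 0; /* number of errors seen by the debug check */

#ifndef NDEBUG
/* Debug builds: run the call, then query and report the error flag. */
#define GL_CHECK(call)                                              \
    do {                                                            \
        call;                                                       \
        unsigned int gl_check_err = query_error();                  \
        if (gl_check_err != 0) {                                    \
            ++errors_reported;                                      \
            fprintf(stderr, "%s:%d: error 0x%04X after %s\n",       \
                    __FILE__, __LINE__, gl_check_err, #call);       \
        }                                                           \
    } while (0)
#else
/* Release builds: run the call only, with no error query. */
#define GL_CHECK(call) do { call; } while (0)
#endif

/* Hypothetical calls used to exercise the macro. */
static void good_call(void) { }
static void bad_call(void) { fake_error_flag = 0x0500; /* GL_INVALID_ENUM */ }
```

Wrapping a call as `GL_CHECK(bad_call());` reports the failure with its file and line in a debug build, while a release build pays no synchronization cost for the check.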
Allow OpenGL ES to Manage Your Resources
OpenGL ES allows many data types to be stored persistently inside OpenGL ES. Creating OpenGL ES objects to store vertex, texture, or other forms of data allows OpenGL ES to reduce the overhead of transforming the data and sending them to the graphics processor. If data is used more frequently than it is modified, OpenGL ES can substantially improve the performance of your application.
OpenGL ES allows your application to hint how it intends to use the data. These hints allow OpenGL ES to make an informed choice of how to process your data. For example, static data might be placed in memory that the graphics processor can readily fetch, or even into dedicated graphics memory.
Use Double Buffering to Avoid Resource Conflicts
Resource conflicts occur when your application and OpenGL ES access an OpenGL ES object at the same time. When one participant attempts to modify an OpenGL ES object being used by the other, they may block until the object is no longer in use. Once they begin modifying the object, the other participant is not allowed to access the object until the modifications are complete. Alternatively, OpenGL ES may implicitly duplicate the object so that both participants can continue to execute commands. Either option is safe, but each can end up as a bottleneck in your application. Figure 6-4 shows this problem. In this example, there is a single texture object, which both OpenGL ES and your application want to use. When the application attempts to change the texture, it must wait until previously submitted drawing commands complete—the CPU synchronizes to the GPU.
To solve this problem, your application could perform additional work between changing the object and drawing with it. But if your application does not have additional work it can perform, it should explicitly create two identically sized objects; while one participant reads an object, the other participant modifies the other. Figure 6-5 illustrates the double-buffered approach. While the GPU operates on one texture, the CPU modifies the other. After the initial startup, neither the CPU nor the GPU sits idle. Although shown for textures, this solution works for almost any type of OpenGL ES object.
Double buffering is sufficient for most applications, but it requires that both participants finish processing commands in roughly the same time. To avoid blocking, you can add more buffers; this implements a traditional producer-consumer model. If the producer finishes before the consumer finishes processing commands, it takes an idle buffer and continues to process commands. In this situation, the producer idles only if the consumer falls badly behind.
Double and triple buffering consume additional memory in exchange for preventing pipeline stalls. The additional memory use may put pressure on other parts of your application. On an iOS device, memory can be scarce; your design may need to balance using more memory against other application optimizations.
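The producer-consumer arrangement described above can be sketched as a small ring of interchangeable objects. This is a minimal illustration under stated assumptions: `BufferRing`, `ring_acquire`, and `ring_release` are hypothetical names, the `ids` array holds object names that the application created elsewhere, and how the application detects that the GPU has finished with a slot is left out of the sketch.

```c
#include <stdbool.h>
#include <assert.h>

#define RING_MAX 4

/* Hypothetical ring of interchangeable resources, such as texture names. */
typedef struct {
    unsigned int ids[RING_MAX]; /* object names supplied by the application */
    int count;                  /* 2 = double buffering, 3 = triple buffering */
    int head;                   /* next slot the producer (CPU) will fill */
    int in_flight;              /* slots claimed but not yet finished by the GPU */
} BufferRing;

/* Producer: claim an idle slot to modify. Returns false when every slot is
   still in flight, which means the producer would have to wait. */
static bool ring_acquire(BufferRing *r, unsigned int *id_out)
{
    assert(r->count > 0 && r->count <= RING_MAX);
    if (r->in_flight == r->count) {
        return false;
    }
    *id_out = r->ids[r->head];
    r->head = (r->head + 1) % r->count;
    r->in_flight++;
    return true;
}

/* Consumer: mark the oldest in-flight slot as finished so it can be reused. */
static void ring_release(BufferRing *r)
{
    if (r->in_flight > 0) {
        r->in_flight--;
    }
}
```

With `count` set to 2 the producer stalls as soon as it gets one full object ahead of the consumer; raising `count` to 3 gives it one more idle slot to work in, at the cost of one more object's worth of memory.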
Be Mindful of OpenGL ES State Variables
The hardware has one current state, which is compiled and cached. Switching state is expensive, so it's best to design your application to minimize state switches.
Don't set a state that's already set. Once a feature is enabled, it does not need to be enabled again. Calling an enable function more than once does nothing except waste time because OpenGL ES does not check the state of a feature when you call glEnable or glDisable. For instance, if you call glEnable(GL_LIGHTING) more than once, OpenGL ES does not check to see if the lighting state is already enabled. It simply updates the state value even if that value is identical to the current value.
You can avoid setting a state more than necessary by using dedicated setup or shutdown routines rather than putting such calls in a drawing loop. Setup and shutdown routines are also useful for turning on and off features that achieve a specific visual effect—for example, when drawing a wire-frame outline around a textured polygon.
If you are drawing 2D images, disable all irrelevant state variables, similar to what's shown in Listing 6-1.
Listing 6-1 Disabling state variables on OpenGL ES 1.1
// Disable other state variables as appropriate.
Replace State Changes with OpenGL ES Objects
The “Be Mindful of OpenGL ES State Variables” section suggests that reducing the number of state changes can improve performance. Some OpenGL ES extensions also allow you to create objects that collect multiple OpenGL state changes into an object that can be bound with a single function call. Where such techniques are available, they are recommended. For example, configuring the fixed-function pipeline requires many function calls to change the state of the various operators. Not only does this incur overhead for each function called, but the code is more complex and difficult to manage. Instead, use a shader. A shader, once compiled, can have the same effect but requires only a single call to glUseProgram.
For another example, vertex array objects allow you to configure your vertex attributes once and store them in a vertex array object. See “Consolidate Vertex Array State Changes Using Vertex Array Objects.”
© 2013 Apple Inc. All Rights Reserved. (Last updated: 2013-04-23)