Before you begin implementing a new program, there are several performance enhancements you should consider adding. Although you might not be able to take advantage of all of these enhancements in every case, you should at least consider them during your design phase.
Use Event-Based Handlers
Thread Your Program
Use the Accelerate Framework
Be Lazy
Take Advantage of Perceived Performance
Use the Mach-O Binary Format
All modern Mac OS X applications should use the Carbon Event Manager or another event-based model for responding to system events. The old way of retrieving events by polling the system is highly inefficient. In fact, when there are no events to process, every CPU cycle spent in polling code is wasted. Using more modern event-based APIs can lead to the following benefits:
It makes your program more responsive to the user.
It reduces your application’s CPU usage.
It minimizes your application’s working set—the number of code pages loaded in memory at any given time.
It allows the system to manage power aggressively.
The Cocoa framework incorporates Carbon Event Manager calls into its classes and methods to implement an event-driven model for you. Applications written in Cocoa automatically take advantage of this behavior and require no additional modifications. Carbon applications must support the Carbon Event Manager calls explicitly.
Event-based handlers are not limited to supporting user events, such as mouse and keyboard events. Each thread has its own run loop to provide on-demand responses to timers, network events, and other incoming data. Applications support run loops using either the Core Foundation (CFRunLoop) or Cocoa (NSRunLoop) interfaces.
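The difference between polling and waiting for events can be sketched outside of any Mac OS X framework. The following is a minimal illustration in portable POSIX C, not Carbon Event Manager or run-loop code. The helper names wait_for_input and event_wait_demo are hypothetical, and a pipe stands in for a real event source such as a socket.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: blocks until fd is readable or timeout_ms elapses.
 * Returns 1 if data is ready, 0 on timeout, -1 on error. */
int wait_for_input(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Despite its name, poll() puts the thread to sleep in the kernel
     * until an event arrives; it is not a busy-polling loop. */
    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Hypothetical demo: returns 0 if the wait times out while the pipe is
 * empty and reports readiness once a byte has been written. */
int event_wait_demo(void)
{
    int fds[2];
    char byte = 'x';
    int before, after;

    if (pipe(fds) != 0)
        return -1;

    before = wait_for_input(fds[0], 10);    /* nothing written yet */
    if (write(fds[1], &byte, 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    after = wait_for_input(fds[0], 1000);   /* the byte is now pending */

    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

While the thread is blocked in the wait, it uses no CPU time, which is the property that event-based handlers and run loops provide at a higher level.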
Supporting multiple threads is a good way to improve both the perceived and actual performance of your program. On hardware containing multiple processors, a multithreaded program often has significantly better performance than a single-threaded program. By distributing tasks across all available processors, an application can perform multiple operations simultaneously. Even on a single-processor machine, the use of additional threads can provide a perceived speed boost by leaving your main thread free to handle user events.
Before you begin adding support for multiple threads, though, be sure to put some thought into how your program might use those threads effectively. Because threads require a fair amount of overhead to create, you should carefully choose which tasks you want to assign to separate threads. If all of your program’s tasks are small and performed at different times, you would probably not want to create separate threads for each one. Instead, creating a single long-lived worker thread might be more appropriate.
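A single long-lived worker thread might look something like the following sketch, written with POSIX threads as an assumption; the Worker type, the function names, the fixed queue capacity, and the stand-in task (squaring an integer) are all hypothetical. The thread is created once and then sleeps on a condition variable between tasks, so the creation overhead is paid only one time.

```c
#include <pthread.h>

#define QUEUE_CAPACITY 16

typedef struct {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t changed;      /* signaled when the queue or 'stopping' changes */
    int items[QUEUE_CAPACITY];   /* ring buffer of pending tasks */
    int head, count;
    int stopping;
    long total;                  /* result accumulated by the worker */
} Worker;

static void *worker_main(void *arg)
{
    Worker *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        /* Sleep until there is a task or a stop request. */
        while (w->count == 0 && !w->stopping)
            pthread_cond_wait(&w->changed, &w->lock);
        if (w->count == 0 && w->stopping)
            break;                            /* queue drained, exit */

        int item = w->items[w->head];
        w->head = (w->head + 1) % QUEUE_CAPACITY;
        w->count--;
        pthread_cond_broadcast(&w->changed);  /* a producer may be waiting for space */

        pthread_mutex_unlock(&w->lock);
        long result = (long)item * item;      /* the task runs outside the lock */
        pthread_mutex_lock(&w->lock);
        w->total += result;
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

int worker_start(Worker *w)
{
    w->head = w->count = w->stopping = 0;
    w->total = 0;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
    return pthread_create(&w->thread, NULL, worker_main, w);
}

void worker_submit(Worker *w, int item)
{
    pthread_mutex_lock(&w->lock);
    while (w->count == QUEUE_CAPACITY)        /* wait for free space */
        pthread_cond_wait(&w->changed, &w->lock);
    w->items[(w->head + w->count) % QUEUE_CAPACITY] = item;
    w->count++;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
}

/* Asks the worker to finish the queued tasks, waits for it, returns the total. */
long worker_stop(Worker *w)
{
    long total;

    pthread_mutex_lock(&w->lock);
    w->stopping = 1;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    pthread_join(w->thread, NULL);

    total = w->total;
    pthread_cond_destroy(&w->changed);
    pthread_mutex_destroy(&w->lock);
    return total;
}

/* Hypothetical demo: submits the tasks 1..n and returns the sum of their squares. */
long worker_demo(int n)
{
    Worker w;

    if (worker_start(&w) != 0)
        return -1;
    for (int i = 1; i <= n; i++)
        worker_submit(&w, i);
    return worker_stop(&w);
}
```

In this sketch, many small tasks share one thread instead of each paying for its own thread creation and teardown.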
Another consideration with threading is how to protect your data structures. Problems can occur when multiple threads modify the same data without first checking to see if it is safe to do so. Your code needs to use locks rigorously to protect its data structures. You might also need to synchronize specific blocks of code to prevent them from being executed by multiple threads at once.
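As a rough illustration of protecting shared data with a lock, the sketch below uses a POSIX mutex; the SharedCounter type, the function names, and the MAX_THREADS limit are hypothetical. Every thread acquires the lock before it modifies the counter, so no increments are lost.

```c
#include <pthread.h>

#define MAX_THREADS 8

typedef struct {
    pthread_mutex_t lock;
    long value;        /* shared data protected by 'lock' */
    int increments;    /* number of increments each thread performs */
} SharedCounter;

static void *increment_many(void *arg)
{
    SharedCounter *c = arg;

    for (int i = 0; i < c->increments; i++) {
        pthread_mutex_lock(&c->lock);     /* only one thread at a time past here */
        c->value++;
        pthread_mutex_unlock(&c->lock);
    }
    return NULL;
}

/* Runs up to MAX_THREADS threads that all update the same counter and
 * returns the final value. */
long count_with_threads(int thread_count, int increments_per_thread)
{
    pthread_t threads[MAX_THREADS];
    SharedCounter counter;
    int started = 0;

    if (thread_count > MAX_THREADS)
        thread_count = MAX_THREADS;

    counter.value = 0;
    counter.increments = increments_per_thread;
    pthread_mutex_init(&counter.lock, NULL);

    for (int i = 0; i < thread_count; i++) {
        if (pthread_create(&threads[i], NULL, increment_many, &counter) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&counter.lock);
    return counter.value;
}
```

Without the lock and unlock calls around the increment, the threads could overwrite each other's updates and the final value would be unpredictable.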
For information on how to support additional threads in your program, see Threading Programming Guide.
If your application performs a lot of mathematical computations on large sets of scalar data, you should consider using the Accelerate framework (Accelerate.framework) to accelerate those calculations. The Accelerate framework takes advantage of any available vector processing units (such as the PowerPC AltiVec extensions, also known as Velocity Engine, or the Intel x86 SSE extensions) to perform multiple calculations in parallel. By coding to the framework, you can avoid having to create separate code paths for each platform architecture. The Accelerate framework is highly tuned for all of the architectures Mac OS X supports.
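As a small sketch of coding to the framework, the hypothetical helper add_arrays below adds two arrays of floats with the vDSP_vadd routine from the Accelerate framework. The plain loop in the #else branch is an assumption added only so that the example also compiles on systems without the framework; it shows the equivalent scalar computation.

```c
#include <stddef.h>

#ifdef __APPLE__
#include <Accelerate/Accelerate.h>
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for every i in [0, n). */
void add_arrays(const float *a, const float *b, float *out, size_t n)
{
#ifdef __APPLE__
    /* One call processes the whole array; the stride arguments of 1
     * mean the elements are contiguous. */
    vDSP_vadd(a, 1, b, 1, out, 1, (vDSP_Length)n);
#else
    /* Scalar fallback for platforms without the Accelerate framework. */
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
#endif
}

/* Hypothetical demo: returns 1 if the sums match the expected values. */
int add_arrays_demo(void)
{
    float a[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float b[4] = { 10.0f, 20.0f, 30.0f, 40.0f };
    float out[4];

    add_arrays(a, b, out, 4);
    return out[0] == 11.0f && out[1] == 22.0f &&
           out[2] == 33.0f && out[3] == 44.0f;
}
```

On Mac OS X, the program must be linked against the framework (for example, with the -framework Accelerate linker option).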
Tools such as Shark can help point out portions of your program that might benefit from using the Accelerate framework. For more information about Shark and other tools, see “Performance Tools.”
A very simple way to improve performance is to make sure your application does not perform any unnecessary work. Each moment of an application’s time should be spent responding to the user’s current request, not predicting future requests. If you do not need a resource right away, such as a nib file containing a preferences window, don’t load it. Such an action takes time to execute because it accesses the file system, and if the user never opens that preferences window, the process of loading its nib file is a waste of time.
The basic rule is to wait until the user requests something from your application, then use the necessary resources to fulfill the request. You should cache data only in situations where there is a measurable performance benefit. Preloading caches on the assumption that the rest of the application will run faster can actually degrade performance in low-memory situations. In such a situation, your cached data may be paged to disk before it can be used. Thus, any savings you gained by caching the data turn into a loss because that data must now be read from disk twice before it is ever used. If you really want to cache data, wait until a given operation has been performed once before you cache any data from it.
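The load-on-first-request rule can be sketched as follows. The LazyTable type and its functions are hypothetical, and the table of squares is only a stand-in for an expensive resource such as the contents of a file. Nothing is loaded when the structure is set up; the work happens the first time the data is requested, and the result is cached only from that point on.

```c
#include <stdlib.h>

typedef struct {
    int *squares;      /* NULL until the first request */
    size_t length;
    int load_count;    /* how many times the expensive load actually ran */
} LazyTable;

/* Stand-in for an expensive operation, such as reading a resource from disk. */
static int *load_squares(size_t length)
{
    int *table = malloc(length * sizeof *table);

    if (table != NULL) {
        for (size_t i = 0; i < length; i++)
            table[i] = (int)(i * i);
    }
    return table;
}

/* Returns the table, loading it on the first call only. */
const int *lazy_table_get(LazyTable *lt)
{
    if (lt->squares == NULL) {
        lt->squares = load_squares(lt->length);
        if (lt->squares != NULL)
            lt->load_count++;
    }
    return lt->squares;
}

void lazy_table_free(LazyTable *lt)
{
    free(lt->squares);
    lt->squares = NULL;
}

/* Hypothetical demo: returns 1 if two requests trigger exactly one load. */
int lazy_table_demo(void)
{
    LazyTable lt = { NULL, 100, 0 };   /* setting up the struct loads nothing */
    const int *first = lazy_table_get(&lt);
    const int *second = lazy_table_get(&lt);
    int ok = first != NULL && first == second &&
             first[9] == 81 && lt.load_count == 1;

    lazy_table_free(&lt);
    return ok;
}
```

If the table is never requested, the load never runs and no memory is allocated for it.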
Some other things to be lazy about include the following:
Defer memory allocation until the point where you actually need the memory.
Don’t zero-initialize blocks of memory manually. Call the calloc function instead and let it zero the memory for you lazily.
Give the system the chance to load your code lazily. Profile and organize your code so that the system loads only the code needed for the current operation.
Defer reading the contents of a file until you actually need the information.
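The calloc tip in the list above can be illustrated with the following sketch; the function names are hypothetical. The first function clears the block by hand, which touches every byte immediately. The second asks calloc for zeroed memory and leaves it to the allocator to decide how and when the zeroing happens.

```c
#include <stdlib.h>
#include <string.h>

/* Eager version: malloc followed by memset writes to every byte of the
 * block right away, whether or not the block is ever used. */
unsigned char *buffer_create_eager(size_t size)
{
    unsigned char *buf = malloc(size);

    if (buf != NULL)
        memset(buf, 0, size);
    return buf;
}

/* Preferred version: calloc returns memory that reads as zero without the
 * caller having to clear it. */
unsigned char *buffer_create_lazy(size_t size)
{
    return calloc(size, 1);
}

/* Hypothetical demo: returns 1 if the first and last bytes of a
 * calloc-backed buffer read as zero. The size must be greater than 0. */
int buffer_demo(size_t size)
{
    unsigned char *buf = buffer_create_lazy(size);
    int ok;

    if (buf == NULL)
        return 0;
    ok = (buf[0] == 0 && buf[size - 1] == 0);
    free(buf);
    return ok;
}
```

Both functions return memory with the same contents; the difference is only in when the cost of zeroing is paid.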
The perception of performance is just as effective as actual performance in many cases. Many program tasks can be performed in the background, on a separate thread, or at idle time. Doing this makes the program interface feel more responsive to the user. Of course, creating the perception of performance does not work in every case. For example, the perception may be lost if the data being processed in the background is needed by the user immediately.
As you design your program, think about which tasks can be moved to the background effectively. For example, if your program needs to scan a number of files, do so on a background thread. Similarly, if you need to perform lengthy calculations, perform them in the background so that the user may continue to manipulate your program’s user interface.
Another way to improve perceived performance is to make sure your application launches quickly. At launch time, defer any tasks that do not contribute to the immediate presentation of your application interface. For example, defer the creation of large data structures you do not need immediately until after your application has finished launching. You should also avoid loading plug-ins until the moment their code is actually needed.
If you have a Carbon application that is based on the Code Fragment Manager Preferred Executable Format (PEF), you should consider switching to the Mach-O executable format for several reasons. Foremost among them is that Mach-O is designed and optimized for use with the Mac OS X virtual memory system. Other reasons include the following:
PEF executables are not supported on Intel-based Macintosh computers.
In Mac OS X, the libraries that implement the Carbon environment use the Mach-O executable format. Mach-O executables use a calling convention different from that used by PEF executables. Calls made to or from PEF code fragments must be translated at runtime. While the translation overhead is small, it is unnecessary if you are using Mach-O.
Apple’s Mac OS X development environment supports only Mach-O. Whether or not you use Apple’s development environment for Mac OS X, the Mac OS X performance tools are significantly easier to use with Mach-O executables than with PEF executables.
Mach-O executables can directly call other Mach-O shared libraries and BSD API routines in the kernel.
Mach-O supports just-in-time binding, where a link to a function is resolved when that function is first called. All links in a PEF-based application (and all PEF libraries it links to) must be resolved when the application is launched.
Although Mach-O is not supported in Mac OS 9, using Mach-O does not require you to abandon Mac OS 9 as a delivery platform. You can build an application package that runs a PEF binary in Mac OS 9 and a Mach-O binary in Mac OS X. This allows you to optimize your executable for each operating system that you wish to support. For more information, see Bundle Programming Guide.
For an overview of the Mach-O format and how you can take advantage of that format for performance tuning, see “Overview of the Mach-O Executable Format” in Code Size Performance Guidelines.
Last updated: 2006-10-03