Here are a few more performance tuning tips.
Background or Defer Output
Defer Potentially Unnecessary Work
Perform Comparisons Only Once
Choose Control Statements Carefully
Perform Computations Only Once
Use Shell Builtins Wherever Possible
For Maximum Performance, Use Shell Math, Not External Tools
Combine Multiple Expressions with sed
Output to files takes time, and output to the console takes roughly twice as long. If you are writing code where performance is a consideration, you should either execute output commands in the background by adding an ampersand (&) to the end of the command or group multiple output statements together.
For example, if you are drawing a game board, the fastest way is to store your draw commands in a single variable and output the data at once. In this way, you avoid taking multiple execution penalties. A very fast way to do this is to disable buffering and set newline to shift down a line without returning to the left edge (run stty raw to set both of these parameters), then store the first row into a variable, followed by a newline, followed by backspace characters to shift left to the start of the next row, followed by the next row, and so on.
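The idea of accumulating output and writing it once can be sketched as follows. This is a minimal illustration, not the original author's code: the function name draw_board and its size argument are hypothetical, and it uses ordinary newlines rather than the stty raw technique described above.

```shell
# Build a whole board in one variable, then write it with a single printf.
# draw_board and its size argument are hypothetical names for this sketch.
draw_board() {
    size=$1
    board=""
    row=0
    while [ "$row" -lt "$size" ]; do
        line=""
        col=0
        while [ "$col" -lt "$size" ]; do
            line="${line}."
            col=$((col + 1))
        done
        # Append the finished row plus a newline to the buffer; no output yet.
        board="${board}${line}
"
        row=$((row + 1))
    done
    # One output call for the entire board.
    printf '%s' "$board"
}

draw_board 3
```

The loop body performs only variable assignments, so the cost of writing to the console is paid once per board instead of once per cell or row.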
If the results of a series of instructions may never be used, do not perform those instructions.
For example, consider code that uses eval to obtain the values from a series of variables in a pseudo-array. Suppose that the code returns immediately if any of the variables has a value of 2 or more.
Unless you are accumulating multiple assignment statements into a single eval statement (as described in “Reducing Use of the eval Builtin”), you should call eval on the first statement by itself, make the comparison, run eval for the next statement, and so on. By doing so, you are reducing the average number of calls to eval.
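A minimal sketch of this early-exit pattern follows. The pseudo-array names (ITEM_0 and so on), the COUNT variable, and the EVAL_CALLS counter are assumptions made for illustration; the counter exists only to make the saving visible.

```shell
# Hypothetical pseudo-array: ITEM_0, ITEM_1, ... ITEM_(COUNT-1).
ITEM_0=1
ITEM_1=0
ITEM_2=5
ITEM_3=1
COUNT=4
EVAL_CALLS=0    # counter added only to show how many evals ran

# Returns 1 as soon as any element is 2 or more, 0 otherwise.
all_below_two() {
    i=0
    while [ "$i" -lt "$COUNT" ]; do
        eval "value=\$ITEM_$i"
        EVAL_CALLS=$((EVAL_CALLS + 1))
        if [ "$value" -ge 2 ]; then
            return 1    # stop here; later elements are never fetched
        fi
        i=$((i + 1))
    done
    return 0
}

if all_below_two; then
    echo "all values below 2"
else
    echo "found a value of 2 or more after $EVAL_CALLS eval calls"
fi
```

With this data, the function stops at the third element, so the fourth eval is never executed.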
If you have a function that performs an expensive test two or more times, cache the results of that test and perform the most lightweight comparison possible from then on.
Also, if you have two possible execution paths through your code that share some code in common, it may be faster to use only a single if statement and duplicate the small amount of common code rather than repeatedly performing the same comparison. In general, however, such changes will only result in a single-digit percentage improvement in performance, so it is usually not worth the decrease in maintainability to duplicate code in this way.
The performance impact varies depending on the expense of the test. Tests that perform computations or outside execution are particularly expensive and thus should be minimized as much as possible. Of course, you can reduce the additional impact by performing the calculation once and doing a lightweight test multiple times.
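One way to apply this advice is to run the expensive test once, store the outcome in a flag, and compare against the flag afterward. The sketch below assumes a hypothetical expensive test, has_marker, that runs an external grep; the report function and the flag name failed are likewise illustrative.

```shell
# Hypothetical expensive test: starts an external command (grep) each time.
has_marker() {
    printf '%s\n' "$1" | grep -q 'ERROR'
}

report() {
    text=$1
    # Run the expensive test once and cache the outcome as a simple flag.
    if has_marker "$text"; then
        failed=1
    else
        failed=0
    fi

    # Every later decision is a lightweight builtin comparison.
    if [ "$failed" -eq 1 ]; then
        printf 'status: failed\n'
    fi
    printf 'length: %s\n' "${#text}"
    if [ "$failed" -eq 1 ]; then
        printf 'action: retry\n'
    fi
}

report "ERROR: disk full"
```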
A simple test case produced the results shown in Table 8-1.
Test performed twice with one copy of shared code in-between | Test performed once with two copies of shared code |
|---|---|
7.003 | 6.957 |
In most situations, the appropriate control statement is obvious. To test whether a variable contains one of two or three values, you generally choose an if statement with a small number of elif statements. For a larger number of values, you generally choose a case statement. This not only leads to more readable code, but also results in faster code.
For small numbers of cases (5), as expected, the difference between a series of if statements, an if statement with a series of elif statements, and a case statement is largely lost in the noise, performance-wise, even after 1000 iterations. Although the results shown in Table 8-2 are in the expected order, this was only true approximately half the time. For a smaller number of cases, the differences can largely be ignored.
Number of cases | eval statement executing multiple functions | series of if statements | if, then series of elif statements | case statement |
|---|---|---|---|---|
Five cases | 6.945 | 6.846 | 6.831 | 6.807 |
Ten cases | 7.094 | 7.224 | 6.980 | 6.903 |
Fifty cases | 7.023 | 8.03 | 7.392 | 6.704 |
With a larger number of cases, the results more predictably resemble what one would expect. The case version is fastest, followed by the elif version, followed by the if version, with the eval version still coming in last. These results tended to be more consistent, though eval was often faster than the series of if statements.
Although the performance differences (shown in Table 8-2) are relatively small, in a sufficiently complex script with a large number of cases, they can make a sizable difference. In particular, the case statement tends to degrade more gracefully, whereas the series of if statements by themselves tends to cause an ever-increasing performance penalty.
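For reference, the two better-performing forms look like this side by side. The function names and color values are hypothetical; the sketch only shows the shape of each construct, not the code used to produce the timings.

```shell
# Hypothetical functions mapping a color name to a number.
code_with_elif() {
    if [ "$1" = "red" ]; then
        echo 1
    elif [ "$1" = "green" ]; then
        echo 2
    elif [ "$1" = "blue" ]; then
        echo 3
    else
        echo 0
    fi
}

code_with_case() {
    case "$1" in
        red)   echo 1 ;;
        green) echo 2 ;;
        blue)  echo 3 ;;
        *)     echo 0 ;;
    esac
}

code_with_elif blue
code_with_case blue
```

Both functions give the same result for every input. The case form adds one line per value, while the elif form adds a separate test for each value.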
For example, if you have a function that includes expr $ROW + 1 in two or more lines of code, you should define a local variable ROW_PLUS_1 and store the value of the expression in that variable. Caching the results of computation is particularly important if you are using expr for more portable math, but doing so consistently results in a small performance improvement even when using shell math.
Twice with expr | Once with expr | Twice with shell math | Once with shell math |
|---|---|---|---|
23.744 | 12.820 | 6.596 | 6.486 |
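A minimal sketch of caching the computation, using the ROW and ROW_PLUS_1 names from the text. The function name print_next_rows and the messages it prints are assumptions. The variable is assigned inside the function without a local declaration so that the sketch also runs in shells that lack the local keyword.

```shell
ROW=4

print_next_rows() {
    # Compute the expression once and keep the result.
    ROW_PLUS_1=$(expr "$ROW" + 1)

    # Reuse the cached value instead of calling expr again.
    echo "next row: $ROW_PLUS_1"
    echo "clearing row: $ROW_PLUS_1"
}

print_next_rows
```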
Using echo by itself is typically about 30 times faster than explicitly executing /bin/echo. This improved performance also applies to other builtins such as umask or test.
Of course, test is particularly important because it doubles as the bracket ([) command, which is essential for most control statements in the shell. If you explicitly wrote a control statement using /bin/[, your performance would degrade immensely. Fortunately, it is unlikely that anyone would ever do that accidentally.
echo (builtin) | /bin/echo | printf (builtin) | /usr/bin/printf |
|---|---|---|---|
0.285 | 6.212 | 0.230 | 6.359 |
On a related note, the printf builtin is significantly faster than the echo builtin. Thus, for maximum performance, you should use printf instead of echo.
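To confirm that a command resolves to a builtin in the shell you are using, you can inspect the output of type. The helper below, is_builtin, is a hypothetical name. It assumes that the shell's type output contains the word "builtin" for builtin commands, which holds for common shells such as bash, zsh, and dash in an English locale.

```shell
# is_builtin is a hypothetical helper for this sketch.
is_builtin() {
    # `type` reports how the shell would resolve a name.
    case "$(type "$1" 2>/dev/null)" in
        *builtin*) return 0 ;;
        *)         return 1 ;;
    esac
}

for cmd in echo printf test; do
    if is_builtin "$cmd"; then
        printf '%s: builtin\n' "$cmd"
    else
        printf '%s: external or not found\n' "$cmd"
    fi
done
```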
Although significantly less portable, code that uses the zsh- and bash-specific $(( $VAR + 1)) math notation executes up to 125 times faster than identical code written with the expr command and up to 225 times faster than identical code written with the bc command.
Use expr in preference to bc for any integer math that exceeds the shell’s math capabilities. The floating-point math used by bc tends to be significantly slower.
shell math | expr command | bc command |
|---|---|---|
0.111 | 14.106 | 25.008 |
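The three approaches compared above can be written as follows for the same calculation. This is an illustrative sketch: the variable names with_shell, with_expr, and with_bc are assumptions, and the bc call is guarded because bc may not be installed on every system.

```shell
VAR=41

# Shell arithmetic: evaluated by the shell itself, no new process.
with_shell=$((VAR + 1))

# expr: starts one external process per calculation.
with_expr=$(expr "$VAR" + 1)

printf 'shell math: %s\nexpr: %s\n' "$with_shell" "$with_expr"

# bc: also an external process; guarded because it may not be installed.
if command -v bc > /dev/null 2>&1; then
    with_bc=$(echo "$VAR + 1" | bc)
    printf 'bc: %s\n' "$with_bc"
fi
```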
The sed tool, like any other external tool, is expensive to start up. If you are processing a large chunk of data, this penalty is lost in the noise, but if you are processing a small quantity of data, it can be a sizable percentage of script execution time. Thus, if you can process multiple regular expressions in a single instance of sed, it is much faster than processing each expression separately.
Consider, for example, the following code, which changes “This is a test” into “This is burnt toast” and then throws away the results by redirecting them to /dev/null.
function1()
{
    LOOP=0
    while [ $LOOP -lt 1000 ] ; do
        echo "This is a test." | sed 's/a/burnt/g' | sed 's/e/oa/g' > /dev/null
        LOOP=$((LOOP + 1))
    done
}
You can speed this up dramatically by rewriting the processing line to look like this:
        echo "This is a test." | sed -e 's/a/burnt/g' -e 's/e/oa/g' > /dev/null
When you pass multiple expressions to sed, it processes them in a single execution. In this case, the time spent processing the second expression can be reduced by more than 60% on a typical computer.
As explained in “Avoiding Unnecessary External Commands,” you can improve performance further by concatenating these strings into a single string and processing the output of all 1000 lines in a single invocation of sed (with two expressions). This change reduces the total execution time by nearly a factor of 20 compared with the original version.
For small inputs, the execution penalty is relatively large, so combining expressions results in a significant improvement. For large inputs, the execution penalty is relatively small, so combining expressions generally results in negligible improvement. However, even with large inputs, if the sed statements are executed in a loop, the cumulative performance difference could be noticeable.
Two calls per line (2000 calls total) | One call per line (1000 calls total) | Two calls on accumulated text | One call on accumulated text | |
|---|---|---|---|---|
Single-processor system | 16.874 | 9.983 | 0.670 | 0.665 |
Dual-processor system | 11.460 | 8.143 | 0.619 | 0.612 |
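The accumulated-text variant might look like the following sketch. The name function2 and the TEXT variable are hypothetical, and this is one plausible way to write it rather than the exact code behind the timings above. Unlike the earlier listing, the function leaves its output on standard output so that the caller decides where it goes.

```shell
function2() {
    LOOP=0
    TEXT=""
    # Accumulate all 1000 lines in a variable; no external commands yet.
    while [ "$LOOP" -lt 1000 ]; do
        TEXT="${TEXT}This is a test.
"
        LOOP=$((LOOP + 1))
    done
    # A single sed process applies both expressions to every line.
    printf '%s' "$TEXT" | sed -e 's/a/burnt/g' -e 's/e/oa/g'
}

function2 > /dev/null
```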
Last updated: 2008-04-08