Advanced Techniques

Shell scripts can be powerful tools for writing software. Graphical interfaces notwithstanding, they are capable of performing nearly any task that could be performed with a more traditional language. This chapter describes several techniques that will help you write more complex software using shell scripts.

Using the eval Builtin for Data Structures, Arrays, and Indirection

One of the more under-appreciated commands in shell scripting is the eval builtin. The eval builtin takes a series of arguments, concatenates them into a single command, then executes it.

For example, the following script assigns the value 3 to the variable X and then prints the value:

#!/bin/sh
eval X=3
echo $X

For such simple examples, the eval builtin is superfluous. However, the behavior of the eval builtin becomes much more interesting when you need to construct or choose variable names programmatically. For example, the next script also assigns the value 3 to the variable X:

#!/bin/sh
 
VARIABLE="X"
eval $VARIABLE=3
echo $X

When the eval builtin evaluates its arguments, it does so in two steps. In the first step, variables are replaced by their values. In the preceding example, the letter X is inserted in place of $VARIABLE. Thus, the result of the first step is the following string:

X=3

In the second step, the eval builtin executes the statement generated by the first step, thus assigning the value 3 to the variable X. As further proof, the echo statement at the end of the script prints the value 3.
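To watch the two steps separately, you can build the command string first, print it, and only then hand it to eval. The following is a minimal sketch; the variable name COMMAND is introduced here purely for illustration.

```sh
#!/bin/sh

VARIABLE="X"

# Step 1: the shell substitutes the value of VARIABLE into the string.
COMMAND="$VARIABLE=3"
echo "eval will execute: $COMMAND"

# Step 2: eval executes the resulting string as a command.
eval "$COMMAND"
echo "X is now $X"
```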

The eval builtin can be particularly convenient as a substitute for arrays in shell script programming. It can also be used to provide a level of indirection, much like pointers in C. Some examples of the eval builtin are included in the sections that follow.

A Complex Example: Setting and Printing Values of Arbitrary Variables

The next example takes user input, constructs a variable based on the value entered using eval, then prints the value stored in the resulting variable.

#!/bin/sh
echo "Enter variable name and value separated by a space"
read VARIABLE VALUE
echo Assigning the value $VALUE to variable $VARIABLE
eval $VARIABLE=$VALUE
 
# print the value
eval echo "$"$VARIABLE
 
# export the value
eval export $VARIABLE
 
# print the exported variables.
export

Run this script and type something like MYVAR 33. The script assigns the value 33 to the variable MYVAR (or whatever variable name you entered).

You should notice that the echo command has an additional dollar sign ($) in quotes. The first time the eval builtin parses the string, the quoted dollar sign is simplified to merely a dollar sign. You could also surround this dollar sign with single quotes or quote it with a backslash, as described in Quoting Special Characters. The result is the same.

Thus, the statement:

eval echo "$"$VARIABLE

evaluates to:

echo $MYVAR
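Because eval executes whatever string it is given, a script that passes unchecked user input to eval lets the user run arbitrary commands. One way to reduce that risk is to validate the variable name and to let eval, rather than the first pass, expand the value. The sketch below assumes that variable names are limited to ASCII letters, digits, and underscores; the subroutine names is_valid_name, set_var, and get_var are hypothetical.

```sh
#!/bin/sh

# is_valid_name NAME: succeed only if NAME looks like a variable name.
is_valid_name()
{
        case "$1" in
                ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 1 ;;
        esac
        return 0
}

# set_var NAME VALUE: assign VALUE to the variable called NAME.
set_var()
{
        is_valid_name "$1" || { echo "Invalid variable name: $1" >&2; return 1; }
        # The backslash defers expansion of $2 until eval runs, so the
        # value itself is never parsed as shell syntax.
        eval "$1=\$2"
}

# get_var NAME: print the value of the variable called NAME.
get_var()
{
        is_valid_name "$1" || { echo "Invalid variable name: $1" >&2; return 1; }
        eval "printf '%s\n' \"\$$1\""
}

set_var MYVAR "33 and more"
get_var MYVAR
```

With these helpers, a value that contains spaces or shell metacharacters is stored as plain text instead of being executed.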

A Practical Example: Using eval to Simulate an Array

In Shell Variables and Printing, you learned how to read variables from standard input. This was limited to some degree by the inability to read an unknown number of user-entered values.

The script below solves this problem using eval by creating a series of variables to hold the values of a simulated array.

#!/bin/sh
 
COUNTER=0
VALUE="-1"
echo "Enter a series of lines of text.  Enter a blank line to end."
 
while [ "x$VALUE" != "x" ] ; do
        read VALUE
        eval "ARRAY_$COUNTER=\$VALUE" # \$VALUE is expanded by eval, so lines containing spaces are stored intact
        eval export ARRAY_$COUNTER
        COUNTER=$(expr $COUNTER '+' 1) # More on this in Paint by Numbers
done
COUNTER=$(expr $COUNTER '-' 1) # Subtract one for the blank value at the end.
 
# print the exported variables.
COUNTERB=0;
 
echo "Printing values."
while [ $COUNTERB -lt $COUNTER ] ; do
        echo "ARRAY[$COUNTERB] = $(eval echo "$"ARRAY_$COUNTERB)"
        COUNTERB=$(expr $COUNTERB '+' 1) # More on this in Paint by Numbers
done

This same technique can be used for splitting an unknown number of input values in a single line as shown in the next listing:

#!/bin/sh
 
COUNTER=0
VALUE="-1"
echo "Enter a series of numbers separated by spaces on a single line."
 
read LIST
IFS=" "
for VALUE in $LIST ; do
        eval ARRAY_$COUNTER=$VALUE
        eval export ARRAY_$COUNTER
        COUNTER=$(expr $COUNTER '+' 1) # More on this in Paint by Numbers
done
 
# print the exported variables.
COUNTERB=0;
 
echo "Printing values."
while [ $COUNTERB -lt $COUNTER ] ; do
        echo "ARRAY[$COUNTERB] = $(eval echo '$'ARRAY_$COUNTERB)"
        COUNTERB=$(expr $COUNTERB '+' 1) # More on this in Paint by Numbers
done
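If you use simulated arrays in more than one place, it may be convenient to wrap the eval statements in small subroutines. The following sketch assumes a single array whose elements are stored in ARRAY_0, ARRAY_1, and so on; the names array_push, array_get, and ARRAY_COUNT are hypothetical.

```sh
#!/bin/sh

ARRAY_COUNT=0

# array_push VALUE: store VALUE in the next free slot.
array_push()
{
        eval "ARRAY_$ARRAY_COUNT=\$1"
        ARRAY_COUNT=$(($ARRAY_COUNT + 1))
}

# array_get INDEX: print the value stored at INDEX.
array_get()
{
        eval "printf '%s\n' \"\$ARRAY_$1\""
}

array_push "first value"
array_push "second value"

INDEX=0
while [ "$INDEX" -lt "$ARRAY_COUNT" ] ; do
        echo "ARRAY[$INDEX] = $(array_get "$INDEX")"
        INDEX=$(($INDEX + 1))
done
```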

A Data Structure Example: Linked Lists

In a complex shell script, you may need to keep track of multiple pieces of data and treat them like a data structure. The eval builtin makes this easy. Your code needs to pass around only a single name from which you build other variable names to represent fields in the structure.

Similarly, you can use the eval builtin to provide a level of indirection similar to pointers in C.

For example, the following script manually constructs a linked list with three items, then walks the list:

#!/bin/sh
 
VAR1_VALUE="7"
VAR1_NEXT="VAR2"
 
VAR2_VALUE="11"
VAR2_NEXT="VAR3"
 
VAR3_VALUE="42"
 
HEAD="VAR1"
POS=$HEAD
while [ "x$POS" != "x" ] ; do
        echo "POS: $POS"
        VALUE="$(eval echo '$'$POS'_VALUE')"
        echo "VALUE: $VALUE"
        POS="$(eval echo '$'$POS'_NEXT')"
done

Using this technique, you could conceivably construct any data structure that you need (with the caveat that manipulating large data structures in shell scripts is generally not conducive to good performance).
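The list above is written out by hand, but the same variables can be created and followed by subroutines. This is a sketch only: node_new and list_sum are hypothetical names, node names are assumed to be valid variable-name prefixes, and values are assumed to be integers.

```sh
#!/bin/sh

# node_new NAME VALUE [NEXT]: create the fields NAME_VALUE and NAME_NEXT.
node_new()
{
        eval "${1}_VALUE=\$2; ${1}_NEXT=\${3:-}"
}

# list_sum HEAD: walk the list that starts at HEAD and print the sum
# of the values stored in its nodes.
list_sum()
{
        POS=$1
        TOTAL=0
        while [ -n "$POS" ] ; do
                # Fetch both fields of the current node, then advance.
                eval "VALUE=\$${POS}_VALUE; POS=\$${POS}_NEXT"
                TOTAL=$(($TOTAL + $VALUE))
        done
        echo "$TOTAL"
}

# Build the list back to front so each node can name its successor.
node_new N3 42
node_new N2 11 N3
node_new N1 7 N2

list_sum N1
```

Only the name of the head node (N1 here) has to be passed around; every other field is reached by building a variable name from it.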

A Powerful Example: Binary Search Trees

Working with Binary Search Trees in Starting Points provides a ready-to-use binary search tree library written as a Bourne shell script.

Trapping Signals

No discussion of advanced programming would be complete without an explanation of signal handling. In UNIX-based and UNIX-like operating systems, signals provide a primitive means of interprocess communication. A script or other process can send a signal to another process by either using the kill command or by calling the kill function in a C program. Upon receipt, the receiving process either exits, ignores the signal, or executes a signal handler routine of the author’s choosing.

Signals are most frequently used to terminate execution of a process in a friendly way, allowing that process the opportunity to clean up before it exits. However, they can also be used for other purposes. For example, when a terminal window changes in size, any running shell in that window receives a SIGWINCH (window change) signal. Normally, this signal is ignored, but if a program cares about window size changes, it can trap that signal and handle it in an application-specific way. With the exception of the SIGKILL and SIGSTOP signals, any signal can be trapped and handled by calling the C function signal.

In much the same way, shell scripts can also trap signals and perform operations when they occur, through the use of the trap builtin.

The syntax of trap is as follows:

trap subroutine signal [ signal ... ]

The first argument is the name of a subroutine that should be called when the specified signals are received. The remaining arguments contain a space-delimited list of signal names or numbers. Because signal numbers vary between platforms, for maximum readability and portability, you should always use signal names.

For example, if you want to trap the SIGWINCH (window change) signal, you could write the following statement:

trap sigwinch_handler SIGWINCH

After you issue this statement, the shell calls the subroutine sigwinch_handler whenever it receives a SIGWINCH signal. The script in Listing 11-1 prints the phrase “Window size changed.” whenever you adjust the size of your terminal window.

Listing 11-1  Installing a signal handler trap

#!/bin/sh
 
fixrows()
{
        echo "Window size changed."
}
 
echo "Adjust the size of your window now."
trap fixrows SIGWINCH
 
COUNT=0
while [ $COUNT -lt 60 ] ; do
        COUNT=$(($COUNT + 1))
        sleep 1
done
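Traps are also a common way to give a script the chance to clean up before it exits. The sketch below removes a temporary file no matter how the work ends. It assumes that the mktemp command is available, and the name work_with_tempfile is hypothetical. The signal names are written without the SIG prefix, which is the form the POSIX specification describes; many shells accept both forms.

```sh
#!/bin/sh

# The parentheses make the body run in a subshell, so the traps set
# here do not affect the rest of the script.
work_with_tempfile()
(
        TMPFILE=$(mktemp) || exit 1
        trap 'rm -f "$TMPFILE"' EXIT     # runs whenever the subshell exits
        trap 'exit 1' HUP INT TERM       # turn these signals into a normal exit

        echo "scratch data" > "$TMPFILE"
        echo "$TMPFILE"                  # report the name that was used
)

LEFTOVER=$(work_with_tempfile)
if [ ! -e "$LEFTOVER" ] ; then
        echo "The temporary file was removed."
fi
```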

Sometimes, instead of trapping a signal, you may want to ignore a signal entirely. To do this, specify an empty string for the subroutine name. For example, the code in Listing 11-2 ignores the “interrupt” signal generated when you press Control-C:

Listing 11-2  Ignoring a signal

#!/bin/sh
trap "" SIGINT
 
echo "This program will sleep for 10 seconds and cannot be killed with"
echo "control-c."
sleep 10

Finally, signals can be used as a primitive form of interscript communication. The next two scripts work as a pair. To see this in action, first save the script in Listing 11-3 as ipc1.sh and the script in Listing 11-4 as ipc2.sh.

Listing 11-3  ipc1.sh: Script interprocess communication example, part 1 of 2

#!/bin/sh
 
## Save this as ipc1.sh
 
./ipc2.sh &
 
PID=$!
 
sleep 1 # Give it time to launch.
 
kill -HUP $PID

Listing 11-4  ipc2.sh: Script interprocess communication example, part 2 of 2

#!/bin/sh
 
## Save this as ipc2.sh
 
hup_handler()
{
        echo "SIGHUP RECEIVED."
        exit 0
}
 
trap hup_handler SIGHUP
 
while true ; do
        sleep 1
done

Now run ipc1.sh. It launches the script ipc2.sh in the background, uses the special shell variable $! to get the process ID of the last background process (ipc2.sh in this case), then sends it a hangup (SIGHUP) signal using kill.

Because the second script, ipc2.sh, trapped the hangup signal, its shell then calls a handler subroutine, hup_handler. This subroutine prints the words “SIGHUP RECEIVED.” and exits.

Shell Text Formatting

One powerful technique when writing shell scripts is to take advantage of the terminal emulation features of your terminal application (whether it is Terminal, an xterm, or some other application) to display formatted content.

You can use the printf command to easily create columnar layouts without any special tricks. For more visually exciting presentation, you can add color or text formatting such as boldface or underlined display using ANSI (VT100/VT220) escape sequences.

In addition, you can use ANSI escape sequences to show or hide the cursor, set the cursor position anywhere on the screen, and set various text attributes, including boldface, inverse, underline, and foreground and background color.

Using the printf Command for Tabular Layout

Most operating systems that support shell scripts provide a command-line version of printf that resembles the printf function in C and other languages. This command differs from the C printf function in a number of ways. These differences include the following:

  • The %c directive does not perform integer-to-character conversion. The only way to convert an integer to a character with the shell version is to first convert the integer into octal and then print it by using the octal value as a switch. For example, printf "\144" prints the lowercase letter d.

  • The command-line version supports a much smaller set of placeholders. For example, %p (pointers) does not exist in the shell version.

  • The command-line version does not have a notion of long or double-precision numbers. Although flags with these modifiers are allowed (%lld, for example), the modifiers are ignored. Thus, there is no difference between %d, %ld, and %lld.

  • Large integers may be truncated to 32-bit signed values.

  • Double-precision floating-point values may be reduced to single-precision values.

  • Floating point precision is not guaranteed (even for single-precision values) because some imprecision is inherent in the conversion between strings and floating-point numbers.
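The octal conversion described in the first item above can be wrapped in a subroutine. In this sketch, the name chr is hypothetical, and the argument is assumed to be an integer between 1 and 255.

```sh
#!/bin/sh

# chr NUMBER: print the character whose code is NUMBER.
chr()
{
        # The inner printf converts the integer to three octal digits;
        # the outer printf interprets the resulting \ooo switch.
        printf "\\$(printf '%03o' "$1")"
}

chr 100   # prints the lowercase letter d
echo
```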

Much like the printf statement in other languages, the shell script printf syntax is as follows:

printf "format string" argument ...

Like the C printf function, the command-line printf format string contains some combination of text, switches (\n and \t, for example), and placeholders (%d, for example).

The most important feature of printf for tabular layouts is the padding feature. Between the percent sign and the type letter, you can place a number to indicate the width to which the field should be padded. For a floating-point placeholder (%f), you can optionally specify two numbers separated by a decimal point. The leftmost value indicates the total field width, while the rightmost value indicates the number of decimal places that should be included. For example, you can print pi to three digits of precision in an 8-character-wide field by typing printf "%8.3f" 3.14159265.

In addition to the width of the padding, you can add certain prefixes before the field width to indicate special padding requirements. They are:

  • Minus sign (-)—indicates the field should be left justified. (Fields are right justified by default.)

  • Plus sign (+)—indicates that a sign should be prepended to a numerical argument even if it has a positive value.

  • Space—indicates that a space should be added to a numerical argument in place of the sign if the value is positive. (A plus sign takes precedence over a space.)

  • Zero (0)—indicates that numerical arguments should be padded with leading zeroes instead of spaces. (A minus sign takes precedence over a zero.)

For example, if you want to create a four-column table of name, address, phone number, and GPA, you might write a statement like this:

Listing 11-5  Columnar printing using printf

#!/bin/sh
 
NAME="John Doe"
ADDRESS="1 Fictitious Rd, Bucksnort, TN"
PHONE="(555) 555-5555"
GPA="3.885"
printf "%20s | %30s | %14s | %5s\n" "Name" "Address" "Phone Number" "GPA"
printf "%20s | %30s | %14s | %5.2f\n" "$NAME" "$ADDRESS" "$PHONE" "$GPA"

The printf statement pads the fields into neat columns and rounds the GPA to two decimal places, leaving room for three additional characters (the decimal point itself, the ones place, and a leading space). You should notice that the additional arguments are all surrounded by quotation marks. If you do not do this, you will get incorrect behavior because of the spaces in the arguments.
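Text columns are often easier to read when they are left justified. The following sketch combines the minus-sign prefix with field widths in a reusable subroutine; the name print_row and the column widths are arbitrary choices made for illustration.

```sh
#!/bin/sh

# print_row NAME COUNT PRICE: print one row of a three-column table.
print_row()
{
        # %-10s  left-justified text, padded to 10 characters
        # %8s    right-justified text, padded to 8 characters
        # %6.2f  number with two decimal places in a 6-character field
        printf '%-10s | %8s | %6.2f\n' "$1" "$2" "$3"
}

print_row "Widget" "12" "3.5"
print_row "Gadget" "7" "12.25"
```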

The next sample shows number formatting:

#!/bin/sh
 
GPA="3.885"
 
printf "%f | whatever\n" "$GPA"
printf "%20f | whatever\n" "$GPA"
printf "%+20f | whatever\n" "$GPA"
printf "%+020f | whatever\n" "$GPA"
printf "%-20f | whatever\n" "$GPA"
printf "%- 20f | whatever\n" "$GPA"

This prints the following output:

3.885000 | whatever
            3.885000 | whatever
           +3.885000 | whatever
+000000000003.885000 | whatever
3.885000             | whatever
 3.885000            | whatever

Most of the same formatting options apply to %s and %d (including, surprisingly, zero-padding of string arguments). For more information, see the manual page for printf.

Truncating Strings

To truncate a value to a given width, you can use a simple regular expression to keep only the first few characters. For example, the following snippet copies the first seven characters of a string:

STRING="whatever you want it to be"
TRUNCSTRING="`echo "$STRING" | sed 's/^\(.......\).*$/\1/'`"
echo "$TRUNCSTRING"

As an alternative, you can use a more general-purpose routine such as the one in Listing 11-6, which truncates a string to an arbitrary length by building up a regular expression.

Listing 11-6  Truncating text to column width

trunc_field()
{
    local STR=$1
    local CHARS=$2
    local EXP=""
    local COUNT=0
    while [ $COUNT -lt $CHARS ] ; do
        EXP="$EXP."
        COUNT=`expr $COUNT + 1`
    done
    echo "$STR" | sed "s/^\($EXP\).*$/\1/"
}

printf "%20s | something\n" "`trunc_field "$TEXT" 20`"

Of course, you can do this much faster by either caching these strings or replacing most of the subroutine with a single line of Perl:

echo "$STR" | perl -e "$/=undef; print substr(<STDIN>, 0, $CHARS);"

Finally, if you are willing to write code that is nonportable (the syntax is not part of the POSIX shell specification and is rejected by some Bourne-compatible shells, such as dash), you can use BASH-style substring expansion:

echo "${STR:0:8}"

You can learn about similar operations in the manual page for bash under the “Parameter Expansion” heading. As a general rule, however, you should avoid such shell-specific tricks.
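A more portable alternative is to let printf do the truncating. For the %s placeholder, a precision (the number after the decimal point) sets the maximum number of bytes printed. The sketch below assumes that the string contains only single-byte characters.

```sh
#!/bin/sh

STRING="whatever you want it to be"

# Keep at most the first seven characters of the string.
TRUNCSTRING=$(printf '%.7s' "$STRING")
echo "$TRUNCSTRING"

# Pad and truncate in one step: the field is exactly 10 characters wide.
printf '%-10.10s | something\n' "$STRING"
```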

Using ANSI Escape Sequences

You can use ANSI escape sequences to add color or formatting to text displayed in the terminal, reposition the cursor, set tab stops, clear portions of the display, change scrolling behavior, and more. This section includes a partial list of many commonly used escape sequences, along with examples of how to use them.

There are two ways to generate escape sequences: direct printing and using the terminfo database. Printing the sequences directly has significant performance advantages but is less portable because it assumes that all terminals are ANSI/VT100/VT220-compliant. A good compromise is to combine these two approaches by caching the values generated with a terminfo command such as tput at the beginning of your script and then printing the values directly elsewhere in the script.
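The caching approach might look like the following sketch. The variable names BOLD and RESET and the subroutine name emphasize are hypothetical; bold and sgr0 are the terminfo capabilities for enabling bold display and resetting all attributes. If tput fails (for example, because the terminal type is unknown), the variables are left empty and the text is printed without formatting.

```sh
#!/bin/sh

# Ask the terminfo database once, at the top of the script.
BOLD=$(tput bold 2>/dev/null) || BOLD=""
RESET=$(tput sgr0 2>/dev/null) || RESET=""

# emphasize TEXT: print TEXT in bold using the cached sequences.
emphasize()
{
        printf '%s%s%s\n' "$BOLD" "$1" "$RESET"
}

emphasize "Backup complete."
```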

Generating Escape Sequences using the terminfo Database

Generating escape sequences with the terminfo database is relatively straightforward once you know what terminal capabilities to request. You can find several tables containing capability information, along with the standard ANSI/VT220 values for each capability, in ANSI Escape Sequence Tables. (Note that not all ANSI escape sequences have equivalent terminfo capabilities, and vice versa.)

Once you know what capability to request (along with any additional arguments that you must specify), you can use the tput command to output the escape sequence (or capture the output of tput into a variable so you can use it later). For example, you can clear the screen with the following command:

tput cl

Some terminfo database entries contain placeholders for numeric values, such as row and column information. The easiest way to use these is to specify those numeric values on the command line when calling tput. However, for performance, it may be faster to substitute the values yourself. For example, the capability cup sets the cursor position to a row and column value. The following command sets the position to row 3, column 7:

tput cup 3 7

You can, however, obtain the unsubstituted string by requesting the capability without specifying row and column parameters. For example:

tput cup | less

By piping the data to less, you can see precisely what the tput tool is providing, and you can look up the parameters in the manual page for terminfo. This particular example prints the following string:

^[[%i%p1%d;%p2%dH

The %i notation adds one to the first two (and only the first two) parameters before they are printed, so the values in the escape sequence are one greater than the values you pass to tput. (For ANSI terminals, columns and rows are numbered from 1 rather than from 0.) The %p1%d means to push parameter 1 onto the stack and then print it immediately. The sequence %p2%d is the equivalent for parameter 2.

As you can see from even this relatively simple example, the language used for terminfo is quite complex. Thus, while it may be acceptable to perform the substitution yourself for simple terminals such as the VT100, you are still trading portability for performance. In general, it is best to let tput perform the substitutions on your behalf.

Generating Escape Sequences Directly

To use an ANSI escape sequence without using tput, you must first be able to print an escape character from your script. There are three ways to do this:

  • Use printf to print the escape sequence. In a string, the \e switch prints an escape character. This is the easiest way to print escape sequences.

    For example, the following snippet shows how to print the reset sequence (^[c):

    printf "\ec" # resets the screen

  • Embed the escape character in your script. The method of doing this varies widely from one editor to another. In most text-based editors and on the command line itself, you do this by pressing Control-V followed by the Esc key. Although this is the fastest way to print an escape sequence, it has the disadvantage of making your script harder to edit.

    For example, you might write a snippet like this one:

    echo "^[c" # Here ^[ stands for one literal escape character, not a caret followed by a bracket.

  • Use printf to store an escape character into a variable. This is the recommended technique because it is nearly as fast as embedding the escape character but does not make the code hard to read and edit.

    For example, the following code sends a terminal reset command (^[c):

    #!/bin/sh
     
    ESC=`printf "\e"`       # store an escape character
                            # into the variable ESC
    echo "$ESC""c"          # Echo a terminal reset command.

Because the terminal reset command is one of only a handful of escape sequences that do not start with a left square bracket, it is worth pointing out the two sets of double-quote marks after the variable in the above example. Without those, the shell tries to print the value of the variable ESCc, which does not exist.
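The \e switch is not among the escapes that the POSIX specification lists for printf, so some implementations may not recognize it. The octal form \033 produces the same character. The sketch below uses that form and wraps the variable name in braces, which is another way to keep the shell from reading the following characters as part of the name. The variable names RED and RESET are hypothetical.

```sh
#!/bin/sh

ESC=$(printf '\033')    # store an escape character in ESC

RED="${ESC}[31m"        # the braces end the variable name explicitly
RESET="${ESC}[m"

printf '%s\n' "${RED}Warning${RESET}"
```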

ANSI Escape Sequence Tables

There are four basic categories of escape codes:

  • Cursor manipulation routines (described in Table 11-1) allow you to move the cursor around on the screen, show or hide the cursor, and limit scrolling to only a portion of the screen.

  • Attribute manipulation sequences (described in Attribute and Color Escape Sequences) allow you to set or clear text attributes such as underlining, boldface display, and inverse display.

  • Color manipulation sequences (described in Attribute and Color Escape Sequences) allow you to change the foreground and background color of text.

  • Other escape codes (described in Table 11-4) support clearing the screen, clearing portions of the screen, resetting the terminal, and setting tab stops.

Cursor and Scrolling Manipulation Escape Sequences

The terminal window is divided into a series of rows and columns. The upper-left corner is row 1, column 1. The lower-right corner varies depending on the size of the terminal window.

You can obtain the current number of rows and columns on the screen by examining the values of the shell variables LINES and COLUMNS. Thus, the screen coordinates range from (1, 1) to ($LINES, $COLUMNS). In most modern Bourne shells, the values for LINES and COLUMNS are automatically updated when the window size changes. This is true for both BASH and ZSH shells.

If you want to be particularly clever, you can also trap the SIGWINCH signal and update your script’s notion of lines and columns when it occurs. See Trapping Signals for more information.

Once you know the number of rows and columns on your screen, you can move the cursor around with the escape sequences listed in Table 11-1. For example, to set the cursor position to row 4, column 5, you could issue the following command:

printf "\e[4;5H"

For other, faster ways to print escape sequences, see Generating Escape Sequences Directly.
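If you position the cursor often, a pair of small subroutines keeps the escape sequence in one place. This is a sketch that prints the sequence directly and therefore assumes an ANSI-compatible terminal; the names move_cursor and print_at are hypothetical.

```sh
#!/bin/sh

# move_cursor ROW COLUMN: both values count from 1.
move_cursor()
{
        printf '\033[%d;%dH' "$1" "$2"
}

# print_at ROW COLUMN TEXT: draw TEXT starting at the given position.
print_at()
{
        move_cursor "$1" "$2"
        printf '%s' "$3"
}

print_at 4 5 "Hello"
echo
```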

Table 11-1  Cursor and scrolling manipulation escape sequences

Terminfo capability | Escape sequence | Description
civis               | ^[[?25l         | Hides the cursor. Note: The terminfo entry for Terminal does not support this option.
cvvis               | ^[[?25h         | Shows the cursor. Note: The terminfo entry for Terminal does not support this option.
cup r c             | ^[[r;cH         | Sets cursor position to row r, column c.
(no equivalent)     | ^[[6n           | Reports current cursor position as though typed from the keyboard (reported as ^[[r;cR). Note: it is not practical to capture this information in a shell script.
sc                  | ^[7             | Saves current cursor position and style.
rc                  | ^[8             | Restores previously saved cursor position and style.
cuu r               | ^[[rA           | Moves cursor up r rows.
cud r               | ^[[rB           | Moves cursor down r rows.
cuf c               | ^[[cC           | Moves cursor right c columns.
cub c               | ^[[cD           | Moves cursor left c columns.
(no equivalent)     | ^[[?7l          | Disables automatic line wrapping when the cursor reaches the right edge of the screen.
(no equivalent)     | ^[[?7h          | Enables line wrapping (on by default).
(no equivalent)     | ^[[r            | Enables whole-screen scrolling (on by default).
(no equivalent)     | ^[[S;Er         | Enables partial-screen scrolling from row S to row E and moves the cursor to the top of this region.
do                  | ^[D             | Moves the cursor down by one line.
up                  | ^[M             | Moves the cursor up by one line.

Attribute and Color Escape Sequences

Attribute and color escape sequences allow you to change the attributes or color for text that you have not yet drawn. No escape sequence (scrolling notwithstanding) changes anything that has already been drawn on the screen. Escape sequences apply only to subsequent text.

For example, to draw a red “W” character, first send the escape sequence to set the foreground color to red (^[[31m), then print a “W” character, then send an attribute reset sequence (^[[m), if desired.

The attribute and color escape codes can be combined with other attribute and color escape codes in the form ^[[#;#;#;...#m. For example, you can combine the escape sequences ^[[1m (bold) and ^[[32m (green text) into the sequence ^[[1;32m. Listing 11-8 prints a familiar phrase in multiple colors.

Listing 11-8  Using ANSI color

#!/bin/sh
 
printf '\e[41mH\e[42me\e[43ml\e[44;32ml\e[45mo\e[m \e[46;33m'
printf 'W\e[47;30mo\e[40;37mr\e[49;39ml\e[41md\e[42m!\e[m\n'
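When you apply the same kind of formatting in many places, a subroutine that takes the combined codes as an argument can keep the script readable. The following is a sketch; the name color_text is hypothetical, and the first argument is assumed to be a semicolon-separated list of codes.

```sh
#!/bin/sh

# color_text CODES TEXT: print TEXT using the given attribute and
# color codes, then reset all attributes.
color_text()
{
        printf '\033[%sm%s\033[m' "$1" "$2"
}

color_text '1;32' 'OK'        # bold green text
echo
color_text '41;37' 'FAILED'   # white text on a red background
echo
```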

Table 11-2 contains a list of capabilities and escape sequences that control text style.

Table 11-2  Attribute escape sequences

Terminfo capability | Escape sequence | Description
Resetting attributes:
me                  | ^[[m or ^[[0m   | Resets all attributes to their default values.
Setting attributes:
bold                | ^[[1m           | Enables “bold” display. This code and code #2 (dim) are mutually exclusive.
dim                 | ^[[2m           | Enables “dim” display. This code and code #1 (bold) are mutually exclusive. Not supported in Terminal.
so                  | ^[[3m           | Enables “standout” display. Not supported in Terminal. Note: In the terminfo database entry for Terminal, this attribute is mapped to inverse because the VT100 “standout” mode is not supported.
us                  | ^[[4m           | Enables underlined display.
blink               | ^[[5m           | Enables blinking display (<blink>). Note: The terminfo entry for Terminal does not support this option.
(no equivalent)     | ^[[6m           | Fast blink or strike-through. (Not supported in Terminal; behavior inconsistent elsewhere.)
mr                  | ^[[7m           | Enables reversed (inverse) display.
invis               | ^[[8m           | Enables hidden (background-on-background) display. Note: The terminfo entry for Terminal does not support this option.
                    | ^[[9m           | Unused.
                    | Codes 10m to 19m | Font selection codes. Unsupported in most terminal applications, including Terminal.
Clearing attributes:
(no equivalent)     | ^[[20m          | “Fraktur” typeface. Unsupported almost universally, and Terminal is no exception.
                    | ^[[21m          | Unused.
se                  | ^[[22m          | Disables “bright” or “dim” display. This disables either code 1m or 2m. Note: Technically, this capability is supposed to end standout mode, but it is overloaded to disable bright/dim mode as well.
se                  | ^[[23m          | Disables “standout” display. Not supported in Terminal.
ue                  | ^[[24m          | Disables underlined display.
(no equivalent)     | ^[[25m          | Disables blinking display (</blink>). Also disables fast blink or strike-through (6m) on terminals that support that attribute. Use me to disable all attributes instead.
                    | ^[[26m          | Unused.
(no equivalent)     | ^[[27m          | Disables reversed (inverse) display. Use me to disable all attributes instead.
(no equivalent)     | ^[[28m          | Disables hidden (background-on-background) display. Use me to disable all attributes instead.
                    | ^[[29m          | Unused.

Table 11-3 contains a list of capabilities and escape sequences that control text and background colors.

Table 11-3  Color escape sequences

Terminfo capability | Escape sequence | Description
Foreground colors:
setaf 0             | ^[[30m          | Sets foreground color to black.
setaf 1             | ^[[31m          | Sets foreground color to red.
setaf 2             | ^[[32m          | Sets foreground color to green.
setaf 3             | ^[[33m          | Sets foreground color to yellow.
setaf 4             | ^[[34m          | Sets foreground color to blue.
setaf 5             | ^[[35m          | Sets foreground color to magenta.
setaf 6             | ^[[36m          | Sets foreground color to cyan.
setaf 7             | ^[[37m          | Sets foreground color to white.
                    | ^[[38m          | Unused.
setaf 9             | ^[[39m          | Sets foreground color to the default.
Background colors:
setab 0             | ^[[40m          | Sets background color to black.
setab 1             | ^[[41m          | Sets background color to red.
setab 2             | ^[[42m          | Sets background color to green.
setab 3             | ^[[43m          | Sets background color to yellow.
setab 4             | ^[[44m          | Sets background color to blue.
setab 5             | ^[[45m          | Sets background color to magenta.
setab 6             | ^[[46m          | Sets background color to cyan.
setab 7             | ^[[47m          | Sets background color to white.
                    | ^[[48m          | Unused.
setab 9             | ^[[49m          | Sets background color to the default.

Other Escape Sequences

In addition to providing text formatting, ANSI escape sequences provide the ability to reset the terminal, clear the screen (or portions thereof), clear a line (or portions thereof), and set or clear tab stops.

For example, to clear all existing tab stops and set a single tab stop at column 20, you could use the snippet shown in Listing 11-9.

Listing 11-9  Setting tab stops

#!/bin/sh
echo # Start on a new line
printf "\e[19C" # move right 19 columns to column 20
printf "\e[3g" # clear all tab stops
printf "\e[W" # set a new tab stop
printf "\e[19D" # move back to the left
printf "Tab test\tThis starts at column 20."

Table 11-4 contains a list of capabilities and escape sequences that perform other miscellaneous tasks such as cursor control, tab stop manipulation, and clearing the screen or portions thereof.

Table 11-4  Other escape codes

Terminfo capability | Escape sequence | Description
Resetting the terminal:
reset               | ^[c             | Resets the background and foreground colors to their default values, clears the screen, and moves the cursor to the home position. Note: The reset capability resets many more things than ^[c. It is also technically not a single capability but rather the concatenation of rs1, rs2, and rs3.
Clearing the screen:
cd                  | ^[[J or ^[[0J   | Clears to the bottom of the screen using the current background color.
(no equivalent)     | ^[[1J           | Clears to the top of the screen using the current background color.
cl                  | ^[[2J           | Clears the screen to the current background color. On some terminals, the cursor is reset to the home position.
Clearing the current line:
ce                  | ^[[K or ^[[0K   | Clears to the end of the current line.
cb                  | ^[[1K           | Clears to the beginning of the current line. Note: Not supported in the terminfo entry for Terminal.
(no equivalent)     | ^[[2K           | Clears the current line.
Tab stops:
hts                 | ^[[W or ^[[0W   | Set horizontal tab at cursor position.
(no equivalent)     | ^[[1W           | Set vertical tab at current line. (Not supported in Terminal.)
                    | Codes 2W to 6W  | Redundant codes equivalent to codes 0g to 3g.
(no equivalent)     | ^[[g or ^[[0g   | Clear horizontal tab at cursor position.
(no equivalent)     | ^[[1g           | Clear vertical tab at current line. (Not supported in Terminal.)
(no equivalent)     | ^[[2g           | Clear horizontal and vertical tab stops for current line only. (Not supported in Terminal.)
tbc                 | ^[[3g           | Clear all horizontal tabs.
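One common use of the line-clearing codes is a status line that is rewritten in place. The sketch below returns to the start of the line with a carriage return, prints the new text, and then clears whatever remains of the old text with ^[[K. The name show_status is hypothetical, and an ANSI-compatible terminal is assumed.

```sh
#!/bin/sh

# show_status TEXT: replace the contents of the current line with TEXT.
show_status()
{
        printf '\r%s\033[K' "$1"
}

show_status "Copying file 1 of 3"
show_status "Copying file 2 of 3"
show_status "Done"
echo
```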

For More Information

The tables in this chapter provide only some of the more commonly used escape sequences and terminfo capabilities. You can find an exhaustive list of ANSI escape sequences at http://www.inwap.com/pdp10/ansicode.txt and an exhaustive list of terminfo capabilities in the manual page for terminfo.

Before using capabilities or escape sequences not in this chapter, however, you should be aware that most terminal software (including Terminal in OS X) does not support the complete set of ANSI escape sequences or terminfo capabilities.

Nonblocking I/O

Most shell scripts do not need to accept user input at all during execution, and scripts that do require user input can generally request it a line at a time. However, if you are writing a shell script that needs to interact with the user while performing background activity, it can be convenient to simulate asynchronous timer events and asynchronous input and output.

First, a warning: nonblocking I/O is not possible in a pure shell script. It requires the use of an external tool that sets the terminal to nonblocking. Setting the terminal to nonblocking can seriously confuse the shell, so you should not mix nonblocking I/O and blocking I/O in the same program.

With that caveat, you can perform nonblocking I/O by writing a small C helper such as this one:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
 
int main(int argc, char *argv[])
{
    int ch;
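    int flags = fcntl(STDIN_FILENO, F_GETFL);
    if (flags == -1) return -1; // could not read the descriptor flags
 
    // Put standard input into nonblocking mode.
    fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK);
 
    // fgetc returns EOF (-1) when no byte is waiting or at end of file.
    ch = fgetc(stdin);
    if (ch == EOF) return -1;
    printf("%c", ch);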
    return 0;
}

If you compile this tool and name it getch, you can then use it to perform nonblocking terminal input, as shown in the following example:

#!/bin/bash
 
stty -icanon -isig
while true ; do
        echo -n "Enter a character: "
        CHAR=`./getch`
        if [ "x$CHAR" = "x" ] ; then
                echo "NO DATA";
        else
                if [ "x$CHAR" = "xq" ] ; then
                        stty -cbreak
                        exit
                fi
                echo "DATA: $CHAR";
        fi
        sleep 1;
done
 
# never reached
stty -cbreak

This script prints “NO DATA” or “DATA: [some character]” depending on whether you have pressed a key in the past second. (To stop the script, press the Q key.) Using the same technique, you can write fairly complex shell scripts that can detect keystrokes while performing other tasks. For example, you might write a game of ping pong that checks for a keystroke at the beginning of each ball drawing loop and if it detects one, moves the user’s paddle by a few pixels.

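This script also illustrates another useful technique: disabling input buffering. The stty command changes settings on the controlling terminal (a device file that represents the current Terminal window, console, ssh session, or other communication channel). In this script, the -icanon flag disables canonical input mode so that each character is delivered to the program as soon as it is typed instead of after a complete line, and the -isig flag disables the generation of signals by keys such as Control-C and Control-Z.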

Depending on what you are doing, you may also find it useful to pass the -echo flag. This flag disables the automatic echo of typed characters to the screen. If you are capturing characters for a full-screen game, for example, echoing the typed characters to the screen tends to be disastrous, depending on how unlucky the user’s timing is when pressing the key.

Depending on what other flags you pass, you may want to reset the terminal more fully at the end by issuing the command stty sane. In OS X, this flag is identical to -cbreak, but in Linux and some other operating systems, the sane flag is a superset of the -cbreak flag.
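As an alternative to stty sane, a script can record the terminal settings that were in effect when it started and restore exactly those settings when it exits. The following is a minimal sketch, assuming a POSIX stty that supports the -g flag (which prints the current settings in a form that stty accepts back as an argument). The names SAVED_TTY and restore_tty are illustrative and do not appear elsewhere in this chapter.

```shell
#!/bin/sh

# Record the current settings, but only if standard input is a terminal.
SAVED_TTY=""
if [ -t 0 ] ; then
    SAVED_TTY="$(stty -g)"
fi

# Restore the recorded settings (does nothing if none were recorded).
restore_tty()
{
    if [ "x$SAVED_TTY" != "x" ] ; then
        stty "$SAVED_TTY"
    fi
}

# Run restore_tty whenever the script exits.
trap restore_tty EXIT

if [ -t 0 ] ; then
    # Disable line buffering, signal keys, and echo.
    stty -icanon -isig -echo
fi

# ... the interactive part of the script goes here ...
```

Because the saved string describes the settings the user actually had, this approach avoids changing anything that the script did not touch.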

Timing Loops

On rare occasions, you may find the need to perform some operation on a periodic basis with greater than the one second precision offered by sleep. Although the shell does not offer any precision timers, you can closely approximate such behavior through the use of a calibrated delay loop.

The basic design for such a loop consists of two parts: a calibration routine and a delay loop. The calibration routine should execute approximately the same instructions as the delay loop for a known number of iterations.

The nature of the instructions within the delay loop is largely unimportant. They can be any instructions that your program needs to execute while waiting for the desired amount of time to elapse. However, a common technique is to perform nonblocking I/O during the delay loop and then process any characters received.

For example, Listing 11-10 shows a very simple timing loop that reads a byte and triggers some simple echo statements (depending on what key is pressed) while simultaneously echoing a statement to the screen about once per second.

Listing 11-10  A simple one-second timing loop

#!/bin/sh
 
ONE_SECOND=1000
 
read_test()
{
    COUNT=0
    local ONE_SECOND=1000                       # ensure this never trips!
    while [ $COUNT -lt 200 ] ; do
        CHAR=`./getch`
        if [ "x$1" = "xrot" ] ; then
                CHAR=","
        fi
        case "$CHAR" in
                ( "q" | "Q" )
                        CONT=0;
                        GAMEOVER=1
                ;;
                ( "" )
                        # Silently ignore empty input.
                ;;
                ( * )
                        echo "Unknown key $CHAR"
                ;;
        esac
        COUNT=`expr $COUNT '+' 1`
        while [ $COUNT -ge $ONE_SECOND ] ; do
                COUNT=`expr $COUNT - $ONE_SECOND`
                MODE="clear";
                draw_cur $ROT;
                VPOS=`expr $VPOS '+' 1`
                MODE="apple";
                draw_cur $ROT
        done
    done
}
 
calibrate_timers()
{
    2>/tmp/readtesttime time $0 -readtest
    local READ_DUR=`grep real /tmp/readtesttime | sed 's/real.*//' | tr -d ' '`
    # echo "READ_DUR: $READ_DUR"
 
    local READ_SINGLE=`echo "scale=20; ($READ_DUR / 200)" | bc`
    ONE_SECOND=`echo "scale=0; 1.0  / $READ_SINGLE" | bc`
 
    # echo "READ_SINGLE: $READ_SINGLE";
    # exit
 
    echo "One second is about $ONE_SECOND cycles."
}
 
if [ "x$1" = "x-readtest" ] ; then
        read_test
        exit
fi
 
echo "Calibrating.  Please wait."
calibrate_timers
 
echo "Done calibrating.  You should see a message about once per second.  Press 'q' to quit."
stty -icanon -isig
 
GAMEOVER=0
COUNT=0
# Start the game loop.
while [ $GAMEOVER -eq 0 ] ; do
        # echo -n "Enter a character: "
        CHAR=`./getch`
        case "$CHAR" in
                ( "q" | "Q" )
                        CONT=0;
                        GAMEOVER=1
                ;;
                ( "" )
                        # Silently ignore empty input.
                ;;
                ( * )
                        echo "Unknown key $CHAR"
                ;;
        esac
        COUNT=`expr $COUNT '+' 1`
        while [ $COUNT -ge $ONE_SECOND ] ; do
                COUNT=`expr $COUNT - $ONE_SECOND`
                echo "One second elapsed (give or take)."
        done
done
 
stty sane

In a real-world timing loop, you will probably have keys that perform certain operations that take time, such as moving a piece on a checkerboard. In that case, your calibration should also perform a series of tests to approximate the amount of time for each of those operations.

If you divide the time for the slow operation by the duration of a single read operation (READ_SINGLE), you can discern an approximate penalty for the move using iterations of the main program loop as the unit value. Then, when you perform one of those operations later, you simply add that penalty value to the main loop counter, thus ensuring that the “One second elapsed” messages will quickly catch up with (approximately) where they should be.

You can approximate this further by using larger numbers in your loop counter to achieve greater precision. For example, you might increment your loop counter by 100 instead of by 1. This will give a much more accurate approximation of the number of cycles stolen by a slow operation.
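The penalty calculation described above can be sketched with integer arithmetic. This example assumes that both durations have already been measured and converted to microseconds; the function name penalty_for and the sample numbers are hypothetical. The third argument is the amount by which the main loop increments its counter on each pass.

```shell
#!/bin/sh

# penalty_for SLOW_US READ_US STEP
# Prints the number of counter units to add after a slow operation.
#   SLOW_US: measured duration of the slow operation, in microseconds
#   READ_US: measured duration of one pass through the loop, in microseconds
#   STEP:    amount added to the loop counter on each pass
penalty_for()
{
    expr "$1" \* "$3" / "$2"
}

# A 13000-microsecond move with a 5000-microsecond loop pass costs 2.6 passes.
COARSE="$(penalty_for 13000 5000 1)"    # counter steps by 1: truncated to 2
FINE="$(penalty_for 13000 5000 100)"    # counter steps by 100: 260

echo "Penalty with a step of 1:   $COARSE"
echo "Penalty with a step of 100: $FINE"
```

With a step of 100, the ONE_SECOND threshold must also be multiplied by 100 so that the counter still wraps about once per second.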

Background Jobs and Job Control

For end-user convenience in the days of text terminals, before the advent of tools like screen, the C shell added job control features. These features allow you to start a process in the background and work on other things, bring background tasks into the foreground, suspend foreground tasks to complete them later, and continue suspended tasks in the background.

Over the years, many modern Bourne shell variants, including bash and zsh, have added similar support. The details of using these commands from the command line are beyond the scope of this document, but in brief, Control-Z suspends the foreground process, fg brings a suspended or background job to the foreground, and bg causes a suspended job to resume executing in the background.

Up until this point, all of the scripts have involved a single process operating in the foreground. Indeed, most shell scripts operate in this fashion. Sometimes, though, parallelism can improve performance, particularly if the shell script is spawning a processor-hungry task. For this reason, this section describes programmatic ways to take advantage of background jobs in shell scripts.

To start a process running in the background, add an ampersand at the end of the statement. For example:

sleep 10 &

This will start a sleep process running in the background and will immediately return you to the command line. Ten seconds later, the command will finish executing, and the next time you hit return after that, you will see its exit status. Depending on your shell, it will look something like this:

[1]+  Done                    sleep 10

This indicates that the sleep command completed execution. A related feature is the wait builtin. This command causes the shell to wait for a specified background job to complete. If no job is specified, it will wait until all background jobs have finished.

The next example starts several commands in the background and waits for them to finish.

#!/bin/bash
 
delayprint()
{
    local TIME;
    TIME=$1
    echo "Sleeping for $TIME seconds."
    sleep $TIME
    echo "Done sleeping for $TIME seconds."
}
 
delayprint 3 &
delayprint 5 &
delayprint 7 &
wait

This script is a relatively simple example. It executes three commands at once, then waits until all of them have completed. This may be sufficient for some uses, but it leaves something to be desired, particularly if you care about whether the commands succeed or fail.

The following example is a bit more complex. It shows two different techniques for waiting for jobs. You should generally use the process ID when waiting for a child process. You can obtain the process ID of the last command using the $! shell variable.

If, however, you need to inspect a job using the jobs builtin, you must use the job ID. It can be somewhat clumsy to obtain a job ID because the job control mechanism in most Bourne shell variants was designed primarily for interactive use rather than programmatic use. Fortunately, there are few things that a well-written regular expression can’t fix.

#!/bin/bash
 
jobidfromstring()
{
        local STRING;
        local RET;
 
        STRING=$1;
        RET="$(echo $STRING | sed 's/^[^0-9]*//' | sed 's/[^0-9].*$//')"
 
        echo $RET;
}
 
delayprint()
{
        local TIME;
        TIME=$1
        echo "Sleeping for $TIME seconds."
        sleep $TIME
        echo "Done sleeping for $TIME seconds."
}
 
# Use the job ID for this one.
delayprint 3 &
DP3=`jobidfromstring $(jobs %%)`
 
# Use the process ID this time.
delayprint 5 &
DP5=$!
 
delayprint 7 &
DP7=`jobidfromstring $(jobs %%)`
 
echo "Waiting for job $DP3";
wait %$DP3
 
echo "Waiting for process ID $DP5";
# No percent because it is a process ID
wait $DP5
 
echo "Waiting for job $DP7";
wait %$DP7
 
echo "Done."

This example passes a job number or process ID argument to the jobs builtin to tell it which job you want to find out information about. Job numbers begin with a percent (%) sign and are normally followed by a number.

In this case, however, a second percent sign is used. The %% job is one of a number of special job “numbers” that the shell provides. It tells the jobs builtin to output information about the last command that was executed in the background. The result of this jobs command is a status string like the one shown earlier. This string is passed as a series of arguments to the jobidfromstring subroutine, which then prints the job ID by itself. The output of this subroutine, in turn, is stored into either the variable DP3 or DP7.
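To see what the two sed expressions in jobidfromstring do, you can run them on a fixed status line. This sketch uses made-up status lines in the format that bash prints; the helper name job_id_from_line is hypothetical.

```shell
#!/bin/sh

# Print the first run of digits in a job status line.
job_id_from_line()
{
    # The first expression deletes everything before the first digit;
    # the second deletes everything from the first nondigit onward.
    echo "$1" | sed 's/^[^0-9]*//' | sed 's/[^0-9].*$//'
}

job_id_from_line "[1]+  Running                 sleep 10 &"
job_id_from_line "[12]-  Running                 sleep 30 &"
```

The two calls print 1 and 12, respectively, which are the values you would place after a percent sign when calling wait.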

This example also demonstrates how to wait for a job based on process ID using a special shell variable, $!, which contains the process ID of the last command started in the background. This value is stored into the variable DP5. Process IDs are generally preferred over job IDs when using the wait builtin in scripts (as opposed to hand-entered use of job control commands).

Finally, the script ends with a series of calls to the wait builtin. These commands tell the shell to wait for a child process to exit. When a child process exits, the shell reaps the process, stores its exit status in the $? variable, and returns control to the script.

The wait builtin can take a job ID or process ID. If you specify a job or process ID, the shell does not return control to the script until the specified job or process exits. If no process or job ID is specified, the wait builtin does not return until all background jobs have finished.

A job ID consists of a percent sign followed by the job number (obtained from either the variable DP3 or DP7). A process ID is just the number itself.
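Because the wait builtin reports the exit status of the process it waited for, a script can tell which background commands succeeded. The following sketch assumes two hypothetical functions, succeed and fail, that stand in for real work.

```shell
#!/bin/sh

# Two stand-ins for real background work.
succeed() { sleep 1; return 0; }
fail()    { sleep 1; return 3; }

succeed &
PID1=$!
fail &
PID2=$!

# wait returns the exit status of the process it waited for.
# Capture $? immediately, before another command overwrites it.
STATUS1=0
wait $PID1 || STATUS1=$?
STATUS2=0
wait $PID2 || STATUS2=$?

echo "Process $PID1 exited with status $STATUS1"
echo "Process $PID2 exited with status $STATUS2"
```

The order of the wait calls does not have to match the order in which the processes finish; the shell remembers the status of a child that has already exited until the script asks for it.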

The final example shows how to execute a limited number of concurrent jobs in which the order of job completion is not important.

#!/bin/bash
 
MAXJOBS=3
 
spawnjob()
{
    echo $1 | bash
}
 
clearToSpawn()
{
    local JOBCOUNT="$(jobs -r | grep -c .)"
    if [ $JOBCOUNT -lt $MAXJOBS ] ; then
        echo 1;
        return 1;
    fi
 
    echo 0;
    return 0;
}
 
JOBLIST=""
 
COMMANDLIST='ls
echo "sleep 3"; sleep 3; echo "sleep 3 done"
echo "sleep 10"; sleep 10 ; echo "sleep 10 done"
echo "sleep 1"; sleep 1; echo "sleep 1 done"
echo "sleep 5"; sleep 5; echo "sleep 5 done"
echo "sleep 7"; sleep 7; echo "sleep 7 done"
echo "sleep 2"; sleep 2; echo "sleep 2 done"
'
 
IFS="
"
 
for COMMAND in $COMMANDLIST ; do
    while [ `clearToSpawn` -ne 1 ] ; do
        sleep 1
    done
    spawnjob $COMMAND &
    LASTJOB=$!
    JOBLIST="$JOBLIST $LASTJOB"
done
 
IFS=" "
 
for JOB in $JOBLIST ; do
    wait $JOB
    echo "Job $JOB exited with status $?"
done
 
echo "Done."

Most of the code here is straightforward. It is worth noting, however, that in the subroutine clearToSpawn, the -r flag must be passed to the jobs builtin to restrict output to currently running jobs. Without this flag, the jobs builtin would return a list that included completed jobs, thus making the count of running jobs incorrect.

The -c flag to grep causes it to return the number of matching lines rather than the lines themselves, and the period causes it to match on any nonblank lines (those containing at least one character). Thus, the JOBCOUNT variable contains the number of currently running jobs, which is, in turn, compared to the value MAXJOBS to determine whether it is appropriate to start another job or not.
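The effect of grep -c . is easy to verify on fixed input. This sketch feeds it three lines, one of which is empty; the variable name COUNT is illustrative.

```shell
#!/bin/sh

# Three lines of input, the second of which is empty.
COUNT="$(printf 'job one\n\njob two\n' | grep -c .)"

# Only the two nonempty lines are counted.
echo "Nonempty lines: $COUNT"
```

When no jobs are running, the input is empty and grep prints 0, which is why clearToSpawn can compare the result directly against MAXJOBS.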

Application Scripting With osascript

OS X provides a powerful application scripting environment called AppleScript. With AppleScript, you can launch an application, tell a running application to perform various tasks, query a running application in various ways, and so on. Shell script programmers can harness this power through the osascript tool.

The osascript tool executes a program in the specified language and prints the results via standard output. If no program file is specified, it reads the program from standard input.

The first example is fairly straightforward. It opens the file poem.txt in the directory above the directory where the script is located:

Listing 11-11  Opening a file using AppleScript and osascript: 07_osascript_simple.sh

#!/bin/sh
 
POEM="$PWD/../poem.txt"
 
cat << EOF | osascript -l AppleScript
launch application "TextEdit"
tell application "TextEdit"
        open "$POEM"
end tell
EOF

You should notice that the path to the file poem.txt is specified as an absolute path here. This is crucial when working with osascript. Because the current working directory of a launched application is always the root of the file system (the / directory) rather than the shell script’s working directory, a script must pass an absolute path to AppleScript rather than a path relative to the script’s working directory.
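If a script receives a filename from the user, it cannot assume that the path is absolute. One way to convert it before handing it to osascript is sketched below. The function name absolute_path is hypothetical, and the sketch does not resolve symbolic links or simplify .. components.

```shell
#!/bin/sh

# Print an absolute version of the path in $1.
absolute_path()
{
    case "$1" in
        /*) echo "$1" ;;          # already absolute
        *)  echo "$PWD/$1" ;;     # relative to the current directory
    esac
}

absolute_path "/tmp/poem.txt"
absolute_path "../poem.txt"
```

The first call prints its argument unchanged; the second prints the argument prefixed with the current working directory.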

The next example shows how to query an application. In this case, it launches TextEdit, opens two files, asks TextEdit for a list of open documents, and uses that list to help it ask TextEdit to return the first paragraph of text in the document that corresponds with the poem.txt file.

Listing 11-12  Working with a file using AppleScript and osascript: 08_osascript_para.sh

#!/bin/sh
 
# Get an absolute path for the poem.txt file.
POEM="$PWD/../poem.txt"
 
# Get an absolute path for the script file.
SCRIPT="$(which $0)"
if [ "x$(echo $SCRIPT | grep '^\/')" = "x" ] ; then
    SCRIPT="$PWD/$SCRIPT"
fi
 
# Launch TextEdit and open both the poem and script files.
cat << EOF | osascript -l AppleScript > /dev/null
launch application "TextEdit"
tell application "TextEdit"
    open "$POEM"
end tell
 
set myDocument to result
return number of myDocument
EOF
 
cat << EOF | osascript -l AppleScript > /dev/null
launch application "TextEdit"
tell application "TextEdit"
        open "$SCRIPT"
end tell
 
set myDocument to result
return number of myDocument
EOF
 
 
# Tell the shell not to mangle newline characters, tabs, or whitespace.
IFS=""
 
# Ask TextEdit for a list of open documents.  From this, we can
# obtain a document number that corresponds with the poem.txt file.
# This query returns a newline-delimited list of open files. Each
# line contains the file number, followed by a tab, followed by the
# filename
DOCUMENTS="$(cat << EOF | osascript -l AppleScript
 
    tell application "TextEdit"
        documents
    end tell
 
    set myList to result         -- Store the result of "documents" message into variable "myList"
    set myCount to count myList  -- Store the number of items in myList into myCount
    set myRet to ""              -- Create an empty string variable called "myRet"
 
    (* Loop through the myList array and build up a string in the myRet variable
       containing one line per entry in the form:
 
        number tab_character name
      *)
    repeat with myPos from 1 to myCount
        set myRet to myRet & myPos & "\t" & name of item myPos of myList & "\n"
    end repeat
    return myRet
EOF
)"
 
# Determine the document number that corresponds with the poem.txt
# file.
DOCNUMBER="$(echo $DOCUMENTS | grep '[[:space:]]poem\.txt' | grep -v ' poem\.txt' | head -n 1 | sed 's/\([0-9][0-9]*.\).*/\1/')"
SECOND_DOCNUMBER="$(echo $DOCUMENTS | grep '[[:space:]]poem\.txt' | grep -v ' poem\.txt' | tail -n 1 | sed 's/\([0-9][0-9]*.\).*/\1/')"
 
if [ $DOCNUMBER -ne $SECOND_DOCNUMBER ] ; then
    echo "WARNING: You have more than one file named poem.txt open.  Using the" 1>&2
    echo "most recently opened file." 1>&2
    echo "DOCNUMBER $DOCNUMBER != $SECOND_DOCNUMBER"
fi
 
echo "DOCNUMBER: $DOCNUMBER"
 
if [ "x$DOCNUMBER" != "x" ] ; then
    # Query poem.txt by number
    FIRSTPARAGRAPH="$(cat << EOF | osascript -l AppleScript
        tell application "TextEdit"
            paragraph 1 of document $DOCNUMBER
        end tell
EOF
    )"
    echo "The first paragraph of poem.txt is:"
    echo "$FIRSTPARAGRAPH"
fi
 
# Query poem.txt by name
FIRSTPARAGRAPH="$(cat << EOF | osascript -l AppleScript
        tell application "TextEdit"
            paragraph 1 of document "poem.txt"
        end tell
EOF
)"
echo "The first paragraph of poem.txt is:"
echo "$FIRSTPARAGRAPH"

This script illustrates three very important concepts.

The final example shows how to manipulate images using shell scripts and AppleScript. It scales the image to be as close to 320x480 or 480x320 (depending on the orientation of the image) as possible.

Listing 11-13  Resizing an image using Image Events and osascript: 09_osascript_images.sh

#!/bin/sh
 
# Maximum sizes for the long and short sides of the scaled image.
 
MAXLONG=480
MAXSHORT=320
 
URL="http://images.apple.com/macpro/images/design_smartdesign_hero20080108.png"
FILE="$PWD/my design_smartdesign_hero20080108.png"
OUTFILE="$PWD/my design_smartdesign_hero20080108-mini.png"
 
if [ ! -f "$FILE" ] ; then
    curl "$URL" > "$FILE"
fi
 
# Tell the shell not to mangle newline characters, tabs, or whitespace.
IFS=""
 
# Obtain image information
DIM="$(cat << EOF | osascript -l AppleScript
tell application "Image Events"
    launch
    set this_image to open "$FILE"
    copy dimensions of this_image to {W, H}
    close this_image
end tell
return W & H
EOF
)"
 
W="$(echo "$DIM" | sed 's/ *, *.*//' )"
H="$(echo "$DIM" | sed 's/.* *, *//' )"
 
echo WIDTH: $W HEIGHT: $H
 
if [ $W -gt $H ] ; then
    LONG=$W
    SHORT=$H
else
    LONG=$H
    SHORT=$W
fi
 
# echo "LONG: $LONG SHORT: $SHORT"
# echo "MAXLONG: $MAXLONG MAXSHORT: $MAXSHORT"
 
NEWLONG=$LONG
NEWSHORT=$SHORT
NEWSCALE=1      # Default scale factor if the image already fits.
 
if [ $NEWLONG -gt $MAXLONG ] ; then
    # Long direction is too big.
    NEWLONG="$(echo "scale=20; $LONG * ($MAXLONG/$LONG)" | bc | sed 's/\..*//')";
    NEWSHORT="$(echo "scale=20; $SHORT * ($MAXLONG/$LONG)" | bc | sed 's/\..*//')";
    NEWSCALE="$(echo "scale=20; ($MAXLONG/$LONG)" | bc)";
fi
 
# echo "PART 1: NEWLONG: $NEWLONG NEWSHORT: $NEWSHORT"
 
if [ $NEWSHORT -gt $MAXSHORT ] ; then
    # Short direction is still too big.
    NEWLONG="$(echo "scale=20; $LONG * ($MAXSHORT/$SHORT)" | bc | sed 's/\..*//')";
    NEWSHORT="$(echo "scale=20; $SHORT * ($MAXSHORT/$SHORT)" | bc | sed 's/\..*//')";
    NEWSCALE="$(echo "scale=20; ($MAXSHORT/$SHORT)" | bc)";
fi
 
# echo "PART 2: NEWLONG: $NEWLONG NEWSHORT: $NEWSHORT"
 
if [ $W -gt $H ] ; then
    NEWWIDTH=$NEWLONG
    NEWHEIGHT=$NEWSHORT
else
    NEWHEIGHT=$NEWLONG
    NEWWIDTH=$NEWSHORT
fi
 
echo "DESIRED WIDTH: $NEWWIDTH NEW HEIGHT: $NEWHEIGHT (SCALE IS $NEWSCALE)"
 
cp "$FILE" "$OUTFILE"
 
DIM="$(cat << EOF | osascript -l AppleScript
tell application "Image Events"
    launch
    set this_image to open "$OUTFILE"
    scale this_image by factor $NEWSCALE
    save this_image with icon
    copy dimensions of this_image to {W, H}
    close this_image
end tell
return W & H
EOF
)"
 
GOTW="$(echo "$DIM" | sed 's/ *, *.*//' )"
GOTH="$(echo "$DIM" | sed 's/.* *, *//' )"
 
echo "NEW WIDTH: $GOTW NEW HEIGHT: $GOTH"
 

Of course, you could just as easily perform these calculations in AppleScript itself, but this demonstrates how easy it is for shell scripts to exchange information with AppleScript code, manipulate image files, and tell applications to perform other complex tasks.

For more information about manipulating images with Image Events, see http://www.apple.com/applescript/imageevents/. You can also find many other AppleScript examples at http://www.apple.com/applescript/examples.html.

Scripting Interactive Tools Using File Descriptors

Most of the time, you should use expect scripts or C programs to control interactive tools. However, it is sometimes possible, albeit sometimes difficult, to script such interactive tools (if their output is line-based). This section explains the techniques you use.

Creating Named Pipes

Before you can communicate with a tool in a continuous round-trip fashion, you must create a pair of FIFOs (short for first-in, first-out, otherwise known as named pipes) using the mkfifo command. For example, to create named pipes called /tmp/infifo and /tmp/outfifo, you would issue the following commands:

mkfifo /tmp/infifo
mkfifo /tmp/outfifo

To see this in action using the sed command as a filter, type the following commands:

mkfifo /tmp/outfifo
sed 's/a/b/' < /tmp/outfifo &
echo "This is a test" > /tmp/outfifo

Notice that sed exits after receiving the data and printing This is b test to the screen. The echo command opens the output FIFO, writes the data, and closes the FIFO. As soon as it closes the FIFO, the sed command reads an end-of-file condition and terminates. To use a command-line tool as a filter and keep passing data to it, you must make sure that you don't close the FIFO until you are finished using the filter. To achieve this, you must use file descriptors, as described in the next section.

Opening File Descriptors for Reading and Writing

As explained in Creating Named Pipes, sending data to a named pipe with command-line tools causes the command to terminate after the first message. To prevent this, you must open a file descriptor in the shell to provide continuous access to the named pipe.

You can open a file descriptor for writing to the output FIFO as follows:

exec 8> /tmp/outfifo

This command opens file descriptor 8 and redirects it to the file /tmp/outfifo.

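Similarly, you can open a descriptor for reading like this (the <> operator opens the FIFO for both reading and writing, which typically prevents the shell from blocking while it waits for a writer):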

exec 9<> /tmp/infifo

You can write data to an open descriptor like this:

# Write a string to descriptor 8
echo "This is a test." >&8

You can read a line from an open descriptor like this:

# Read a line from descriptor 9 and store the result in variable MYLINE
read MYLINE <&9

When you have finished writing data to the filter, you should close the pipes and delete the FIFO files as follows:

exec 8>&-
exec 9<&-
rm /tmp/infifo
rm /tmp/outfifo

Table 11-5 summarizes the operations you can perform on file descriptors. The next section contains a complete working example.

Table 11-5  Shell file descriptor operators

Operator            Equivalent C code

n<> "filename"      fd = open("filename", O_RDWR|O_CREAT, 0666);
                    dup2(fd, n);
                    close(fd);

n> "filename"       fd = open("filename", O_WRONLY|O_CREAT|O_TRUNC, 0666);
                    dup2(fd, n);
                    close(fd);

n>> "filename"      fd = open("filename", O_WRONLY|O_APPEND|O_CREAT, 0666);
                    dup2(fd, n);
                    close(fd);

n<&o                dup2(o, n);
n>&o
                    Note: Although these operators behave identically, for readability, you should use the <& operator for read-only or read-write descriptors and the >& operator for write-only descriptors.

n<&-                close(n);
n>&-
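You can experiment with several of these operators without a FIFO. The following sketch uses a temporary regular file created with mktemp (an assumption; any writable path would do) to show that a descriptor stays open across several commands until it is explicitly closed. The variable names are illustrative.

```shell
#!/bin/sh

TMPFILE="$(mktemp /tmp/fdtest.XXXXXX)"

# Open descriptor 8 for writing, write two lines, then close it.
exec 8> "$TMPFILE"
echo "first line" >&8
echo "second line" >&8
exec 8>&-

# Open descriptor 9 for reading.  Each read continues where the
# previous one stopped because the descriptor remains open.
exec 9< "$TMPFILE"
read LINE1 <&9
read LINE2 <&9
exec 9<&-

rm -f "$TMPFILE"

echo "Read back: $LINE1 / $LINE2"
```

If the script had instead redirected each echo directly to the file with the > operator, the second echo would have truncated the file and overwritten the first line.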

Using Named Pipes and File Descriptors to Create Circular Pipes

There’s just one more problem. The sed command buffers its output by default when it is not writing to a terminal. This can cause problems when using it as a filter, because the filtered lines may not appear until the buffer fills. Thus, you must tell the sed command to use line buffering by specifying the -l flag (or the -u flag for GNU sed).

The following listing demonstrates these techniques. It runs sed, then sends two strings to it, then reads back the two filtered strings, then sends a third string, then reads the third filtered string back, then closes the pipes.

Listing 11-14  Using FIFOs to create circular pipes

#!/bin/sh
 
# Create two FIFOs (named pipes)
INFIFO="/tmp/infifo.$$"
OUTFIFO="/tmp/outfifo.$$"
mkfifo "$INFIFO"
mkfifo "$OUTFIFO"
 
# OS X and recent *BSD sed uses -l for line-buffered mode.
BUFFER_FLAG="-l"
 
# GNU sed uses -u for "unbuffered" mode (really line-buffered).
if [ "x$(sed --version 2>&1 | grep GNU)" != "x" ] ; then
    BUFFER_FLAG="-u"
fi
 
# Start sed in the background, reading from the input FIFO and writing to the output FIFO.
sed $BUFFER_FLAG 's/a test/not a test/' < $INFIFO > $OUTFIFO &
PID=$!
 
# Open a file descriptor (#8) to write to the input FIFO
exec 8> $INFIFO
 
# Open a file descriptor (#9) to read from the output FIFO.
exec 9<> $OUTFIFO
 
# Send two lines of text to the running copy of sed.
echo "This is a test." >&8
echo "This is maybe a test." >&8
 
# Read the first two lines from sed's output.
read A <&9
echo "Result 1: $A"
read A <&9
echo "Result 2: $A"
 
# Send another line of text to the running copy of sed.
echo "This is also a test." >&8
 
# Read it back.
read A <&9
echo "Result 3: $A"
 
# Show that sed is still running.
ps -p $PID
 
# Close the pipes to terminate sed.
exec 8>&-
exec 9<&-
 
# Show that sed is no longer running.
ps -p $PID
 
# Clean up the FIFO files in /tmp
rm "$INFIFO"
rm "$OUTFIFO"
 

Networking With Shell Scripts

By building on the concepts in Using Named Pipes and File Descriptors to Create Circular Pipes, you can easily write scripts that communicate over the Internet using TCP/IP with the netcat utility, nc. This utility is commonly available in various forms on different platforms, and the available flags vary somewhat from platform to platform.

The following listing shows how to write a very simple daemon based on netcat that works portably. It listens on port 4242. When a client connects, it reads a line of text, then sends the client the same line, only backwards. It repeats this process until the client closes the connection.

Listing 11-15  A simple daemon based on netcat

#!/bin/sh
 
INFIFO="/tmp/infifo.$$"
OUTFIFO="/tmp/outfifo.$$"
 
# /*! Cleans up the FIFOs and kills the netcat helper. */
cleanup_daemon()
{
    rm -f "$INFIFO" "$OUTFIFO"
 
    if [ "$NCPID" != "" ] ; then
        kill -TERM "$NCPID"
    fi
 
    exit
}
 
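# /*! @abstract Handles a SIGPIPE caused by writing after netcat exits. */
reconnect()
{
        PSOUT="$(ps -p $NCPID | tail -n +2 | tr -d '\n')"
        if [ "$PSOUT" = "" ] ; then
                # netcat has exited; let the main loop start a new copy.
                CONNECTED=0
        fi
        # Close the stale descriptor; the main loop reopens it.
        exec 8>&-
}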
 
trap cleanup_daemon SIGHUP
trap cleanup_daemon SIGTERM
trap reconnect SIGPIPE
trap cleanup_daemon SIGABRT
trap cleanup_daemon SIGTSTP
# trap cleanup_daemon SIGCHLD
trap cleanup_daemon SIGSEGV
trap cleanup_daemon SIGBUS
trap cleanup_daemon SIGQUIT
trap cleanup_daemon SIGINT
 
mkfifo "$INFIFO"
mkfifo "$OUTFIFO"
 
# /*! Reverses a string. */
reverseit()
{
    STRING="$1"
 
    REPLY=""
 
    while [ "$STRING" != "" ] ; do
        FIRST="$(echo "$STRING" | cut -c '1')"
        STRING="$(echo "$STRING" | cut -c '2-')"
        REPLY="$FIRST$REPLY"
    done
 
    echo "$REPLY"
}
 
while true ; do
    CONNECTED=1
    nc -l 4242 < $INFIFO > $OUTFIFO &
    NCPID=$!
 
    exec 8> $INFIFO
    exec 9<> $OUTFIFO
 
    while [ $CONNECTED = 1 ]  ; do
        read -u9 -t1 REQUEST
 
        if [ $? = 0 ] ; then
            # Read didn't time out.
            reverseit "$REQUEST" >&8
            echo "GOT REQUEST $REQUEST"
        fi
 
        CONNECTED="$(jobs -r | grep -c .)"
    done
done
 

This daemon is designed to be portable, which limits the flags it can use. As a result, it can only handle a single client at any given time, with a minimum of a one second period between connection attempts. This is the easiest way to use the netcat utility. For a more complex example, see A Shell-Based Web Server.

You can also use netcat as a networking client in much the same way. You might send a request to a web server, a mail server, or other daemon. Of course, you are generally better off using existing clients such as curl or sendmail, but when that is not possible, netcat provides a solution.

The following listing connects to the daemon shown in Listing 11-15, requests input from the user, sends the input to the remote daemon, reads the result, and prints it to standard output.

Listing 11-16  A simple client based on netcat

#!/bin/sh
 
INFIFO="/tmp/infifo.$$"
OUTFIFO="/tmp/outfifo.$$"
 
# /*! Cleans up the FIFOs and kills the netcat helper. */
cleanup_client()
{
    rm -f "$INFIFO" "$OUTFIFO"
 
    if [ "$NCPID" != "" ] ; then
        kill -TERM "$NCPID"
    fi
 
    exit
}
 
# /*! @abstract Handles a SIGPIPE caused by writing after netcat exits. */
reconnect()
{
        PSOUT="$(ps -p $NCPID | tail -n +2 | tr -d '\n')"
        if [ "$PSOUT" = "" ] ; then
                # netcat has exited, so the connection is gone.
                cleanup_client
        fi
        # Close the stale descriptor.
        exec 8>&-
}
 
trap cleanup_client SIGHUP
trap cleanup_client SIGTERM
trap reconnect SIGPIPE
trap cleanup_client SIGABRT
trap cleanup_client SIGTSTP
trap cleanup_client SIGCHLD
trap cleanup_client SIGSEGV
trap cleanup_client SIGBUS
trap cleanup_client SIGQUIT
trap cleanup_client SIGINT
 
mkfifo "$INFIFO"
mkfifo "$OUTFIFO"
 
nc localhost 4242 < $INFIFO > $OUTFIFO &
NCPID=$!
 
exec 8> $INFIFO
exec 9<> $OUTFIFO
 
while true ; do
    printf "String to reverse -> "
    read STRING
    echo "$STRING" >&8
    read -u9 REVERSED
    echo "$REVERSED"
done