Program Control

As with virtually all programming languages, the Unix shell provides the capability for conditional logic statements. Unlike other languages, however, this capability comes in two distinct components: a testing component and a control component. These separate components can be used independently or in combination with each other.

Conditional Testing Statements

The first component used to build conditional statements is the test command. The test command is typically referenced using pairs of square brackets, that is [ ]. Test is used to evaluate the logical condition of its enclosed arguments and therefore always returns a boolean value (as before, TRUE implies an exit status of zero, FALSE implies an exit status of non-zero). The test command is a very powerful and flexible command which requires careful attention to detail when used. The generic syntax of the test command looks as follows:

 	[ condition to test ]

The conditions for which test can check are quite numerous (and are partially listed in the table below). Starting with a simple example, consider the following:

	$ [ -f /etc/passwd ] [Enter]	
	$ echo $? [Enter]
	0

In this example, when used with the -f option, test returns a TRUE status if the file /etc/passwd exists and is an ordinary file; otherwise it returns FALSE (indicated by a non-zero status). Here the zero return status shows that /etc/passwd does exist as an ordinary file.
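
Because the brackets are simply another way of invoking the test command, the same check can be written either way. The following is a minimal sketch, assuming the widely available mktemp utility to create a scratch file; the variable names TMPFILE, BRACKET_STATUS and TEST_STATUS are illustrative only:

```shell
# Create a scratch file so the example does not depend on any system file.
TMPFILE=$(mktemp)

# Bracket form of the test command.
[ -f "$TMPFILE" ]
BRACKET_STATUS=$?

# Equivalent spelled-out form; no closing bracket is used here.
test -f "$TMPFILE"
TEST_STATUS=$?

echo "bracket form: $BRACKET_STATUS, test form: $TEST_STATUS"

rm -f "$TMPFILE"
```

Both forms should report a zero status, since the scratch file exists and is an ordinary file.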

It is important to note from the start that test is quite the persnickety command. Notice the spaces following the open bracket and preceding the closing bracket. These are mandatory and frequently the source of errors reported by test. Consider the following example noting the missing space before the closing bracket:

	$ [ -f /etc/passwd] [Enter]	
	ksh: [: missing ]

In this example, test fails to parse the command as entered and the shell returns a rather cryptic error message. Whenever errors result from using the test command, it is a good practice to examine your usage of test, paying close attention to syntax, specifically spacing.

Looking at another example:

	$ DAY=Tue [Enter]			    
	$ [ $DAY = `date | cut -c1-3` ] [Enter]

Here, the first statement assigns the three character string "Tue" to the shell variable DAY. In the second command, test compares the value of the variable $DAY with the output of the date command piped through the cut filter, which keeps only the first three characters. Test will return a TRUE value if the actual day of the week is Tuesday, and FALSE otherwise. As mentioned above, pay close attention to the spaces following the open bracket and preceding the closing bracket, as well as the spaces surrounding the equal sign.¹

At this point I would like to introduce an important concept with respect to the test operator, namely its possible states of evaluation. As mentioned earlier, test is a boolean operator returning results of TRUE or FALSE. However, a third scenario that may occur is that of command failure. In other words, test can evaluate its arguments and return TRUE (zero), test can evaluate its arguments and return FALSE (non-zero), or test can fail to evaluate its arguments at all, as shown above with the missing bracket (which also yields a non-zero status). This last scenario is typically not a desirable outcome from the test command and should be anticipated and handled.
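
The three states can be observed by capturing the exit status of each kind of test. This is a sketch only: the variable names are illustrative, and while POSIX specifies a status greater than 1 when test itself fails, the exact value and error message vary from shell to shell.

```shell
# 1. test evaluates its arguments and the condition is TRUE: status 0.
STATUS_TRUE=0
[ 5 -eq 5 ] || STATUS_TRUE=$?

# 2. test evaluates its arguments and the condition is FALSE: status 1.
STATUS_FALSE=0
[ 5 -eq 6 ] || STATUS_FALSE=$?

# 3. test cannot evaluate its arguments (the space before the closing
#    bracket is missing), so it reports an error and returns a status
#    greater than 1. The error message is discarded here.
STATUS_ERROR=0
[ 5 -eq 5] 2>/dev/null || STATUS_ERROR=$?

echo "true=$STATUS_TRUE false=$STATUS_FALSE error=$STATUS_ERROR"
```

The `||` after each test runs the assignment only when test returns a non-zero status, which is a compact way to record that status without stopping the script.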

An example of command failure can be shown by the following. Most test comparisons mandate a certain number of operands. For example, the string equality operator (=) in test requires two operands, string1 and string2, as in the table below. Thus if the variable DAY has no value (which is not the same as the NULL string), the string equality operator will not see two operands (since the first is nothing), and test behaves as follows:

	$ unset DAY [Enter]
	$ [ $DAY = `date | cut -c1-3` ] [Enter]
	ksh: [: Wed: unexpected operator/operand

In this example, test does not see two operands and reports an error, since the first operand has no value. To avoid such errors, it is a good habit to enclose in double quotes any operand (i.e. any variable) which might not have been assigned a value. This effectively changes the value of nothing into an empty string, commonly referred to as the NULL string, which test does see as an operand. While this will return a FALSE result from test, it will not issue an error (see the three resulting states of test described above). Thus the above example with quotes:

	$ unset DAY [Enter]
	$ [ "$DAY" = `date | cut -c1-3` ] [Enter]
	$ echo $? [Enter]
	1
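
The same precaution can be applied to the right-hand operand, since a command substitution that produces no output would also vanish from the argument list. A minimal sketch, using the `$( )` form of command substitution (equivalent to the backquotes above) and the illustrative variable names TODAY and STATUS:

```shell
unset DAY

# Capture the first three characters of the date output.
TODAY=$(date | cut -c1-3)

# Both operands are quoted, so test always sees exactly three arguments
# (string1, =, string2) even when a variable is empty.
STATUS=0
[ "$DAY" = "$TODAY" ] || STATUS=$?

echo "status: $STATUS"
```

With DAY unset, the comparison is FALSE rather than an error, so the recorded status is 1.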

Frequently used test operators are listed in the table below (see the man page for comprehensive details):

File Operators	returns TRUE if
-d file	file exists and is a directory
-f file	file exists and is an ordinary file
-r file	file exists and is readable
-w file	file exists and is writable
-x file	file exists and is executable
-s file	file exists and its size is non-zero

Integer Operators	returns TRUE if
int₁ -eq int₂	int₁ is equal to int₂
int₁ -ne int₂	int₁ is not equal to int₂
int₁ -lt int₂	int₁ is less than int₂
int₁ -le int₂	int₁ is less than or equal to int₂
int₁ -gt int₂	int₁ is greater than int₂
int₁ -ge int₂	int₁ is greater than or equal to int₂

String Operators	returns TRUE if
string₁ = string₂	string₁ is equal to string₂
string₁ != string₂	string₁ is not equal to string₂
-z string	string is null (and must be seen)

Logical Operators	returns TRUE if
! expr	expr is FALSE, otherwise returns TRUE
expr₁ -o expr₂	expr₁ is TRUE OR expr₂ is TRUE
expr₁ -a expr₂	expr₁ is TRUE AND expr₂ is TRUE
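
A few of the operators in the table can be exercised together as follows. This is a sketch under stated assumptions: the scratch directory comes from the mktemp utility, the names WORKDIR, COUNT and NAME are illustrative, and `&&` is used so that each echo runs only when its test returns TRUE.

```shell
WORKDIR=$(mktemp -d)                 # scratch directory for the demonstration
: > "$WORKDIR/empty.txt"             # create a zero-length file
echo "data" > "$WORKDIR/full.txt"    # create a file with content

# File operators: -d for directories, -s for non-empty files, ! to negate.
[ -d "$WORKDIR" ] && echo "WORKDIR is a directory"
[ -s "$WORKDIR/full.txt" ] && echo "full.txt has content"
[ ! -s "$WORKDIR/empty.txt" ] && echo "empty.txt is empty"

# Integer operators joined with the logical AND operator -a.
COUNT=3
[ "$COUNT" -ge 1 -a "$COUNT" -le 5 ] && echo "COUNT is between 1 and 5"

# String operator -z; the quotes make sure test sees the empty operand.
NAME=""
[ -z "$NAME" ] && echo "NAME is null"

rm -rf "$WORKDIR"
```

Every test here is TRUE, so all five messages are printed.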

¹Note the subtle contextual differences in the use of the = operator in this example. In the first statement, the = operator is used in the context of assignment. This is apparent from the lack of spaces surrounding the = operator (recall that assignment proceeds from right to left). In the second statement, the = operator is used in the context of string equality, internal to the test command. This is apparent because the = operator is enclosed within the test brackets and is surrounded by the spaces that test requires.

Conditional Control Statements

The other component used when constructing conditional statements is the logical control component. The most frequently used control structure is the if-then statement (in its many varieties). The generic syntax for the basic if-then statement is:

	if boolean_condition
	then
	   statement
	   optional statement(s)
	fi

Examine the following example:

	if [ -f /etc/passwd ]
	then
	   echo "The file /etc/passwd exists!"
	fi

The echo statement will be executed only if the test command returns TRUE (a zero return status), that is, if it determines that /etc/passwd exists and is an ordinary file. If the file does not exist, this statement will do nothing. This leads to the next variety of the if-then statement, the if-then-else variety. The general syntax of the if-then-else is:

	if boolean_condition
	then
	   statement
	   optional statement(s)
	else
	   statement
	   optional statement(s)
	fi

In this form, if the boolean_condition is TRUE, the statement(s) following the then will execute and if the boolean_condition is FALSE, the statements following the else will execute. Thus we can modify our example as follows:

	if [ -f /etc/passwd ]
	then
	   echo "The file /etc/passwd exists!"
	else
	   echo "The file /etc/passwd does not exist!"
	fi
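
Because the control component only inspects an exit status, the boolean_condition does not have to be a test command; any command can take its place. A minimal sketch, assuming grep's -q (quiet) option and a scratch file created with mktemp; USERFILE and FOUND are illustrative names:

```shell
USERFILE=$(mktemp)
printf 'root\nfred\n' > "$USERFILE"

# grep -q prints nothing; it returns a zero status if a matching line is found.
if grep -q "^fred$" "$USERFILE"
then
   FOUND="fred is listed"
else
   FOUND="fred is not listed"
fi
echo "$FOUND"

rm -f "$USERFILE"
```

Here the then branch runs, since the scratch file contains a line consisting of fred.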

Even more power and flexibility can be added to the if statement by adding multiple else-if clause blocks, logically based on differing conditions. This is done with the elif keyword as follows:

	if boolean_condition₁
	then
	   statement_block₁
	elif boolean_condition₂
	then
	   statement_block₂
	elif boolean_condition_n
	then
	   statement_block_n
	else
	   default statement(s)
	fi

Thus if boolean_condition₁ is TRUE, statement_block₁ is executed; otherwise boolean_condition₂ is evaluated, and if it is TRUE, statement_block₂ is executed, and so on. As soon as any condition evaluates to TRUE, the remaining conditions are not evaluated. Notice there must be a then statement following each elif. Notice also the final else statement in the above example. This is commonly referred to as a default condition, since the default statements will execute if none of the prior boolean conditions is TRUE. The default else statement is optional, but it is good practice to use it to trap and control erroneous condition values.
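
A concrete instance of this form might look as follows. It is only a sketch: the variable names TARGET and KIND are illustrative, and the root directory is used as the target simply because it exists on any Unix system.

```shell
TARGET=/

if [ -d "$TARGET" ]
then
   KIND="a directory"
elif [ -f "$TARGET" ]
then
   KIND="an ordinary file"
elif [ -z "$TARGET" ]
then
   KIND="an empty name"
else
   # Default condition: nothing above matched.
   KIND="missing or a special file"
fi

echo "$TARGET is $KIND"
```

The first condition is TRUE for the root directory, so the two elif conditions and the else block are skipped.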

As seen previously, a technique sometimes used with if-then-fi (and other block) statements is the use of the semicolon (;) to place multiple statements on a single line. For example, the if-then statement shown above can be written as follows:

	if [ -f /etc/passwd ] ; then
	   echo "The file /etc/passwd exists!"
	fi

Some programmers feel very strongly about lining up the if with the fi statement. This is a matter of tidiness and does not offer any behavior or performance improvements. Note that if you forget the semicolon before the then, you will get an error.

Another related, albeit somewhat different, conditional command provided by the shell is the case command. The case command attempts to match a single expression to various patterns listed within the case block. The shell first evaluates the expression; if the evaluated expression matches the first pattern within the case block, the list of statements following that pattern is executed. If the expression does not match the first pattern, the match is applied to the second pattern, and so forth. Each command list must be terminated by a pair of semicolons (;;). The general syntax for the case command is:

	case expression in
	
	     pattern₁) statement
		       optional statement(s);;
	     pattern₂) statement
		       optional statement(s);;
	     pattern_n) statement
		       optional statement(s);;

	esac

Once the list of commands for a particular pattern has completed, control is transferred to the end of the case command (the esac statement). Looking at a simple example, we see the following usage of the case statement:

	case "$1" in
	    yes) echo "yes was command argument 1" 
		   ;;
	    no)  echo "no was command argument 1" ;;
	esac

The expression in this example is the first command line argument, stored in the positional parameter $1 (notice the enclosing "" and recall why). The first pattern in this example is the string pattern "yes". If the first command line argument is yes, the echo command following it is executed. Notice the statement terminators are on a separate line in this example. The second pattern is the string pattern "no"; if no is entered as the first command line argument, the second echo statement is executed.

A simple but powerful modification to this code segment is the following:

	case "$1" in
	    yes) echo "yes was command argument 1" 
		   ;;
	    no)  echo "no was command argument 1" ;;

	    *)   echo "Enter yes or no please!" ;;
	esac

The wildcard pattern (*) matches any value, so when placed last it catches everything that is not matched by the patterns above it. This serves as a default condition for all values not specifically covered by the defined patterns, and is often used to catch invalid or unexpected input.

One other modification worthy of mention is the use of more sophisticated patterns, in this case, the use of the vertical bar (|) to imply logical or.

	case "$1" in
	    yes|YES) echo "yes was command argument 1" 
		   ;;
	    no|NO)  echo "no was command argument 1" ;;

	    *)   echo "Enter yes or no please!" ;;
	esac

In the above example, the first pattern match will be successful if the user entered the string "yes" or the string "YES". You should observe similar results with the second pattern matching the strings "no" or "NO". Additional pattern matching details will be provided later in this hypertext.
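
To try such a case statement without writing a separate script, the positional parameters can be set by hand with the set command. The following sketch assumes that approach; REPLY_KIND is an illustrative variable name, and the result is stored rather than echoed directly so it can be inspected afterwards.

```shell
# Simulate running a script with YES as its first command line argument.
set -- YES

case "$1" in
    yes|YES) REPLY_KIND="affirmative" ;;
    no|NO)   REPLY_KIND="negative" ;;
    *)       REPLY_KIND="invalid" ;;
esac

echo "$1 -> $REPLY_KIND"
```

Changing the argument on the set line to no, NO, or any other word exercises the second pattern and the default pattern in turn.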

Iterative Statements

Iterative statements, commonly referred to as looping statements, involve executing a series of statements zero or more times. We will discuss two specific shell looping statements, the while loop and the for loop.

The while loop executes the statement block enclosed within zero or more times as long as the boolean condition (or command) evaluates to TRUE (which again means returns a zero exit status). The general syntax of the while loop is:

	while boolean_condition
	do
	   statement
	   optional statement(s)
	done

As with the if-then statement, the boolean_condition is usually (but not always) evaluated using the test command. Note also that the body of the while is enclosed with the do-done statements. Thus if we wanted to find a sum of the numbers between 1 and 5, we could do this with the following while loop:

	COUNT=1
	SUM=0
	while [ "$COUNT" -le 5 ]
	do
	   SUM=`expr $SUM + $COUNT`		# or SUM=$(( SUM + COUNT ))
	   COUNT=`expr $COUNT + 1`
	done

Note the double quotes surrounding the COUNT variable; these make sure the variable is seen by test, even if it has no value, to avoid any errors (as described above). Always keep in mind when coding looping statements that a condition must occur to terminate the loop, otherwise an infinite loop results. Incrementing the variable COUNT by 1 in this case causes the value of COUNT to eventually reach 6, causing the loop to terminate. As with the if-then scenario described above, the while-do structure can be written as follows if desired:

	while [ "$COUNT" -le 5 ] ; do
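
Putting the pieces together, the summation loop can be run in full using the single-line while-do form. This sketch uses the `$(( ))` arithmetic expansion mentioned in the comment above in place of expr, and adds an echo so the result is visible:

```shell
COUNT=1
SUM=0
while [ "$COUNT" -le 5 ] ; do
   SUM=$(( SUM + COUNT ))       # add the current value of COUNT to SUM
   COUNT=$(( COUNT + 1 ))       # move toward the terminating condition
done
echo "SUM=$SUM COUNT=$COUNT"
```

When the loop ends, SUM holds 15 (1+2+3+4+5) and COUNT holds 6, the first value for which the test returns FALSE.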

Another looping statement worthy of discussion here is the for statement. While many modern languages have for statements, the for statement in the shell is a little different. The general syntax is as follows:

	for VARIABLE in list
	do
	   statement
	   optional statement(s)
	done

Unlike the while statement, a for statement doesn't terminate based upon a condition, rather it terminates when all items in the list have been "processed." The concept of a list was first introduced with the discussion regarding special environment variables. A list is a grouping of data items, sometimes referred to as words, since they are separated by spaces. A list may have zero items, a few items, or a large number of items. A list (with more than one item) will have a beginning item (the leftmost) and an ending item (the rightmost). The items in the list can be strings, variables, files, etc.

When a for loop executes for the first time, the VARIABLE will be assigned the first (leftmost) item in the list. During each subsequent iteration of the loop, the variable will be assigned the next item in the list. Thus when all items in the list have been assigned to the variable, the loop terminates. Refer to the following example:
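
A loop matching the description and output that follow can be sketched as below. It is reconstructed from that description, so the exact wording of the echo statement is an assumption:

```shell
for VALUE in fred barney dino
do
   # VALUE holds the current item of the list on each pass.
   echo "VALUE: $VALUE"
done
```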

The variable name in this example is VALUE, and the list is composed of the 3 items (or words) "fred", "barney", and "dino." Thus when this loop executes, the following is the output:

	VALUE: fred
	VALUE: barney
	VALUE: dino

Notice there are 3 items in the list and the body of the loop executes 3 times. Thus for a list of size n, the body of a for loop will execute n times. Lists in for loops can be specified in a number of ways, for example:

list	represents
item₁ item₂ ... item_n	a list of n items
$1 $2 $3	a list of 3 command line parameters
$*	a list of all command line parameters
*	a list of all files in current directory
a*	a list of all files starting with the letter a
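
As an illustration of the last form, the shell expands a pattern such as a* into a list of matching file names before the loop starts. A minimal sketch, assuming a scratch directory created with mktemp and the illustrative names DEMO_DIR, FILE and A_COUNT:

```shell
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/apple" "$DEMO_DIR/avocado" "$DEMO_DIR/banana"

A_COUNT=0
# The pattern expands to every file in DEMO_DIR whose name starts with a.
for FILE in "$DEMO_DIR"/a*
do
   echo "matched: $FILE"
   A_COUNT=$(( A_COUNT + 1 ))
done
echo "files starting with a: $A_COUNT"

rm -rf "$DEMO_DIR"
```

Two of the three files match, so the body of the loop executes twice.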

For loops are typically used to process a quantity of things, for example, renaming all of the files in a directory by appending an extension of .backup (to indicate a backup copy) to their filename. As with the other scenarios described above, the for can be written as follows if one prefers:

	for VARIABLE in list ; do
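
The backup scenario mentioned above could be sketched as follows. This is one possible approach rather than a definitive one: it works in a scratch directory created with mktemp so that no real files are renamed, and the names BACKUP_DIR, FILE and LISTING are illustrative.

```shell
BACKUP_DIR=$(mktemp -d)
touch "$BACKUP_DIR/notes" "$BACKUP_DIR/report"

for FILE in "$BACKUP_DIR"/* ; do
   # Rename ordinary files only; skip subdirectories and the like.
   if [ -f "$FILE" ]
   then
      mv "$FILE" "$FILE.backup"
   fi
done

# Record the resulting file names, then remove the scratch directory.
LISTING=$(ls "$BACKUP_DIR")
rm -rf "$BACKUP_DIR"
echo "$LISTING"
```

The list is built once, before the first iteration, so the newly renamed files are not picked up again by the same loop.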

Command Summary

test - checks the validity of enclosed expressions (see the table of test operators above)
conditional and iterative statements - if, for, while, etc.
