Chapter 20

Shell Scripts - Procedures

Introduction

Here we concentrate on the facilities which allow large scripts to be broken down into smaller ones.

Calling another script

We saw earlier that scripts can do anything that can be done interactively. It is no surprise, therefore, to see that one script can call another. That is what this script does:

$ more caller
echo calling another
another
echo back from another
$

The script it calls is named another; here it is:

$ more another
echo another here
$

When we run caller we see this:

$ caller
calling another
another here
back from another
$

This is all as we would expect it to be if we are familiar with the idea of procedures or functions in programming.

Local variables and parameters

If we added some parameters and variables to our scripts they might look like this:

$ more caller
var=123
echo 'caller:  $#:'$#'  $1:'$1'  $var:'$var
another par1 par2
echo 'caller:  $#:'$#'  $1:'$1'  $var:'$var
$

and this:

$ more another
echo 'another: $#:'$#'  $1:'$1'  $var:'$var
var=456
echo 'another: $#:'$#'  $1:'$1'  $var:'$var

When we run them we get this output:

$ caller para pare park part
caller:  $#:4  $1:para  $var:123
another: $#:2  $1:par1  $var:
another: $#:2  $1:par1  $var:456
caller:  $#:4  $1:para  $var:123
$

Experienced programmers will appreciate that what we have here are local variables and parameters.

Both caller and another have their own private copies of the variables and parameters. There is, therefore, no chance of any mix-up between the two scripts. Clearly, caller has four arguments and another has two. Similarly, the variable var that another sets to 456 is a completely different variable to the one that caller sets to 123. Notice that var doesn't have any value at all when another first echoes it.

Actually, if we reflect a little, this situation is not so surprising. In chapter ??, we saw that Unix starts a new non-interactive shell to execute every script. So, when another was being executed, there were three shells active: the interactive shell, a non-interactive shell executing caller and a non-interactive shell executing another. The different sets of parameters and variables are maintained by different copies of the shell. We will examine this in more detail later, when we look at recursion??.
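
The separation between shells can be sketched without writing two script files. The following is only an illustration, not one of the original examples: it assumes a POSIX-style sh, it uses sh -c to start the second shell, and the function name show_child is invented for the purpose.

```shell
#!/bin/sh
# Sketch: a newly started shell has its own parameters and does not
# see the unexported variables of the shell that started it.

unset var          # make sure var is not already in the environment
var=123            # an ordinary (unexported) variable

show_child() {
    # sh -c starts a new non-interactive shell. The word after the
    # command string becomes $0; par1 and par2 become $1 and $2.
    sh -c 'echo "child:  count=$# first=$1 var=${var:-unset}"' another par1 par2
}

set -- para pare park part      # parameters of this shell
echo "parent: count=$# first=$1 var=$var"
show_child
echo "parent: count=$# first=$1 var=$var"
```

The child reports two parameters and no value for var, while the parent still has its four parameters and var set to 123 afterwards.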

Exported variables

If we wish to give some values to another script, the usual thing is to pass the values as arguments to the other script. However, there is another way to do so: we can allow the called script to use the values belonging to the calling script.

This new version of another does not give the variable a value; it simply displays the variable.

$ more another
echo 'another: $#:'$#'  $1:'$1'  $var:'$var
$

Having changed another, we must use a new version of caller as well:

$ more caller
export var
var=123
echo 'caller:  $#:'$#'  $1:'$1'  $var:'$var
another par1 par2
echo 'caller:  $#:'$#'  $1:'$1'  $var:'$var
$

The only change is that a new first line has been added. Now when we execute caller, we see this:

$ caller para pare park part
caller:  $#:4  $1:para  $var:123
another: $#:2  $1:par1  $var:123
caller:  $#:4  $1:para  $var:123
$

As you can see, this time, another was able to use the value of the variable that caller exported to it.

The reason is that every copy of the shell maintains what is known as its environment; this is simply a list containing some of the variables along with their values. Variables in a shell's environment are inherited by any further shells that the shell starts. In that sense, variables in the environment are a bit like parameters in that they are set up before the script starts executing. The purpose of the export command in the last version of caller is to add var and its value to the environment.
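
To see the difference between a variable that is in the environment and one that is not, the two can be put side by side. This is a minimal sketch, assuming a POSIX-style sh; the variable names kept_private and passed_on and the function report are invented for the illustration.

```shell
#!/bin/sh
# Sketch: only exported variables are placed in the environment
# that a child shell inherits.

unset kept_private passed_on
kept_private=here     # stays in this shell only
passed_on=123
export passed_on      # added to the environment

report() {
    # The single quotes stop this shell expanding the variables;
    # the child shell started by sh -c expands them itself.
    sh -c 'echo "private=${kept_private:-unset} passed=${passed_on:-unset}"'
}

report
```

The child shell prints private=unset passed=123: it received the exported variable and knows nothing of the other one.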

Note that if another changed the value of the variable - like this:

$ more another
var=456
echo 'another: $#:'$#'  $1:'$1'  $var:'$var

when we executed caller, we would see this output:

$ caller para pare park part
caller:  $#:4  $1:para  $var:123
another: $#:2  $1:par1  $var:456
caller:  $#:4  $1:para  $var:123
$

The point is that another gets the value from caller but caller does not get the value back from another.
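
A related form, not used elsewhere in this chapter, places a value in the environment of a single command only: the assignment is written in front of the command on the same line. The sketch below assumes a POSIX-style sh and again uses sh -c to stand in for a called script.

```shell
#!/bin/sh
# Sketch: an assignment written in front of a command is given to
# that one command's environment and to nothing else.

unset var

during=`var=123 sh -c 'echo "${var:-unset}"'`   # child started with var
after=`sh -c 'echo "${var:-unset}"'`            # child started without it

echo "child started with var=123 sees: $during"
echo "a later child sees:            $after"
echo "this shell sees:               ${var:-unset}"
```

Only the first child sees 123; the later child and the calling shell both report that var is unset.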

Returning results

We have just seen that a called script cannot give a result back to the calling script by using a variable. What we have to do, to return a result, is make the called script send its result to standard output. The caller can then access the value with the command substitution facility we saw in the previous chapter. This script has only one answer:

$ more silly
echo Thatcher
$

and here is how the caller can use that value:

$ more caller
echo silly gives a value of `silly`
$

and here is the output:

$ caller
silly gives a value of Thatcher
$

Of course, the value could be used in any Unix command - not just an echo:

$ more caller
answer=`silly`
if [ $answer = Thatcher ]
then echo It must have been a daft question
fi

or perhaps:

$ more caller
case `silly` in
Thatcher)  echo It must have been a daft question ;;
esac
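
When the returned value is used in a test, it is safer to put it in double quotes, because a called script might print nothing at all, or more than one word. The following is a sketch under assumptions: silly is written as a function here only to keep the example in one file, and quiet and check are invented names.

```shell
#!/bin/sh
# Sketch: quote the result of command substitution before testing it.

silly() {            # stands in for the separate silly script
    echo Thatcher
}

quiet() {            # a hypothetical command that prints nothing
    :
}

check() {
    # $1 is the name of the command whose output is tested
    answer=`$1`
    if [ "$answer" = Thatcher ]
    then echo daft question
    else echo some other answer
    fi
}

check silly
check quiet
```

Without the quotes, the second call would present the shell with [ = Thatcher ], which is a syntax error for the test rather than a comparison that fails.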

Returning more than one word

If we need to return more than one word like this version of silly:

$ more silly
echo Ronald Raygun
$

then we can still use command substitution in the calling script to access the values as shown here:

$ more caller
set `silly`
second=$2
echo $second said nuke them
$ caller
Raygun said nuke them
$

The set puts the words returned by silly into as many parameters of caller as are needed for the number of words.
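
One refinement is worth sketching. If the first word returned happened to begin with a dash, set would take it to be an option; writing set -- avoids that. The example assumes a POSIX-style sh, writes silly as a function purely to stay in one file, and uses the invented name second_word.

```shell
#!/bin/sh
# Sketch: split a multi-word result into positional parameters.

silly() {            # stands in for the separate silly script
    echo Ronald Raygun
}

second_word() {
    # "--" tells set that what follows is data, not options, so a
    # result beginning with a dash is not mistaken for a shell option.
    set -- `silly`
    echo "$2"
}

second_word
```

The -- also covers the case where nothing is returned: set -- with no words simply clears the parameters, whereas set on its own would list every variable.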

Giving a return code

A script can indicate an error using the exit statement. If it is followed by a number, that number will be given as the return code to the calling shell or script. If no number is supplied, the return code is that of the last command the script executed. When an exit command is executed, the script stops immediately; any commands after the exit are ignored. The script in this example returns a code of 42:

$ more ret42
exit 42
this line is not executed as it is after the exit
$

If we execute it we can see

$ ret42
$ echo $?
42
$

that it works as expected.
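
A caller rarely needs to look at $? by hand, because if treats a return code of zero as success and anything else as failure. Here is a minimal sketch, assuming a POSIX-style sh; ret42 is written as a function wrapping sh -c only so that the example fits in one file.

```shell
#!/bin/sh
# Sketch: act on a return code directly with if.

ret42() {            # stands in for the separate ret42 script
    sh -c 'exit 42'
}

if ret42
then echo ret42 succeeded
else echo "ret42 failed with code $?"
fi

# true is a standard command that always gives return code zero
if true
then echo true succeeded
fi
```

The first if takes its else branch and reports code 42; the second takes its then branch.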

Functions

It is a good idea to split any program into small units; in the case of scripts, the small units could be other scripts. This makes understanding the individual parts easier but makes maintenance and distribution harder because there are several small files instead of just one larger one.

Functions allow us to get the best of both worlds. Here is another version of caller which contains a version of another inside it:

$ more caller
another(){
  echo another here
  echo I am a function
  echo My parameters are $*
}

echo calling another
another Tom and Jerry
echo back from another
$

The first line tells the shell to remember the following lines under the name another. The next three lines are the commands that make up another. The line with the closing brace (}) marks the end of the function. The blank line visually separates the function from the main part of the script.

When we execute it we see this output:

$ caller
calling another
another here
I am a function
My parameters are Tom and Jerry
back from another
$

As you can see, functions can have parameters too. The version with the function behaves exactly like the one with a separate script.

One important point is that if a function is defined, it will be used in preference to commands or scripts with the same name. So in the example, if we had called the function caller or date, it would still have produced the same output.
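
The preference for functions can be sketched with a function that deliberately takes the name of a real command. This assumes a POSIX-style sh, where unset -f removes a function definition; the variable names shadowed and restored are invented.

```shell
#!/bin/sh
# Sketch: a function hides a command of the same name.

date() {
    echo this is the function, not the date command
}

shadowed=`date`          # runs the function

unset -f date            # forget the function again
restored=`date`          # runs the real date command

echo "with the function defined: $shadowed"
echo "after unset -f:            $restored"
```

Hiding a standard command like this is easy to do by accident, so it is usually wise to choose function names that do not clash with existing commands.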

Return code from a function

The exit command ends the execution of the whole script, even when it appears inside a function; this means it cannot be used to leave just the function. We have to use the return statement instead. Here is an example:

$ more ret
ret42(){
  return 42
  this line is not executed
}

ret42
echo return code is $?
$ ret
return code is 42
$

You can see that this example is logically the same as the version with a separate script.
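
The difference between return and exit inside a function can be sketched as follows. The example assumes a POSIX-style sh. The parentheses, which run commands in a separate copy of the shell, are used only so that the exit does not end the sketch itself; with_return and with_exit are invented names.

```shell
#!/bin/sh
# Sketch: return leaves the function; exit leaves the whole shell.

with_return() {
    return 3
}

with_exit() {
    exit 4
}

with_return || echo "with_return gave code $?; the script carries on"

# The exit ends only the copy of the shell started by the parentheses,
# so the echo inside them is never reached.
( with_exit; echo never reached ) || echo "the shell running with_exit ended with code $?"
```

The words never reached are not printed, which shows that exit stopped everything in that shell and not just the function.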

Global variables

Here is an example which shows a simple menu driven system with three functions:

$ more menu
showmenu(){
  more <<-!
        1)  add one to counter
        2)  subtract one from counter
        3)  quit
        ?)  show these options
        !
}

inc(){
  counter=`expr $counter + 1`
  echo counter is now $counter
}

dec(){
  counter=`expr $counter - 1`
  echo counter is now $counter
}

counter=0
showmenu
while true
do   printf 'Option? '
     read option
     case $option in
          1)   inc ;;
          2)   dec ;;
          3)   exit ;;
          \?)  showmenu ;;     # \? matches a literal ?; a bare ? would match any single character
          *)   echo no such option ;;
     esac
done

The expr command puts the result of an arithmetic expression onto the standard output. We are using it to add or subtract one from counter.

The important functions are inc and dec; they both change the value of counter as this sample shows:

$ menu
1)  add one to counter
2)  subtract one from counter
3)  quit
?)  show these options
Option? 1
counter is now 1
Option? 1
counter is now 2
Option? 2
counter is now 1
Option? 3
$

Experienced programmers will recognise that counter is a global variable; its value can be changed by the main part of the script and by the functions.
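
The same two functions can be written without the external expr command if the shell supports arithmetic expansion. That is an assumption: $(( )) is part of the POSIX shell but is missing from some older Bourne shells. The sketch leaves out the echo lines to concentrate on the global variable.

```shell
#!/bin/sh
# Sketch: inc and dec written with arithmetic expansion instead of expr.
# counter is still a global variable shared by the functions and the
# main part of the script.

counter=0

inc() {
    counter=$((counter + 1))
}

dec() {
    counter=$((counter - 1))
}

inc
inc
dec
echo counter is now $counter
```

After two calls of inc and one of dec, the main part of the script sees counter with the value 1.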

Global variables again

There is another way of getting global variables: it uses the dot command we saw in chapter ??. The dot command tells the shell to read and execute the commands in another file. We can see the effect with these two scripts:

$ more glob2
global=0
. inc
echo global is now $global
$ more inc
global=1
$

Notice that the dot command has been used to execute inc.

If we run the script we see:

$ glob2
global is now 1
$

The reason it works is that the same shell is executing both glob2 and inc. Normally, remember, another non-interactive shell is started to deal with each separate script; each shell maintains its own set of variables. When the same shell runs both scripts, only one set of variables is used and it is shared by the calling and called scripts.
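
The contrast between the two ways of running inc can be sketched in a single script. This is an illustration under assumptions: it uses mktemp -d, which is widely available but not guaranteed on every system, to make a scratch directory, and the names after_new_shell and after_dot are invented.

```shell
#!/bin/sh
# Sketch: the same file run by a new shell and by the dot command.

tmpdir=`mktemp -d`                 # scratch directory for the sketch
echo 'global=1' > "$tmpdir/inc"    # same contents as the inc script

global=0
sh "$tmpdir/inc"                   # a new shell: its global is thrown away
after_new_shell=$global

. "$tmpdir/inc"                    # this shell: global really changes
after_dot=$global

rm -r "$tmpdir"
echo "after new shell: $after_new_shell, after dot: $after_dot"
```

The value is still 0 after the new shell has run the file and becomes 1 only when the dot command is used.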

Recursion

A script is recursive if it calls itself directly or indirectly. This script finds directories in the current directory and executes the pwd command inside them:

$ more dirs
for file in *
do   if test -d $file
     then cd $file
          pwd
          cd ..
     fi
done
$

If we create some directories and execute dirs we see this:

$ mkdir first second
$ mkdir first/1a first/1b second/2a
$ dirs
/homedir/cms/ps/book.unix/bin/first
/homedir/cms/ps/book.unix/bin/second
$

Notice that dirs did not do anything with the directories that were inside the first and second directories.

Now suppose that we want dirs to do the same thing in the directories that it found in first and second. At first sight it sounds difficult. However, if we think about it we should see that the job that has to be done in first and second is the same as the one that dirs does. All we have to do is arrange for dirs to be called after the pwd command. If we do that the script looks like this:

$ more dirs
for file in *
do   if test -d $file
     then cd $file
          pwd
          /homedir/cms/ps/book.unix/bin/dirs
          cd ..
     fi
done
$

The full path-name is needed because we haven't yet learned how to make Unix find scripts after it has cded to another directory.

If we execute dirs again we get this output:

$ dirs
/homedir/cms/ps/book.unix/bin/first
/homedir/cms/ps/book.unix/bin/first/1a
/homedir/cms/ps/book.unix/bin/first/1b
/homedir/cms/ps/book.unix/bin/second
/homedir/cms/ps/book.unix/bin/second/2a

Recursion is a very powerful and elegant facility. It certainly makes light work of what originally seemed a difficult problem.

A slight difficulty is that a non-interactive shell is needed for each concurrent execution of dirs. If we had seven levels of directory, we would need seven non-interactive shells at once. That is no problem to most systems but some might baulk at very high levels of nesting.
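
The same recursion can be sketched with a function in place of a separate script, which removes the need for a full path-name. This is an alternative under assumptions, not the version used above: it assumes a POSIX-style sh and mktemp -d, the name walk is invented, and the parentheses start a copy of the shell for each level, so it still uses one shell per level of nesting.

```shell
#!/bin/sh
# Sketch: dirs as a recursive function rather than a recursive script.

walk() {
    for file in *
    do
        if test -d "$file"
        then
            # The parentheses give each level its own copy of the shell,
            # so the cd is undone automatically when that copy finishes.
            ( cd "$file" && pwd && walk )
        fi
    done
}

# Try it on a small scratch tree rather than on real directories.
scratch=`mktemp -d`
mkdir "$scratch/first" "$scratch/second" "$scratch/first/1a"
( cd "$scratch" && walk )
rm -r "$scratch"
```

Because each level runs in its own copy of the shell, no cd .. is needed and the loop variable file of an outer level is not disturbed by the inner ones.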

$HOME/bin

Now that we know enough about scripts, we can fix things so that Unix always knows where to find our scripts without us always using full path-names as in the previous example. We will create a directory called bin in our home directory and we will keep all our scripts in there. We will add our bin to the PATH.

Making the directory and putting our scripts there is easy:

$ mkdir $HOME/bin
$ mv dirs newcmd bu $HOME/bin
$

Of course, we should move all our scripts there to get the full benefit.

Adding our bin directory to the PATH could be done like this:

$ PATH=$PATH:$HOME/bin
$

However, we would have to do it every time we logged on. Happily, in our home directory there is a shell script called .profile which is executed whenever we log on using the Bourne shell. If we put the command to modify the PATH in .profile, we will only have to type it once.

$HOME/.profile

If the system administrator has not given us a .profile script, we can create one containing just the PATH command. Otherwise we can simply add our command to the end of .profile.

Let's set up .profile now:

$ echo 'PATH=$PATH:$HOME/bin' >> $HOME/.profile
$ echo export PATH >> $HOME/.profile
$

The single quotation marks are needed to prevent $PATH and $HOME being replaced with their values before the line is added to .profile.

When we have logged off and on again, we will be able to execute bu, or any other script we put into bin, from any directory:

$ . $HOME/.profile            # avoid logging off and on again
$ cd elsewhere
$ bu precious
precious backed-up in BU/precious
$ cd ..
$

We could also remove the full path name from the last version of dirs, since the shell can now find dirs through the PATH.
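
Each time .profile is read, the PATH line adds $HOME/bin again, so reading it twice in one session leaves the directory in the PATH twice. That is harmless, but it can be avoided. The following is a minimal sketch, assuming a POSIX-style sh; the function name add_to_path is invented.

```shell
#!/bin/sh
# Sketch: add a directory to PATH only if it is not already there.

add_to_path() {
    # $1 is the directory to add. Wrapping both strings in colons
    # lets one pattern match the directory at the start, in the
    # middle or at the end of PATH.
    case ":$PATH:" in
        *":$1:"*) ;;                    # already present: do nothing
        *)        PATH=$PATH:$1 ;;
    esac
}

add_to_path "$HOME/bin"
add_to_path "$HOME/bin"     # the second call changes nothing
export PATH
```

The function could be placed in .profile in place of the plain PATH assignment.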

Comments and blank lines

In scripts, comments begin with a hash (#) and finish at the end of the line. In addition, we can put blank lines in scripts to improve the readability. Here is newcmd with some comments and blank lines added:

$ more newcmd
#!/bin/sh

# newcmd - display date, working directory and list of files

date
pwd
ls      # display a list of files
$

There are three kinds of comments in this version. We will look at them in reverse order. The last one is a 'rest of line' comment. The actual text in the comment should say something useful though, unlike the example, which does not tell us anything new. The middle comment takes a whole line and shows the name of the script and what it does. The first line is a special comment which tells Unix which shell should be used to execute the script. If it had said: #!/bin/csh, the C-shell would have been used. It's always a good idea to put one of these comments in your scripts so that you don't have to change them all if you begin to use a different shell later on.

QUESTIONS

  1. Create a shell script called cdv which will cd to the /var directory and do a pwd. Does it leave you in /var if you run it like this?

    $ cdv
    /var
    $ pwd
    ???
    

    Why not? How should you execute it? If the answer isn't obvious, leave it for now, and think again after the next question.

  2. Type a function definition directly into an interactive shell. The function should be called cdvFun and should do the same as cdv in question one. Does it work? Can you explain the difference now?

ANSWERS

  1. Here is the cdv script:

    $ more /homedir/cms/ps/bin/cdv
    cd /var
    pwd
    $
    

    It doesn't work fully, as you can see, if you run it and check:

    $ pwd
    /homedir/cms/ps
    $ cdv
    /var
    $ pwd
    /homedir/cms/ps
    $
    

    It appears to run, and clearly visits the /var directory, but it leaves you in your original directory. This is because another (non-interactive) shell executes it. Your (interactive) shell simply waits for that shell to finish. You have to type:

    $ . cdv
    $ pwd
    /var
    $
    

    to make your shell read and execute the script. (Note the dot and space before cdv.)

  2. This defines the function for your current (interactive) shell:

    $ cdvFun(){
    >   cd /var
    >   pwd
    > }
    $
    

    (Remember that > is the shell's continuation prompt -- nothing to do with input/output redirection.)

    The function now runs when you simply type its name into the same shell:

    $ cdvFun
    /var
    $ pwd
    /var
    $
    

    And this time it works properly because your (interactive) shell executes the function, without starting another shell.