Chapter 19

Shell Scripts - Quoting, Here Documents and Command Substitution

Introduction

Here we look at different means of quoting text in scripts and at a couple of specialised input-output redirection facilities that are only relevant in scripts.

Strong (') and weak (") quotes

So far, we have used single quotation marks (') around text that we echoed when we wanted to preserve extra spaces between words. For example:

$ echo 'o           o'
o           o
$

That does not always work though, as this example shows:

$ echo 'o     $variable      o'
o     $variable      o
$

The problem is that the shell does not substitute the value of parameters and variables inside single quotation marks because they take away the special meaning of the dollar sign. Fortunately, we can use double quotation marks (") instead:

$ variable=x
$ echo "o     $variable      o"
o     x      o
$

Double quotation marks preserve the extra spaces but do not hide variables or parameters.

If some dollar signs are to be quoted and some are not, we can use double quotes and backslashes (\) together:

$ echo "\$     $variable      \$"
$     x      $
$

Sometimes we have to mix both kinds of quotes as in:

$ echo "''"' are stronger than ""'
'' are stronger than ""
$

We could not use:

echo '\'\' are stronger than ""'

because single quotes even hide backslashes.
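
The rules above can be tried out with a short sketch. The variable names strong, weak, partly and mixed are made up for illustration. The last assignment shows one common way, not used in the text above, of getting a single quote into single-quoted text: close the quotes, add a backslash-escaped quote, and open them again.

```shell
variable=x

# Single quotes: everything is literal, including the dollar sign.
strong='o   $variable   o'

# Double quotes: spaces are preserved but $variable is substituted.
weak="o   $variable   o"

# Inside double quotes a backslash still protects a dollar sign.
partly="\$   $variable   \$"

# Close the single quotes, add \', then reopen them.
mixed='it'\''s quoted'

printf '%s\n' "$strong" "$weak" "$partly" "$mixed"
```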

Here documents

This script displays a file called laws.data:

$ more laws
more laws.data
$

It's only a little script but there are three problems. First, we can't see exactly what it does without looking at the laws.data file. Second, we have two files to maintain. Third, I had to think of two filenames instead of one!

What we want is to have the text as part of laws. Shell lets us do that like this:

$ more laws
more <<finis
The Four Laws of Ecology:

(1)  Everything must go somewhere.
(2)  There is no such thing as a free lunch.
(3)  Everything is connected with something else.
(4)  Nature knows best.
finis
$ laws
The Four Laws of Ecology:

(1)  Everything must go somewhere.
(2)  There is no such thing as a free lunch.
(3)  Everything is connected with something else.
(4)  Nature knows best.
$

What happens is that the text that is right here in the script is passed as the standard input to more. The line containing finis marks the end of the input; it is not passed to more.

This kind of input redirection is called a here document. The symbol for here documents is << and is followed by a delimiter. In our example, the delimiter is finis but Unix doesn't really speak French - you can use almost any delimiter as long as you spell it the same both times!

The laws script would be clearer if we set the text in the here document back from the left hand margin. That would separate the text visually from the command which processes it. This is known as indentation.

If we indent the lines of the here document with tab characters, shell can strip them off before the data is passed to more. We ask for the stripping by putting a minus sign (-) after the redirection symbol like this:

$ more laws
more <<-!
        The Four Laws of Ecology:
        (1)  Everything must go somewhere.
        (2)  There is no such thing as a free lunch.
        (3)  Everything is connected with something else.
        (4)  Nature knows best.
        !
$ laws
The Four Laws of Ecology:
(1)  Everything must go somewhere.
(2)  There is no such thing as a free lunch.
(3)  Everything is connected with something else.
(4)  Nature knows best.
$

Notice that this time, we have chosen to use a different delimiter for the here document.
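
Only tab characters are stripped by <<-; leading spaces are left alone. Because tabs and spaces look the same on the page, the following sketch builds a tiny script with printf, so that the tab is written explicitly as \t. The file name heredoc_demo and the variable names are assumptions made for the illustration.

```shell
# Build a tiny script in a temporary file; \t in the printf format
# becomes a real tab, the four spaces stay as spaces.
demo=${TMPDIR:-/tmp}/heredoc_demo.$$
printf 'cat <<-END\n\tindented with a tab\n    indented with spaces\n\tEND\n' > "$demo"

# Run the script and capture what cat received.
stripped=$(sh "$demo")
rm -f "$demo"

printf '%s\n' "$stripped"
```

The first line should come out with its tab removed, while the second keeps its four leading spaces.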

A more useful example

Every programmer who has a terminal within reach of their telephone has a shell script called something like tel. Here is a simplified example:

$ more tel
grep "^$1" <<-END
        catherine 1234 567 890
        marilyn 2345 678 901
        susan 3456 789 012
        christine 4567 890 123
        END
$ tel ca
catherine 1234 567 890
$

As you can see, it uses grep to search for a name in the list of names and telephone numbers. The list is held inside the script as a here document, making tel more maintainable and portable than it would be if we kept the list in a separate file.

Quoting here documents

Here documents are often used for giving commands to programs that are usually used interactively. For example, suppose we have a file containing good things like this:

$ more goodthings
good company
wine
laughter
plague and famine
ERRRGHHH well no actually
cycling
mountains
$

Unfortunately, some bad things have got into it; fortunately each of the mistakes is followed by a line which indicates it! We wish to delete the lines and their indicators. We can't easily use sed because we have to go back up the file. The best editor for the job, therefore, is ed. The here document in the following script contains the ed commands we need to edit the file.

$ more quotes
ed goodthings <<-\! > /dev/null
        g/^ERRRGHHH.*actually$/-\
                .,.+1d
        w
!
$

Clearly the script would not work if the shell tried to substitute a parameter after the dollar sign in the regular expression. We prevent that by putting a backslash in front of the here document delimiter. Notice that ed's output has been ignored by sending it to /dev/null.
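
The difference between a plain delimiter and one with a backslash in front can be seen in a minimal sketch. The function names expanded and literal, and the variable name, are invented for this illustration.

```shell
name=world

# Plain delimiter: the shell substitutes $name inside the text.
expanded() {
cat <<END
hello $name
END
}

# Backslash before the delimiter: the text is passed through untouched.
literal() {
cat <<\END
hello $name
END
}

expanded
literal
```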

Command substitution

Here documents are a kind of input redirection. There is an output redirection facility that is designed for use in scripts. Here is a very simple example:

$ echo "It is `date` precisely"
It is Fri Jul  8 14:39:51 BST 1994 precisely
$

The new facility is called command substitution; it is called into play by the backward sloping single quotation marks (`). When the shell finds them in a command, it executes the command inside the quotes and captures the standard output, which is then inserted into the outer command in place of the quoted command. In our example, the shell executes the quoted date command and puts date's output into the string which is then echoed.

The commands which appear inside the command substitution facility can be as complex as you like. For example, they can be spread over several lines and can have their own input/output redirection.

Command substitution in Bash

In the bash shell, and in other POSIX-style shells such as ksh and dash, a more flexible syntax is available for command substitution. The above example would appear thus:

$ echo "It is $(date) precisely"
It is Sun Jun 30 14:30:48 BST 2013 precisely
$

As you see, the leading backward sloping quotation mark (`) has been replaced with a dollar sign followed immediately by an opening parenthesis, giving $(, and the trailing ` has been replaced with a closing parenthesis.

This syntax is neater because it can be nested. That is, you can have command substitution inside command substitution. It is also sometimes easier to read depending on the font being used to display it. However, it does take up more space on the command line so I rarely use it.
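
As a sketch of what nesting looks like, the following takes a made-up path (/usr/local/bin/tool, which need not exist) and extracts the name of its parent directory. The same nesting is possible with backward sloping quotes, but only if the inner pair is protected with backslashes.

```shell
# Nested $( ): the inner command runs first, the outer one uses its output.
parent=$(basename "$(dirname /usr/local/bin/tool)")

# The same thing with backquotes needs the inner pair escaped.
parent_bq=`basename \`dirname /usr/local/bin/tool\``

echo "$parent $parent_bq"
```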

Set with arguments

It is possible to discard all a script's parameters and set up new ones. This is done with the set command. In this example:

$ set Rimmer Lister Cat Kryten
$ echo $*
Rimmer Lister Cat Kryten
$

any existing parameters are lost and the Red Dwarf crew is put into parameters one to four.

Command substitution and set are often used together. This example shows how we can extract some of the fields from the date command:

$ set `date`
$ echo $2 $3 $6
Jul 8 1994
$

Be careful that any command used with set and command substitution does actually produce some output. If it gives none, it is as if you had executed set with no arguments. Since this simply displays all your variables, it can be embarrassing if it happens in the middle of a script! For instance, this command gives unexpected results in an empty directory:

set `ls`

Sometimes, the simplest way round this problem is to supply a dummy value so that set always sees something:

set `ls` dummy
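
Here is a small sketch of the dummy technique. The function no_output is a hypothetical stand-in for a command that prints nothing, such as ls in an empty directory. The second half shows an alternative that the text above does not use: a double minus sign (--) tells set that everything after it is a parameter, so when there is no output the parameters are simply cleared.

```shell
# Hypothetical stand-in for a command that prints nothing.
no_output() { :; }

# The dummy technique: set always sees at least one argument.
set `no_output` dummy
real_words=$(( $# - 1 ))     # do not count the dummy

# Alternative (not in the text): -- marks the end of set's options,
# so with no output the parameters are just cleared.
set -- `no_output`
after_dashes=$#

echo "$real_words $after_dashes"
```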

Redirection in scripts

In a script the individual commands may have their standard input or output redirected. Any input or output that hasn't been redirected inside the script is free to be redirected when the script is called. This script is a modification of an earlier one:

$ more newcmd
date
pwd
ls > file1
$

The output from ls is "tied" to file1. The output from date and pwd is "free". If we call the script and redirect its output:

$ newcmd > file2
$

only the output from date and pwd will be written to file2.

Redirecting output to the same file

If we wish to send the output from two of the commands in our script to the same file, we could do this:

$ more newcmd
date
pwd > file1
ls >> file1
$

Using >> makes the second (or later) commands append to the file. However, there is a tidier method:

$ more newcmd
date
{ pwd
  ls
} > file1
$

The braces ({ and }) are used to group commands together so that their output is treated as if it were all from one command.
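
A self-contained sketch of the same idea follows. It uses echo instead of pwd and ls so that the result is predictable; the temporary file name group_demo is an assumption made for the illustration.

```shell
out=${TMPDIR:-/tmp}/group_demo.$$

# One redirection for the whole group: both lines end up in the file.
{ echo first
  echo second
} > "$out"

grouped=$(cat "$out")
rm -f "$out"

printf '%s\n' "$grouped"
```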

Redirecting from loops

The output from while and for loops can also be redirected in a similar manner. Here is a rather artificial example to demonstrate the facility:

$ more redir
for number in 2 4 6 8
do   echo $number
done |
  while read even
  do   echo $even is even
  done
$ redir
2 is even
4 is even
6 is even
8 is even
$

As you see, the script uses the for loop to generate the first four even numbers. The redirection after the done causes the output to be sent via a pipe to the while loop, where all the lines are read and the numbers are echoed with an explanation. Ending the line with the pipe symbol (|) lets us put the command after the pipe on the following line, which looks neater; the shell sees that the command is unfinished, so no backslash is needed.

You can do this kind of redirection with all of shell's control constructs.
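
Loops can be redirected to and from files as well as pipes. The following is a minimal sketch: the file name loop_demo and the variable total are invented for the illustration. In bash and similar shells a loop whose input is redirected from a file runs in the current shell, so total should keep its value after the loop has finished.

```shell
list=${TMPDIR:-/tmp}/loop_demo.$$

# Output of the whole for loop goes to one file.
for number in 2 4 6 8
do   echo $number
done > "$list"

# Input of the whole while loop comes from that file.
total=0
while read even
do   total=$(( total + even ))
done < "$list"
rm -f "$list"

echo "The total is $total"
```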

Preventing redirection

Sometimes we want to ensure that read and echo use the terminal even though the rest of the input/output may have been redirected. To do that we have to "tie" the input/output to /dev/tty. This script sends its input to its output, one line at a time; the user has to respond to a prompt before each line is transmitted:

$ more dripfeed
while read line
do   echo Hit return to send a line > /dev/tty
     read dummy < /dev/tty
     echo $line
done
$

If we use it on a file with three lines, the output looks like this:

$ dripfeed < threelinefile > there
Hit return to send a line

Hit return to send a line

Hit return to send a line

$

The blank lines result from the user hitting the return key when prompted.

QUESTIONS

  1. Can you explain the effect of the backslashes (\) in the following?

    $ poet=john
    $ echo $poet
    john
    $ echo \$poet
    $poet
    $ echo\ $poet
    echo john: not found
    $
    

    Answer

    Backslashes take away the special meaning of the following character, so the first one causes the $ to be echoed instead of the shell interpreting it as, "display the value of a variable". The second one makes the shell regard the space as part of a non-existent echo john command, rather than being the special character that separates arguments.


  2. From this question onwards, you have to write a shell script. Name them "q14.2" ...

    Without using echo or printf, write a script that displays a limerick.

    Answer

    $ more q14.2
    cat <<-!
            There was a young man from Kent
            ...
            ...
            ...
            and instead of coming he went!
    !
    $ q14.2
    There was a young man from Kent
    ...
    ...
    ...
    and instead of coming he went!
    $
    

    Don't forget that the indentation is done with tab characters. And, while writing the script, I used an editor that did not turn tabs into spaces.


  3. You should have a script (q9.4) that displays the number of directories in the current directory. Use it to display:

    There are NN directories in this directory
    

    where NN is the number.

    Answer

    $ more q14.3
    echo There are `q9.4` directories in this directory
    $ q14.3
    There are 42 directories in this directory
    $
    

    The quotation marks are those (backward sloping) that cause command substitution.


  4. Write a shell script to display the date in the form: Aug 28. Do not bother to read the man page for date.

    Answer

    $ more q14.4
    set `date`
    echo $2 $3
    $ date
    Sat Jul 11 13:23:51 BST 2009
    $ q14.4
    Jul 11
    $
    

    As the question hinted, the date command has options that display dates in almost any format. However, the technique demonstrated is more generally useful.


  5. Write a shell script to display the date in such a way that the output always appears on the terminal no matter what redirection has been applied by the user.

    Answer

    $ more q14.5
    date > /dev/tty
    $ q14.5 > /dev/null
    Sat Jul 11 13:24:24 BST 2009
    $
    


  6. Write a script to look in a specified file for lines ending with a specified letter. Allow for the possibility of spaces after the letter at the end of the lines.

    Hint: grep 'a$' -- finds lines ending in a.

    Answer

    $ more q14.6
    grep "$1 *\$" $2
    $ q14.6 t cars
    year to his car.  He sits in it while it goes and while it
    $
    

    Double quotation marks must be used to allow $1 to be interpreted. The backslash before the second dollar sign prevents the shell from trying to interpret it as a variable.

