Chapter 10

Line editors - ed and sed

Introduction

Now that we know about regular expressions, we can learn a little about Unix's line editors ed and sed. There are about 20 commands in ed but we will only learn a few of the most useful ones. We can get by with so little knowledge because we will not be using ed directly.

Line editor

A line editor is a program that allows us to change a file one line at a time; we don't normally see the file or even the line we are changing while editing it. The alternative is a full-screen editor which shows us a screenful of the file while we are making changes. It may seem odd to learn about line editors when full-screen editors are available. However, the line editors and their commands are very powerful; they will be needed when we build our own tools and commands. Also, one of the most important features of the full-screen editor vi is that it allows us to use ed's powerful commands.

Starting ed

To edit an existing file, we use:

$ ed cars
726
1
The typical American male devotes more than 1,600 hours a

The 726 is the number of characters in the cars file; it was displayed by ed. The second number was typed by me; it is the command which makes the first line the current line. Mostly, ed does commands without displaying anything but when it moves to a new line it does display it. Notice that ed does not display a prompt; newcomers find that unsettling. Another peculiarity of ed is that it has only one error message -- which consists of just a question mark!

Substitute command

The most used command in ed is the substitute command:

s/devote/waste/

It consists of the letter s followed by a character known as the delimiter; then comes a string of characters which is to be removed from the line, followed by the delimiter; lastly comes a string of characters which is to replace what was removed, followed again by the delimiter. For now, we will call the two strings the removal string and the replacement string respectively.

We can now see that our example takes devote out of the line and puts waste in its place. Notice that ed does not waste our time by displaying the new version of the line; it simply waits for us to enter the next command. If we want reassurance, the p command makes ed display the content of the current line:

p
The typical American male wastes more than 1,600 hours a

and we can check we got our command right.

The reason that the delimiter has to be typed three times is that the first one announces which character will be used in this particular substitute command. That means we could use a different delimiter each time we perform a substitution. In practice, most people use slash (/) as we have done, unless it occurs in one of the strings. When they don't use slash, they often use a question mark, like this:

s?devote?waste?

Here is a more realistic example:

s?/?/../?

In this case, we had to use another delimiter to avoid confusion with the slashes in the removal string (/) and the replacement string (/../).
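
The choice of delimiter can also be tried outside ed. The sketch below uses sed, the stream editor introduced later in this chapter, whose substitute command has the same form. The variable names and the sample path usr/bin are made up for illustration.

```shell
# Sample line from the cars file used in this chapter.
line='The typical American male devotes more than 1,600 hours a'

# The same substitution written with two different delimiters.
with_slash=$(printf '%s\n' "$line" | sed 's/devote/waste/')
with_query=$(printf '%s\n' "$line" | sed 's?devote?waste?')

# A slash inside the strings forces a different delimiter
# (usr/bin is a made-up sample path).
dotted=$(printf '%s\n' 'usr/bin' | sed 's?/?/../?')

printf '%s\n' "$with_slash" "$with_query" "$dotted"
```

The first two results should be identical, which shows that the delimiter is only punctuation and has no effect on the substitution itself.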

Errors

If the string of characters we have asked to be removed does not occur in the current line, ed will display its error message:

s/demote/waste/
?

It will not try to find the string on some other line.

Substitute suffix - p

We can make ed display the new version of the line after a substitution by adding a p to the end of the command. Characters tacked onto the substitute commands are called suffixes. Here is another substitute command:

sXmaleXmanXp
The typical American man wastes more than 1,600 hours a

It has an outlandish delimiter (X) and has the p suffix to make ed display the latest version of the line.

Substitute suffix - g

The basic form of the substitute command only makes one change to the line. If we want to change every occurrence in the current line, we need a g suffix, as shown here:

s/an/ANO/gp
The typical AmericANO mANO wastes more thANO 1,600 hours a

It is OK to use two suffixes but they must be in the order shown. Notice that only the current line is changed; all the others are left alone.

A very important point is that the replacement string is inserted only once for each substitution. If the g suffix is used, ed tries to do more substitutions from the point it has reached in the current line. In the example, we have ANO three times because ed did substitutions for all three occurrences of an.
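
The difference the g suffix makes can be seen side by side. This is a small sketch using sed (covered later in this chapter) rather than ed, so that it can be run straight from the shell; the variable names are illustrative only.

```shell
line='The typical American man wastes more than 1,600 hours a'

# Without g: only the first occurrence of "an" on the line is changed.
first_only=$(printf '%s\n' "$line" | sed 's/an/ANO/')

# With g: every occurrence of "an" on the line is changed.
every_one=$(printf '%s\n' "$line" | sed 's/an/ANO/g')

printf '%s\n' "$first_only" "$every_one"
```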

Regular expressions

The substitute command's "removal string" is actually a regular expression (RE) as this example shows:

s/ANO.*ANO/rhubarb/p
The typical Americrhubarb 1,600 hours a

The way substitutions are done is that ed works out which characters are matched by the RE; it then chops those characters out of the line and puts the replacement string in their place. Because REs always match as many characters as possible, our example takes ANO mANO wastes more thANO out of the line and puts rhubarb in its place.
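
To see the effect of matching as many characters as possible, the sketch below runs the same substitution with sed and then, as a contrast that is not part of the original example, with a narrower RE. The narrower RE uses [^A]* so that the match cannot run past the next capital A.

```shell
line='The typical AmericANO mANO wastes more thANO 1,600 hours a'

# .* is greedy: the match runs from the first ANO to the last ANO.
greedy=$(printf '%s\n' "$line" | sed 's/ANO.*ANO/rhubarb/')

# [^A]* stops at the next capital A, so the match ends at the second ANO.
limited=$(printf '%s\n' "$line" | sed 's/ANO[^A]*ANO/rhubarb/')

printf '%s\n' "$greedy" "$limited"
```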

The next command simply makes the line sensible again:

s/rhubarb/an man wastes more than/p
The typical American man wastes more than 1,600 hours a

Because we can use REs, ed does not need separate commands to insert text at the ends of lines or to split a line in two. The substitute command is used for all three jobs. For instance:

s/^/Fact: /p
Fact: The typical American man wastes more than 1,600 hours a

inserts text at the start of a line, and:

s/$/t/p
Fact: The typical American man wastes more than 1,600 hours at

sticks text to the end of the line.
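
The two anchored substitutions above can be sketched with sed as well. Here the second command works on the output of the first, just as the second ed command worked on the already changed line; the variable names are illustrative.

```shell
line='The typical American man wastes more than 1,600 hours a'

# ^ matches the empty position at the start of the line.
prefixed=$(printf '%s\n' "$line" | sed 's/^/Fact: /')

# $ matches the empty position at the end of the line.
suffixed=$(printf '%s\n' "$prefixed" | sed 's/$/t/')

printf '%s\n' "$prefixed" "$suffixed"
```

Nothing is removed in either case, because ^ and $ match positions rather than characters.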

Splitting a line

This command would change the first space character on a line into a new-line:

s/ /\
/

Notice that the command has spilled over onto two lines. The backslash (\) immediately before the end of the first line tells ed that the new-line is part of the replacement string. Without the backslash, ed would take the new-line as the end of the command. It would assume the command was intended to be:

s/ //

which would simply chop the first space from the line.

If we provide some context, we can change a specific space into a new-line. The following command affects the space between n and m.

s/n m/n\
m/p
man wastes more than 1,600 hours at
-
Fact: The typical American
u
p
Fact: The typical American man wastes more than 1,600 hours at

Notice the use of the - command to go to the line above the current line, and the use of the u command to undo the effect of the previous change.
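
The same backslash and new-line trick can be tried with sed, as in this sketch. Inside single quotation marks the shell passes the backslash and the new-line to sed untouched, so the command reaches sed in the same two-line form shown above. The variable names are illustrative.

```shell
line='Fact: The typical American man wastes more than 1,600 hours at'

# The replacement string is "n", an escaped new-line, then "m".
split=$(printf '%s\n' "$line" | sed 's/n m/n\
m/')

printf '%s\n' "$split"

# Count the lines in the result (tr strips the padding some wc versions add).
line_count=$(printf '%s\n' "$split" | wc -l | tr -d ' ')
printf 'lines: %s\n' "$line_count"
```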

Missing replacement string

If no replacement string is supplied the characters matching the RE are simply removed -- as shown here:

s/......//p
The typical American man wastes more than 1,600 hours at
s/.$//p
The typical American man wastes more than 1,600 hours a

Notice that when it is finding a match for a RE, ed starts looking at the left-hand end of the line. That is why the string of dots matched the first six characters on the line. In the second command, the dollar sign after the single dot made it match the last character on the line.
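
A sketch of the same two deletions using sed, with the second command applied to the result of the first. The variable names are illustrative.

```shell
line='Fact: The typical American man wastes more than 1,600 hours at'

# Six dots match the first six characters, "Fact: ", which are removed.
no_prefix=$(printf '%s\n' "$line" | sed 's/......//')

# .$ matches the single character just before the end of the line.
no_last=$(printf '%s\n' "$no_prefix" | sed 's/.$//')

printf '%s\n' "$no_prefix" "$no_last"
```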

& in the replacement string

An ampersand (&) in the replacement string has a special meaning. Let's see it in action before looking at the details:

s/man/(&)/p
The typical American (man) wastes more than 1,600 hours a

The ampersand means: whatever the RE matched in the most recent match. In the example that was man, so the overall effect is to enclose the word in parentheses. If needed, more than one ampersand can be used, perhaps to duplicate the matched text.
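
The following sketch shows the ampersand at work in sed, which treats it in the same way. The third command goes slightly beyond the text above: a backslash in front of the ampersand turns it back into an ordinary character. The variable names are illustrative.

```shell
line='The typical American man wastes more than 1,600 hours a'

# & stands for the matched text, so "man" is wrapped in parentheses.
wrapped=$(printf '%s\n' "$line" | sed 's/man/(&)/')

# Two ampersands duplicate the matched text.
doubled=$(printf '%s\n' "$line" | sed 's/man/& &/')

# An escaped ampersand is just a literal & character.
literal=$(printf '%s\n' "$line" | sed 's/man/\&/')

printf '%s\n' "$wrapped" "$doubled" "$literal"
```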

Accessing parts of a string

We can enclose parts of a complex regular expression in escaped parentheses, \( and \). Doing so lets us split the complex regular expression into smaller parts and refer to the smaller parts individually. For example:

s/\(.*American\).*\(wastes.*\)/\1 person \2/p
The typical American person wastes more than 1,600 hours a

Here the regular expression matched the whole line. The match was done in three parts. The first part was up to and including American. The last was from wastes to the end of the line. The middle part was all the characters between the first and last parts, that is (man) (and the surrounding space characters).

In the replacement string, \1 means the text matched by the RE in the first set of escaped parentheses. Obviously, \2 means the text matched by the RE in the second set. This means that our line of text is replaced with the first and last parts with person (and spaces) sandwiched between. Notice that we didn't bother to wrap the middle part in escaped parentheses because we had no need to refer to it later.

This facility is very complex but amazingly useful when building tools to alter complex text. If the need arises, we can refer to up to nine parts of a RE.
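
As a sketch of how the numbered parts can be reused, the commands below run the substitution from this section with sed and then, as an extra illustration not taken from the text, put the two saved parts back in the opposite order. The variable names and the " | " separator are made up.

```shell
line='The typical American (man) wastes more than 1,600 hours a'

# \1 is the text up to and including American; \2 is wastes to the end.
person=$(printf '%s\n' "$line" | sed 's/\(.*American\).*\(wastes.*\)/\1 person \2/')

# The saved parts may be used in any order in the replacement string.
swapped=$(printf '%s\n' "$line" | sed 's/\(.*American\).*\(wastes.*\)/\2 | \1/')

printf '%s\n' "$person" "$swapped"
```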

Commands so far

This table:

Command | Function
--------+-------------------------
1       | go to line one
s       | substitute
p       | display the current line
-       | go back (up) one line
u       | undo the last change

shows the commands we have already used.

More commands

When ed was written, editors from big computer companies often had as many as 200 commands. Because so much can be done with the substitute command and regular expressions, ed needs a mere 20 commands. We will only learn five more of them. You can read ed's man page if you need to find out about the rest.

Our five new commands are:

Command | Function
--------+-------------------------
d       | delete the current line
w       | write changes to disk
q       | quit
g       | global
v       | inverse global

Here is the delete command in action:

p
The typical American person wastes more than 1,600 hours a
d
p
year to his car.  He sits in it while it goes and while it
u
p
The typical American person wastes more than 1,600 hours a

Notice that the delete command causes no output, and that the line after the deleted one becomes the current line. Also, when the deletion is undone, the restored line becomes the current line.

Searching

We can look for a line containing a particular word by using a command like this:

/spends/
tickets.  He spends four of his sixteen waking hours on the

The search starts at the current line; it continues towards the last line of the file and wraps round from the last line to the first one if needed. It is also possible to search backwards, from the current line towards line one:

?petrol?
He works to pay for petrol, tolls, insurance, taxes and

This search also wraps round.

Context addresses

These ways of referring to lines:

/spends/
?petrol?

are known as context addresses; they can be used as if they were line numbers. Notice that we can use REs in context addresses; we aren't limited to fixed text. Note, however, that the delimiters are fixed.

Missing RE

If we do not give ed anything where it is expecting a RE, it re-uses the last RE we did give. This is useful because it saves us typing -- as shown here:

/[A-Z][a-z][a-z]*/
He works to pay for petrol, tolls, insurance, taxes and
//
tickets.  He spends four of his sixteen waking hours on the
/
road or gathering resources for it.  The model American puts

The same thing applies to the substitute command too:

s//A/p
road or gathering resources for it.  A model American puts

In all the above commands ed re-used the RE (which represents a word starting with a capital letter). The three searches found: He, He (again) and The. The substitute replaced The with A.

Notice how ed is engineered to make life easier. We don't even need to type the slash at the end of a missing RE -- as demonstrated in the last of our searches.

Line numbers

If we put a line number in front of an ed command it will affect only the appropriate line of the file. We can also specify a range of lines by giving two line numbers separated by a comma. So this:

11p

would display line 11, and this:

2,6p

would display lines two through to six. If we had used d instead of p, we would have deleted five lines.

This example shows line numbers in front of the substitute command:

2s/car/automobile/p
year to his automobile.  He sits in it while it goes and while it
1,4s/ it / his car /gp
money to put down on his car and to meet the monthly instalments.

Notice that ed makes the last line it changed (here, the last line of the range) the new current line, and displays only that line, although it may have changed all four lines.

Special line numbers

This table shows some special line numbers and their meanings:


Symbol | Meaning
-------+----------------------------
1      | the first line
$      | the last line
.      | the current line
0      | the line before the first!

If a line number is left out, ed assumes the current line is referred to. Therefore the following ranges can be used:


Range           | Meaning
----------------+----------------------------
1,$             | all lines
,$              | from the current line to the end of the file
1,              | from line one to the current line
.-2,.+2         | the five lines centred on the current line
/parks/,/four/  | from the line containing parks
                | to the line containing four

For example, this command would display all of the file:

1,$p

But, we needn't see the output!

Global command - g

The trouble with trying to change all lines in a file by using 1,$ is that the command fails, with ed's usual question mark, if the change cannot be made on any line in the range. To solve this problem we can use the global command; it has this format:

g/RE/commands

where commands represents any ed commands we wish to perform on the lines that match the RE. Here is an example:

g/car/p
year to his automobile.  He sits in his car while his car goes and while it
stands idling.  He parks his car and searches for it.  He earns the
money to put down on his car and to meet the monthly instalments.

In the example, commands has been replaced with the p command which displays the line. The net effect of our global command therefore is to display all lines containing car.

Notice the format of the last command. It was g/RE/p, or, ignoring the slashes, gREp. That is where grep gets its name: it is a program that performs this one command efficiently.

Here is another g command:

g/automobile/s//car/g

Here the g command is controlling an s command. See how we didn't have to type the RE again; ed re-used the previous one which in this case was automobile. The advantage of using the global command is that it can safely be used to change all lines. There is no problem if none of them contain automobile.

Do not confuse the g at the start of the line with the one at the end. The first one means: all lines containing the RE; the second means: all possible changes on the line.

If we wish to see the changed lines we can use a command like this:

g/his car/s//it/gp
year to it.  He sits in it while it goes and while it
stands idling.  He parks it and searches for it.  He earns the
money to put down on it and to meet the monthly instalments.

It is the p suffix on the substitute command that makes the difference.

Often, we wish to delete certain lines. Here is how it is done:

g/ it /d

We used the d command to delete the lines containing the RE.

The inverse of g - v

The v command is the inverse of g in that the commands are performed on lines that do not match the RE. This command would delete the lines that the previous command did not delete:

v/ it /d

Since all lines have now been deleted, we must take care not to let ed write the changes back to disk. The w command would do that. Instead, all we have to do is quit, using the q command:

q
?
q
$

Notice, we have to repeat the command to confirm that we are aware that we have altered the file but not written it to disk.

Stream editor - sed

The editor ed was designed for interactive use. Typically, it moves backwards and forwards from line to line in a file randomly, making changes as directed by the user. The other line editor, sed, is a stream editor. It always starts at the first line and works through, line by line, towards the end of the text. The changed text is never put back into a file by sed; it simply puts the new version of the text onto the standard output. It is not restricted to working on text in a file -- it can edit its standard input too. It is more efficient than ed for non-interactive edits and can handle larger files.

Apart from those differences in how the two editors operate, there is an important difference between ed and sed commands. The ed commands are done only on the current line, unless we specify otherwise. The sed commands are done on every line, unless we specify otherwise.

For example, this ed command:

s/He/She/

would only affect the current line. When we use it with sed it is attempted on every line, as shown:

$ sed 's/He/She/' cars
The typical American male devotes more than 1,600 hours a
year to his car.  She sits in it while it goes and while it
stands idling.  She parks it and searches for it.  He earns the
money to put down on it and to meet the monthly instalments.
She works to pay for petrol, tolls, insurance, taxes and
tickets.  She spends four of his sixteen waking hours on the
road or gathering resources for it.  The model American puts
in 1,600 hours to get 7,500 miles: less than five miles per
hour.  In countries deprived of a transportation industry,
people manage to do the same, walking wherever they want to
go, and they allocate only three to eight percent of their
society's time budget to traffic instead of 28 per cent.

Ivan Illich
$

As you can see, the first of sed's arguments is the editing command; the second is a file name. The quotation marks around the editing command were, strictly speaking, not necessary. When we looked at grep, we saw that it is usually a good idea to wrap REs in quotation marks so that the shell does not interpret the metacharacters and spaces. The same reasoning applies to sed's commands.

A very important point about the previous example is that cars is just an input file. Its contents were not altered. The edited text was only sent to the standard output, not put back into the file.

If we need to save the new version of the text, we have to do this:

$ sed 's/He/She/' cars > newcars
$

which redirects the output into a file.
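
The point that sed leaves its input file alone can be checked with a small experiment. This sketch works in a scratch directory created by mktemp -d (assumed to be available, as it is on most modern systems) and uses a one-line stand-in for the cars file; the variable names are illustrative.

```shell
# Work in a scratch directory so that no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' 'year to his car.  He sits in it while it goes and while it' > "$workdir/cars"

before=$(cat "$workdir/cars")

# Edit the text; the result goes to newcars, not back into cars.
sed 's/He/She/' "$workdir/cars" > "$workdir/newcars"

after=$(cat "$workdir/cars")
edited=$(cat "$workdir/newcars")

printf 'cars:    %s\nnewcars: %s\n' "$after" "$edited"

# Tidy up the scratch directory.
rm -r "$workdir"
```

Never redirect the output into the input file itself (sed ... cars > cars): the shell empties the file before sed gets a chance to read it.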

Just as with ed, we can use line numbers or context addresses. For example:

$ sed '2s/He/She/' cars > newcars
$ sed '2,6s/He/She/' cars > newcars
$ sed '/typical/,/parks/s/He/She/' cars > newcars
$
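
The effect of each kind of address is easier to see on a very short text. The three-line sample below is made up for this sketch (it is not the cars file), as are the variable names.

```shell
# A made-up three-line sample text.
text='He drives.
He parks.
He pays.'

# A single line number: only line 2 is changed.
line_two=$(printf '%s\n' "$text" | sed '2s/He/She/')

# A range of line numbers: lines 2 and 3 are changed.
from_two=$(printf '%s\n' "$text" | sed '2,3s/He/She/')

# Context addresses: from the line containing parks to the one containing pays.
by_context=$(printf '%s\n' "$text" | sed '/parks/,/pays/s/He/She/')

printf '%s\n---\n%s\n---\n%s\n' "$line_two" "$from_two" "$by_context"
```

In this sample the last two commands pick out the same lines, so their output should be identical.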

Multiple commands

So far, we have only done one change at once with sed. Usually however, we wish to do several. Here is how it is done:

$ sed -e 's/car/auto/g' -e 's/petrol/gas/g' cars > newcars
$

The difference is that each editing command is preceded by the -e option; otherwise sed assumes the second and subsequent editing commands are file names.

For clarity, we could split a long command over several lines, with each editing command on its own line:

$ sed -e 's/car/auto/g' \
>     -e 's/petrol/gas/g' \
>     -e 's/tickets/fines/g' cars > newcars
$

Don't be confused by the >; it is the secondary prompt, used by the shell to remind the user that the command has spilled over onto another line. It was displayed by the shell, not typed by me.

We could achieve the same effect like this:

$ sed 's/car/auto/g
>      s/petrol/gas/g
>      s/tickets/fines/g' cars > newcars
$

This last method, without the -e options and with only one set of quotation marks, is by far the neatest. Notice how, in the previous two examples, the editing commands have been lined up with each other to enhance readability.
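
A quick way to convince ourselves that the two styles are equivalent is to run both on the same text and compare the results. The sample line and the variable names in this sketch are made up.

```shell
# A made-up sample line containing all three words.
line='He pays for petrol and tickets for his car.'

# Style 1: one -e option per editing command.
with_e=$(printf '%s\n' "$line" |
    sed -e 's/car/auto/g' -e 's/petrol/gas/g' -e 's/tickets/fines/g')

# Style 2: one quoted argument with the commands on separate lines.
with_newlines=$(printf '%s\n' "$line" | sed 's/car/auto/g
s/petrol/gas/g
s/tickets/fines/g')

printf '%s\n' "$with_e" "$with_newlines"
```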

Most common usage

The next example shows probably the most common way of using sed.

$ sed '/Man/d
>      /car/s//auto/g
>      /petrol/s//gas/g
>      /tickets/s//fines/g' cars > newcars
$

All the commands look just like ed's g command except they don't start with g! And they work exactly the same, executing the command after the initial RE only on the lines that match the RE. The empty RE in the substitute commands causes the previous RE (which is car in the first substitute command) to be re-used. The big advantage of this variation is that all the editing commands begin with the RE, making the whole thing easier to read.
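
Here is a small sketch of that style at work on a made-up three-line text, chosen so that one line contains Man and is deleted. The text and the variable names are assumptions for illustration only.

```shell
# A made-up sample text; the first line contains Man.
text='Man and machine.
He parks his car near the car park.
He pays for petrol.'

# Each command starts with the RE that selects its lines;
# the empty RE in each s command re-uses that same RE.
edited=$(printf '%s\n' "$text" | sed '/Man/d
/car/s//auto/g
/petrol/s//gas/g')

printf '%s\n' "$edited"
```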

Quiet operation - -n

Usually, sed writes every line to the standard output after making any changes. The -n option and sed's p command allow us to display only certain lines of a file. This example displays line four:

$ sed -n '4p' cars
money to put down on it and to meet the monthly instalments.
$

And this example makes sed do the same as grep:

$ sed -n '/model/p' cars
road or gathering resources for it.  The model American puts
$

That is, it displays the lines matching the RE (model).
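
The sketch below compares sed -n with grep directly, and also shows that -n and p work with a range of context addresses like those in the earlier table of ranges. The four-line sample text and the variable names are made up.

```shell
# A made-up four-line sample text.
text='He earns the money.
He parks the car.
He spends four hours.
The model American.'

# These two commands should select exactly the same lines.
by_sed=$(printf '%s\n' "$text" | sed -n '/He/p')
by_grep=$(printf '%s\n' "$text" | grep 'He')

# Display only the lines from the one containing parks
# to the one containing four.
range=$(printf '%s\n' "$text" | sed -n '/parks/,/four/p')

printf '%s\n---\n%s\n' "$by_sed" "$range"
```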

Standard input

So far, we have only seen sed operating on a file. This example is different because it shows sed working on its standard input:

$ date | sed 's/:/ /g' | wc -w
       8
$

Here, the standard input comes via a pipe from the date command; sed changes the colons in the time to spaces and then wc counts the words in sed's output.

This may be a contrived example but it demonstrates a very common way of using sed.
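
Because the output of date changes from moment to moment and from system to system, the same idea is easier to check with a fixed string. In this sketch the time 12:34:56 is a made-up stand-in for the time that date would print; the variable names are illustrative.

```shell
# A made-up time stamp standing in for part of date's output.
stamp='12:34:56'

# Before editing, the time is a single word.
before=$(printf '%s\n' "$stamp" | wc -w | tr -d ' ')

# After the colons become spaces, it is three words.
after=$(printf '%s\n' "$stamp" | sed 's/:/ /g' | wc -w | tr -d ' ')

printf 'words before: %s, after: %s\n' "$before" "$after"
```

The tr command only strips the padding that some versions of wc put in front of the count.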

sed: command garbled

The following command looks perfectly straightforward.

sed 's/model/typical/ ' cars

Surely it just changes model in its input to typical? In fact when we run it, we get the following:

sed: command garbled: s/model/typical/

The problem is that sed is very intolerant of extra spaces, and there is one at the end of the substitute command, just before the final quotation mark. To make matters worse, the extra space is invisible in sed's error message. Also, the error message is usually all sed has to say when you get an editing command wrong.

QUESTIONS

In these questions, you have to work out the effect of the given substitute command on this line of text:

Smith, A.B.

Each substitution is to be done on the original line, not on the result of the previous substitution.

  1. s/ith/ythe/

  2. s/./-/

  3. s/./-/g

  4. s/.*/-/

  5. s/,/./

  6. s/\./,/g

  7. s/m.t/mot/

  8. s?m.t?mot?

  9. s/^/Jean /

  10. s/$/ (Ms)/

  11. s/[mi]/X/

  12. s/[im]/X/

  13. s/[mi][mi]*/X/

  14. s/[mi]*/X/

  15. s/[A-Z]/X/

  16. s/[a-z]/X/

  17. s/[a-z][a-z]*/X/

  18. s/[A-Z]/X/g

  19. s/[^AB]/X/g

  20. s/[^,]*/X/

  21. s/^[^,]*, .\./X/

  22. s/.*/& (&)/

  23. s/^\([^,]*\), *\(.*\)/\2 \1/

ANSWERS

  1. Smythe, A.B.
    
  2. -mith, A.B.
    

    Remember . matches any character. Because matching starts on the left it actually matches our S so that is replaced with a single -.

  3. -----------
    

    Because of the g suffix, ed makes eleven substitutions:

    Substitution | Match | Replacement
    -------------+-------+------------
               1 |   S   | -
               2 |   m   | -
               3 |   i   | -
    etc
              11 |   .   | -
    

    That is why the replacement string appears eleven times.

  4. -
    

    The RE (zero or more of any character) matches Smith, A.B. which is chopped out and replaced with a single -.

  5. Smith. A.B.
    

    A dot in the replacement has no special meaning!

  6. Smith, A,B,
    

    The backslash stops the dot being a metacharacter so it simply matches a literal dot. The g suffix causes two substitutions:

    Substitution | Match     | Replacement
    -------------+-----------+------------
               1 | . after A | ,
               2 | . after B | ,
    
  7. Smoth, A.B.
    

    The dot matches any character but the m and t provide a context, making it match the i. The matched text (mit) is then replaced with mot.

  8. Smoth, A.B.
    

    Same command apart from the delimiters.

  9. Jean Smith, A.B.
    
  10. Smith, A.B. (Ms)
    
  11. SXith, A.B.
    

    The RE matches the m which is the first character (from the left) that is an m or an i. The RE matches just a single character and that is replaced with the replacement string (X).

  12. SXith, A.B.
    

    The RE again matches the m because it is the first character that is an i or an m.

  13. SXth, A.B.
    

    The first [mi] matches the m. The [mi]* matches zero or more is or ms. Because the m has been matched, the first possibility for matching is the i. The next possibility is the t but that fails. So, the whole RE matches mi which is chopped out and replaced with the replacement string (X).

  14. XSmith, A.B.
    

    This is a catch question that demonstrates a common mistake. The RE matches zero or more ms or is. However, there are none of them before the S so nothing is removed from before the S and the replacement string is inserted at that place.

    Think of this as being like the "no elephants" that can be found in every classroom. In general, REs ending with an asterisk have to follow, or be followed by, some context to anchor them somewhere in the line. The first [mi] in the previous question anchored the rest of the RE after the m.

  15. Xmith, A.B.
    

    An RE of this form matches just a single character.

  16. SXith, A.B.
    
  17. SX, A.B.
    

    Another way of looking at this one is it consists of an RE that matches one character plus an RE that matches zero or more characters. The result is an RE that matches one or more characters.

    Generally, this RE matches a lower case word.

  18. Xmith, X.X.
    

    An earlier question with a suffix.

  19. XXXXXXXAXBX
    

    The RE represents a single character that is not an A or a B. The substitution is done eight more times because of the suffix.

  20. X, A.B.
    

    Because there is no context, the RE match begins before the S and continues to the h which is the last character before the comma. The matched character string is removed and replaced with a single X.

  21. XB.
    

    This is the previous RE again but to the right we have added: a comma, a space, any single character and a literal dot. To the left, the first caret (^) anchors the matching to the start of the line. Since this match would start there anyway, the extra caret makes no difference to the result. The longer matched text is again replaced with the replacement string.

  22. Smith, A.B. (Smith, A.B.)
    

    The RE matches the whole line. Remember the special meaning of ampersand (&) in the replacement string.

  23. A.B. Smith
    

    This RE also matches the whole line. The strings matched by the parts of the RE inside the escaped parentheses, \( and \), are stored away to be used in the replacement string. The RE in the first set of parentheses matches Smith. The RE in the second set matches A.B. -- the replacement string, therefore, is the second match, a space and the first match.

    Notice that some parts of the RE (the ^, and the comma followed by zero or more spaces) are outside the parentheses. These parts take part in the RE matching but "disappear" because they are not in the replacement string.