#20: Multi-line sed search and replace
sed is a very famous stream editor on UNIX systems. It is powerful and versatile and makes manipulating streams and text files very easy. However, it has a steep learning curve. Like vi/vim, sed might seem unwieldy at the beginning, but as soon as you begin to understand the tool, it makes your workflow very efficient.
sed is built to process text (either from STDIN or from a file) line by line. Therefore, you can't search across multiple lines like this:
sed 's/foo\nbar/bla\nblub/'
That doesn't work because the pattern space, in which all operations are performed, contains only one line at a time. Matching or replacing newline characters doesn't work that way either, because sed strips the trailing newline before a line is placed in the pattern space. But there are several ways to work around this. I'll show you three ways of performing multi-line replacements.
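A quick way to see this for yourself is to feed two lines into the command and compare the result with the input. This is only a small sketch, assuming GNU sed and a POSIX shell; the variable names are made up for illustration.

```shell
# Two input lines: "foo" and "bar".
input=$(printf 'foo\nbar\n')

# The naive attempt: the pattern space holds "foo", then "bar",
# but never both at once, so the pattern cannot match.
naive=$(printf '%s\n' "$input" | sed 's/foo\nbar/bla\nblub/')

# Prints the two input lines unchanged.
printf '%s\n' "$naive"
```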
The first way is the one you normally find when googling for the topic. It makes use of the N command, which reads the next line and appends it to the pattern space, separated by a newline character.
sed '/foo/{ N; s/foo\nbar/bla\nblub/ }'
This looks for foo and, if found, appends the next line and does the replacement. But this method has a catch. If you have input like this, it won't work:
foo
foo
bar
The first occurrence of foo is found and the next line is appended to the pattern space. But of course
foo
foo
doesn't match the multi-line pattern, so these two lines are printed unchanged and the pattern space is replaced by the next line, which is
bar
but this doesn't match foo. The second foo, the one that is actually followed by bar, is never tried as the start of a match. You see, this method is very rough and only works in a few cases.
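The failure described above is easy to reproduce. The sketch below also shows a common workaround that is not part of the three methods in this article: a two-line sliding window built from the N, P and D commands. It assumes GNU sed, and the variable names are only illustrative.

```shell
input=$(printf 'foo\nfoo\nbar\n')

# Method one: the first foo swallows the second one, so nothing is replaced.
method_one=$(printf '%s\n' "$input" | sed '/foo/{ N; s/foo\nbar/bla\nblub/; }')

# Sliding window: always look at two consecutive lines.
#   $!N  append the next line (skipped on the last line)
#   s    try the replacement on the two-line window
#   P    print only the first line of the window
#   D    delete that first line and restart without reading new input
window=$(printf '%s\n' "$input" | sed '$!N; s/foo\nbar/bla\nblub/; P; D')

printf '%s\n---\n%s\n' "$method_one" "$window"
```

Because the window moves forward one line at a time, every line gets a chance to be the start of a match. It only ever sees two lines, though, so it fits patterns that span exactly two lines.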
But we can do better. Besides the pattern space, sed also provides the so-called hold space. This is just a temporary buffer on which no operations are performed. We can use it to collect the whole input first and then replace the contents of the pattern space with the contents of the hold space. That looks like this:
sed -n '1h; 1!H; ${ g; s/foo\nbar/bla\nblub/; p; }'
This first copies the first line from pattern space into hold space (1h), replacing whatever hold space currently contains. Then all lines except line 1 are appended to hold space (1!H). The reason why we cannot simply use 1,$H is that H always puts a newline in front of the text it appends, and since hold space starts out empty, the result would begin with a blank line. As soon as the last line is reached (address $), a block is opened which copies the contents of hold space into pattern space (g) and does the replacement. Normally sed prints the pattern space at the end of every cycle, so each input line would show up in the output in addition to the final result. To avoid this, the option -n (no automatic output) is set and the edited text is printed explicitly with the p command inside the block. Note the semicolon in front of p: without it, p would be read as a flag of the s command and the text would only be printed if a replacement actually took place. This method works remarkably well, but you should note that it has to keep the whole input in memory and gets slow if the stream or file is very long. One advantage of sed over many other tools is that it reads line by line, so it doesn't need more memory when working on long input. This advantage is lost with this method. Keep that in mind.
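To try the hold space approach, here is a small sketch with two matches in the input and the g flag added to the s command. The p is written as its own command so the text is printed even when nothing matched. It assumes GNU sed; the variable names are hypothetical.

```shell
input=$(printf 'foo\nfoo\nbar\nfoo\nbar\n')

# Collect every line in hold space, then edit the whole text once
# when the last line has been read.
slurp='1h; 1!H; ${ g; s/foo\nbar/bla\nblub/g; p; }'
replaced=$(printf '%s\n' "$input" | sed -n "$slurp")

# Input without any match: it is still printed, unchanged.
untouched=$(printf 'one\ntwo\n' | sed -n "$slurp")

printf '%s\n---\n%s\n' "$replaced" "$untouched"
```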
Another way that came to my mind is to omit the hold space and append lines directly to the pattern space with N. That's a mixture of methods one and two.
sed '1!N; s/foo\nbar/bla\nblub/'
sed automatically reads line 1 into pattern space at the start of the first cycle. On every later cycle, 1!N appends the following line to the line that has just been read. Then the replacement is performed. Done! Short and nifty. But note that the pattern space is still printed and emptied at the end of each cycle, so the input is not collected as a whole: line 1 is processed on its own and the remaining lines in pairs (2 and 3, 4 and 5, and so on). The replacement therefore only happens if foo falls on an even line with bar right after it. If you need to catch matches wherever they occur, or several of them with the g flag, better use the second method. Of course, you can also use this with the -n option and the p command, but then you have to put a semicolon after the replacement command. Otherwise p is read as a flag of s and you'd only get the parts which have been replaced. The rest of the input would not be printed to the screen. So
sed -n '1!N; s/foo\nbar/bla\nblub/ p'
is something different than
sed -n '1!N; s/foo\nbar/bla\nblub/; p'
For the three-line input from above (foo, foo, bar), the first one would only output
bla
blub
and the second one
foo
bla
blub
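One thing to watch with the 1!N command is how it groups lines: line 1 is handled alone and the rest in pairs, so a match that starts on line 1 is missed. The sketch below shows that case. It also shows a loop with a label and the branch command b, which really does gather all lines in the pattern space before substituting. The loop is an addition to the three methods above, it assumes GNU sed, and like the hold space method it keeps the whole input in memory.

```shell
# foo/bar on lines 1 and 2: line 1 is processed alone, so nothing matches.
missed=$(printf 'foo\nbar\nbaz\n' | sed '1!N; s/foo\nbar/bla\nblub/')

# Loop variant: keep appending lines until the last one, then substitute once.
#   :a    define the label a
#   N     append the next line to the pattern space
#   $!ba  unless this is the last line, branch back to a
all=$(printf 'foo\nbar\nfoo\nbar\n' |
  sed -e ':a' -e 'N' -e '$!ba' -e 's/foo\nbar/bla\nblub/g')

printf '%s\n---\n%s\n' "$missed" "$all"
```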
That's it, hope you learned a bit.
Read more about sed multi-line replacement:
- GNU.org: sed info manual
- Unix Sed Tutorial: Multi-Line File Operation
- sed and Multi-Line Search and Replace