#15: Named and unnamed pipes


Pipes are used very often on UNIX systems. They connect two programs in such a way that the STDOUT of the first program becomes the STDIN of the second. For instance, this

program1 | program2

connects the STDOUT of program1 to the STDIN of program2. Because pipes are so simple to understand and so convenient, they are often used when they are not necessary (e.g. in combination with cat), as I described in #4: Cut your use of cat, published earlier this month. But that's not the topic of this article. Here I'll show you what you can accomplish with pipes.

Most of the time pipes are used to connect two ordinary programs, but you can also send a string to the STDIN of a program:

echo 'Hello World!' | myprogram

This construct is very common. Of course, you could also redirect this string to a file first, which then acts as the STDIN source, but that's very inefficient.
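To make the comparison concrete, here is a minimal sketch of three ways to feed the same string to a program. The function myprogram is a hypothetical stand-in (it simply upper-cases its input), and the temporary file comes from mktemp; neither is part of the original example.

```shell
# Hypothetical stand-in for "myprogram": upper-cases whatever arrives on STDIN.
myprogram() { tr '[:lower:]' '[:upper:]'; }

# 1) Pipe: the STDOUT of echo becomes the STDIN of myprogram.
via_pipe=$(echo 'Hello World!' | myprogram)

# 2) Detour through a temporary file: works, but creates and reads a
#    file on disk that nobody needs afterwards.
tmpfile=$(mktemp)
echo 'Hello World!' > "$tmpfile"
via_file=$(myprogram < "$tmpfile")
rm -f "$tmpfile"

# 3) Here-document: the shell supplies the text as STDIN itself.
via_heredoc=$(myprogram <<'EOF'
Hello World!
EOF
)

echo "$via_pipe"
echo "$via_file"
echo "$via_heredoc"
```

All three variants hand the same bytes to myprogram; they differ only in how the shell sets up STDIN.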

Another common use case is output filtering with programs like grep or sort. For instance

ps auxw | grep blub | sort -r

to show only the entries which contain blub, sorted in reverse order. Another use case is to send the output of a program to a console pager for a better reading experience:

myprogram | less

This only makes sense if the output is generated dynamically. To read an existing file, open it with less directly; constructs that pipe it through cat are anything but expedient.
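The filter pipeline with grep and sort shown earlier can be tried out on fixed input, which makes the effect of each stage easier to follow. The sample process list below is made up for illustration and takes the place of real ps output.

```shell
# Made-up sample data standing in for the output of "ps auxw".
processes='alice  101 blub --serve
bob    102 vim notes.txt
carol  103 blub --worker
dave   104 top'

# Stage 1 prints the list, stage 2 keeps only lines containing "blub",
# stage 3 sorts the remaining lines in reverse order.
filtered=$(printf '%s\n' "$processes" | grep blub | sort -r)

printf '%s\n' "$filtered"
```

With real ps output, the grep process itself usually shows up in the result as well, because its own command line contains the search term.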

Unnamed pipes are very easy to understand and probably every Linux user has used them already. But there is another very interesting topic: named pipes. Named pipes, also called FIFOs (first in, first out), work exactly the same way, but they have a file name. A FIFO is a type of file which doesn't store its contents persistently but passes them from a writing process to a reading process. When one process writes to the FIFO, another can read from it in the same order in which the data was written, so the first character written is the first character read.

This doesn't sound very useful and indeed, in most cases unnamed pipes are the better choice, but there are cases in which named pipes can make your life easier. They are very handy if you want two programs to communicate via files. Sure, if the programs write to STDOUT and read from STDIN, you're better off with unnamed pipes, but imagine a program that only writes to a file. For instance, if you have a program program1 which writes to the file out.txt and you want to process that output with program2, you could either run program1 first and program2 afterwards, or you could make out.txt a FIFO:

mkfifo out.txt
program1 --file out.txt

program1 will normally wait until a reader opens the FIFO, so now open another terminal and run

< out.txt program2
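The two-terminal setup can also be reproduced in a single script by putting the writer into the background. This is only a sketch: program1 is a hypothetical function that writes three lines to the file it is given, sort -r plays the role of program2, and the FIFO lives in a scratch directory created with mktemp -d.

```shell
workdir=$(mktemp -d)
fifo="$workdir/out.txt"
mkfifo "$fifo"

# Hypothetical stand-in for program1: writes three lines to the given file.
program1() { printf 'line %s\n' 1 2 3 > "$1"; }

# The writer blocks when opening the FIFO until a reader shows up,
# so it has to run in the background here.
program1 "$fifo" &
writer_pid=$!

# Stand-in for program2: reads from the FIFO as if it were a regular file.
received=$(sort -r < "$fifo")
wait "$writer_pid"

printf '%s\n' "$received"
rm -rf "$workdir"
```

No data ever reaches the disk; the FIFO only exists as a name in the file system through which the two processes find each other.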

The advantage is that you don't waste disk space on superfluous temporary files. The example above is still very abstract, but we can make it more concrete. Imagine you have a program myprogram that creates a large log file which you want to compress. You could first run myprogram and compress the log file afterwards, but that would take up a lot of disk space. Instead, you could write the log contents to a named pipe and compress them on the fly:

mkfifo out.log
myprogram --logfile out.log

Open another terminal and run

< out.log bzip2 -9 -c > out.log.bz2
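Here is a self-contained sketch of the same idea that can be run in one go. It assumes a hypothetical myprogram function that writes 1000 log lines to the path it receives, starts the compressor in the background instead of a second terminal, and falls back to gzip if bzip2 is not installed (both accept -9, -c and -d).

```shell
workdir=$(mktemp -d)
fifo="$workdir/out.log"
mkfifo "$fifo"

# Use bzip2 as in the article; fall back to gzip where bzip2 is missing.
compressor=bzip2
command -v bzip2 >/dev/null 2>&1 || compressor=gzip

# Hypothetical stand-in for myprogram: writes 1000 log lines to the given path.
myprogram() {
    i=1
    while [ "$i" -le 1000 ]; do
        echo "log entry $i"
        i=$((i + 1))
    done > "$1"
}

# Reader: compresses everything that arrives through the FIFO.
"$compressor" -9 -c < "$fifo" > "$workdir/out.log.compressed" &
reader_pid=$!

# Writer: believes it is writing an ordinary log file.
myprogram "$fifo"
wait "$reader_pid"

# Check the result by decompressing and counting the lines.
restored_lines=$("$compressor" -d -c < "$workdir/out.log.compressed" | wc -l | tr -d ' ')
echo "restored $restored_lines log lines"
rm -rf "$workdir"
```

The uncompressed log never exists as a regular file; only the compressed output is written to disk.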

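You can also do this the other way round and read from a compressed file. A good example, which can be found on Wikipedia, is loading compressed data into a MySQL table. Keep in mind that LOAD DATA INFILE expects delimited text rows (tab-separated by default), not the SQL statements of a regular dump. First prepare the FIFO and start decompressing into it:

mkfifo /tmp/dump_uncomp
bunzip2 -c data.tsv.bz2 > /tmp/dump_uncomp

The bunzip2 command waits until a reader opens the FIFO. Then load the data from within MySQL: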

LOAD DATA INFILE '/tmp/dump_uncomp' INTO TABLE `mytable`;
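The reading direction can be tested without a database. In the sketch below a plain while read loop stands in for the MySQL import, the three tab-separated rows are made-up sample data, and gzip is used as a fallback if bzip2 is not available; all of these are assumptions for illustration.

```shell
workdir=$(mktemp -d)

# Use bzip2 as in the article; fall back to gzip where bzip2 is missing.
compressor=bzip2
command -v bzip2 >/dev/null 2>&1 || compressor=gzip

# Made-up sample data: three tab-separated rows, stored compressed.
printf '1\talice\n2\tbob\n3\tcarol\n' | "$compressor" -c > "$workdir/rows.compressed"

mkfifo "$workdir/dump_uncomp"

# Decompress into the FIFO; this blocks until a reader opens it,
# so it runs in the background.
"$compressor" -d -c < "$workdir/rows.compressed" > "$workdir/dump_uncomp" &
decompress_pid=$!

# Stand-in for the database import: read the rows from the FIFO.
row_count=0
while IFS="$(printf '\t')" read -r id name; do
    row_count=$((row_count + 1))
    last_name=$name
done < "$workdir/dump_uncomp"
wait "$decompress_pid"

echo "imported $row_count rows, last name: $last_name"
rm -rf "$workdir"
```

The consumer only ever sees uncompressed rows, and the decompressed data is never stored as a regular file.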

That's basically it. On the console, named pipes are not used very often, but sometimes they can be really handy.





Comments (3):
PuZZleDucK wrote:
Love the series Janek. I'm curious to your reasoning behind the statement "you could also redirect this string to a file before, which then acts as the STDIN source, but that's very inefficient". I would have thought using redirection would be the more efficient way to do it as "echo 'Hello World!' | myprogram" would need to spawn a process for the echo command. Would "myprogram < 'Hello World!'" implicitly spawn an extra process anyway? Cheers
Janek Bevendorff wrote:
Yes and no. Maybe I worded this a bit confusingly. What I meant was writing the output to a file first and then using that file as the input stream for another program. That also needs two processes, but above all it's time-consuming, i.e. not very time-efficient. And of course you have to create a file on disk and read it back, which you normally wouldn't need to do at all. Another possibility would be to use something like process or command substitution, but the shell handles these as pipes or subshells, too (i.e. as separate processes). So this would be more time-efficient, but would still have the overhead of a normal pipe. If you just want to write or read a file, however, the normal input and output redirection operators are always more efficient. Of course.
bhupesh wrote:
Another link on technical forum: http://www.writeulearn.com/category/inter-process-communicationipc/


Design and Code Copyright © 2010-2024 Janek Bevendorff Content on this site is published under the terms of the GNU Free Documentation License (GFDL). You may redistribute content only in compliance with these terms.