<?xml version="1.0" encoding="utf-8" ?>

<rdf:RDF 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns="http://my.netscape.com/rdf/simple/0.9/">
<channel>
    <title>Refining Linux</title>
    <link>http://www.refining-linux.org/</link>
    <description>Take your Linux to the next level</description>
    <dc:language>en</dc:language>

    <image rdf:resource="http://media.refining-linux.org/design/tux_painting.png" />

    <items>
      <rdf:Seq>
        <rdf:li resource="http://www.refining-linux.org/archives/66/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/65/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/64/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/63/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/62/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/61/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/60/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/59/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/58/guid/" />
        <rdf:li resource="http://www.refining-linux.org/archives/57/guid/" />
      </rdf:Seq>
    </items>
</channel>

<image rdf:about="http://media.refining-linux.org/design/tux_painting.png">
        <url>http://media.refining-linux.org/design/tux_painting.png</url>
        <title>RSS: Refining Linux - Take your Linux to the next level</title>
        <link>http://www.refining-linux.org/</link>
        <width>144</width>
        <height>93</height>
    </image>


<item rdf:about="http://www.refining-linux.org/archives/66/guid/">
    <title>Performing push backups – Part 2: rsnapshot</title>
    <link>http://www.refining-linux.org/archives/66/Performing-push-backups-Part-2-rsnapshot/</link>
    <description>
    <![CDATA[
      <p>After I discussed a possible backup solution using <code>rdiff-backup</code> in <a href="/archives/65/Performing-push-backups-Part-1-rdiff-backup/">the last part of this series</a> I want to show you the second tool which is <code>rsnapshot</code>.</p>

<p>As I already pointed out, I'm not using <code>rdiff-backup</code> anymore. The main reason is that it is simply too slow. I'm using a Raspberry Pi as my NAS and it is absolutely not capable of handling larger backups with <code>rdiff-backup</code>. It works for smaller backup sizes, but not for my entire home directory. Even when I pushed the initial full backup directly to the backup disk (not using my Raspberry Pi), all subsequent incremental backups were still unbearably slow. Even when no files had changed at all, it took hours upon hours just to compare all the files in my home directory to those on the NAS, whereas a full comparison using <code>rsnapshot</code> is done within five to ten minutes. Now keep this in mind and consider that incomplete backups made with <code>rdiff-backup</code> can't be resumed. You can imagine that in the end you wouldn't have any backup at all. Basically all <code>rdiff-backup</code> would do is compare and push your files during the day and abort in the evening when you shut down your workstation. The next day it would then spend all its time reverting the incomplete backup and running another one which might not finish either.</p>

<p>So this is the main reason I stopped my experiments with <code>rdiff-backup</code>. It was a nice time, but I finally moved on. Therefore say hello to our new precious star: <code>rsnapshot</code>!</p>

<p><strong>TL;DR:</strong> I have created a GitHub project page with a much more mature version of the scripts I discuss throughout this article. It is linked <a href="#SecTheGitHubProjectPage">at the end</a>.</p>

<p>Basically we want to do exactly the same with <code>rsnapshot</code> that we did with <code>rdiff-backup</code>. We want to push our backups from the client to the NAS and we want to keep the privileges separated. That means, there should be a general backup of all global system files run by root and then the backups of the individual home directories run by the users themselves (i.e. with their privileges and UIDs). The whole backup process should be as flexible as possible and each user should decide by himself which of his own files to include in a backup and which to explicitly exclude. Therefore we will again read the list of files and folders which are to be backed up from special files in a) the global system <code>etc</code> folder and b) the users' home directories.</p>

<p>One sticking point we have to solve is that <code>rsnapshot</code> by itself does not support push backups at all. Since it works with hardlinks, it needs to operate on a local file system. But fear not, there is a simple yet effective solution, which I (to be honest) borrowed from <a href="http://www.mad-hacking.net/documentation/linux/reliability/backup/using-rsnapshot-laptops.xml">mad-hacking.net</a>. <code>rsnapshot</code> is based on <code>rsync</code>, and <code>rsync</code> by itself is capable of pushing files to another host over SSH by starting a daemon process on the remote side. Furthermore, <code>rsnapshot</code> can keep the syncing and the rotation process separate, which is a good idea anyway when we push over the network (imagine all the bad things that might happen <img src="/templates/reflinux-2012/img/emoticons/smile.png" alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" />). With the <code>sync_first</code> option enabled, <code>rsnapshot</code> uses a directory called <code>.sync</code> as the source for the rotation. This is the directory we will push our files to with <code>rsync</code>. That means we will use <code>rsnapshot</code> only for the rotation, not for the actual push backup itself.</p>

<h2>The server side</h2>
<p>Again, let's start with the server side, i.e. the NAS. As already done in the last part I will call the server/NAS <code>bellatrix</code> and the client that pushes its files <code>altair</code>.</p>

<p>The storage which will hold all the backups is mounted under <code>/bkp</code> and that's where our backup users will have their home directories. For each user on each client we will have one user on the server. So first of all create a new group and two new users for backups from <code>altair</code> on the server:</p>

<pre class="sourcecode"><code class="lang-zsh">groupadd backup
useradd -G backup -b "/bkp" -m -p '*' -s /bin/sh "altair-root"
useradd -G backup -b "/bkp" -m -p '*' -s /bin/sh "altair-johndoe"
</code></pre>

<p>The first account will be for backups run as <code>root</code> on <code>altair</code>, the second one for backups run as the user <code>johndoe</code>.</p>

<p>Next we need to prepare the home directories. I'll do it for <code>altair-root</code>, the process is the same for <code>altair-johndoe</code> and any other user of course.</p>

<p>The backup files should all go to the directory <code>$HOME/files</code> and as described above we will use a folder called <code>.sync</code> inside of it for the actual syncing. Thus we need to create them both:</p>

<pre class="sourcecode"><code class="lang-zsh">mkdir -p "/bkp/altair-root/files/.sync"</code></pre>

<p>To enable <code>root</code> on <code>altair</code> to log into the NAS as <code>altair-root</code> we also need to set up an SSH key. But of course, we don't want to grant full shell access but only an <code>rsync</code> daemon instance. Therefore generate a new SSH key on the client and create a folder called <code>.ssh</code> inside <code>/bkp/altair-root</code> on the server. Inside that folder create a file called <code>authorized_keys</code> with the following content:</p>

<pre class="sourcecode"><code class="lang-zsh">command="/usr/bin/rsync --server --daemon --config='/bkp/altair-root/rsync.conf' ." ssh-rsa AAAAB3NzaC1yc...</code></pre>

<p>(replace the last part with the proper public key, of course)</p>
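
<p>As an optional hardening step, which is an assumption on top of the setup described here and not something it requires, the key can be restricted further with the standard OpenSSH <code>authorized_keys</code> options that disable forwarding and terminal allocation. A sketch of such an entry:</p>

<pre class="sourcecode"><code class="lang-zsh">command="/usr/bin/rsync --server --daemon --config='/bkp/altair-root/rsync.conf' .",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAAB3NzaC1yc...</code></pre>

<p>The options are comma-separated and must appear before the key type. None of them should interfere with the <code>rsync</code> daemon, since it needs neither a terminal nor any forwarding.</p>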

<p>Finally protect the file by setting the file owner to root:</p>

<pre class="sourcecode"><code class="lang-zsh">chown -R root:root /bkp/altair-root/.ssh</code></pre>

<p>This will ensure that a user that logs into <code>bellatrix</code> with that SSH key can only start an <code>rsync</code> daemon with the specified configuration file <code>/bkp/altair-root/rsync.conf</code>. Now create that file with these contents:</p>

<pre class="sourcecode"><code class="lang-ini">[push]
uid = altair-root
gid = altair-root
path = /bkp/altair-root/files/.sync
use chroot = 0
read only = 0
write only = 1
fake super = 1
max connections = 1
lock file = /bkp/altair-root/rsyncd.lock
post-xfer exec = /usr/local/bin/rs-rotate "/bkp/altair-root/rsnapshot.conf"

[pull]
uid = altair-root
gid = altair-root
path = /bkp/altair-root/files
use chroot = 0
read only = 1
fake super = 1</code></pre>

<p>Save the file and make root own it, too.</p>

<p>The configuration directives above define two basic <code>rsync</code> modules, one named <code>push</code> and one named <code>pull</code> (which is read-only and intended for restoring backups from the NAS). For both modules the config file sets the proper target directories, which is the <code>files/.sync</code> folder for the actual backup process and simply the <code>files</code> directory for restoring backups. Processes connecting via <code>rsync</code> to these modules won't be able to escape these directories easily. One very important option in the <code>push</code> module is the <code>fake super</code> option. Since we're running with basic user privileges, we can't properly set things like ownership. Luckily, <code>rsync</code> is able to save this information to the extended user file attributes (xattrs) when this option is enabled. In the past this required the file system to be mounted with the <code>user_xattr</code> option, but if you're using Ext4, it should also work without since it's enabled by default. If you get a <em>permission denied</em> error, though, you might want to remount the file system with that option.</p>

<p>All the other lines should be more or less self-explanatory, but another very important one is</p>

<pre class="sourcecode"><code class="lang-ini">post-xfer exec = /usr/local/bin/rs-rotate "/bkp/altair-root/rsnapshot.conf"</code></pre>

<p>This tells <code>rsync</code> to run the specified command after it finished the syncing process. We will use this to trigger the rotation. The file <code>/usr/local/bin/rs-rotate</code> (could be any other name and location, too) is a very simple shell script:</p>

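<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh

# Require the path to the rsnapshot config file as the only argument
if [ -z "$1" ]; then
        echo "Usage: $(basename "$0") &lt;rsnapshot config&gt;" >&amp;2
        exit 1
fi

# rsync sets RSYNC_EXIT_STATUS when it runs a post-xfer hook
if [ -z "$RSYNC_EXIT_STATUS" ]; then
        echo "This script is intended to be run as rsync post-xfer hook." >&amp;2
        exit 1
fi

# Only rotate if the transfer finished without errors
if [ "$RSYNC_EXIT_STATUS" -eq 0 ]; then
        rsnapshot -c "$1" push
fi</code></pre>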

<p>When the <code>rsync</code> process exited cleanly (i.e. the syncing was entirely successful) it will ask <code>rsnapshot</code> to perform a rotation using the specified configuration file (which we also need to create):</p>

<pre class="sourcecode"><code class="no-highlight">config_version	1.2

cmd_cp			/usr/bin/cp
cmd_rm			/usr/bin/rm
cmd_rsync		/usr/bin/rsync
cmd_logger		/usr/bin/logger

retain			push			2
retain			daily			7
retain			weekly			4
retain			monthly			2

verbose			2
loglevel		3
one_fs			1
sync_first		1

snapshot_root	/bkp/altair-root/files
logfile			/bkp/altair-root/rsnapshot.log
lockfile		/bkp/altair-root/rsnapshot.pid
backup			/bkp/altair-root/files/.sync	./</code></pre>

<p>Save this to <code>/bkp/altair-root/rsnapshot.conf</code>.</p>

<p>The syntax is explained quickly. All entries are separated by tabs (not spaces!) and each non-empty line contains one configuration directive. The first few lines simply define where to find certain command line tools. More interesting are the <code>retain</code> lines. These specify the different backup levels and how many increments are kept. For the push backups we will keep 2 increments. The second increment will then be the basis for the first daily increment. Since we keep 7 dailies, the seventh daily increment will serve as the basis for the first weekly and so on.</p>

<p>Important to note is the <code>sync_first</code> option which needs to be set. Otherwise our whole system would not work. With this option set, though, the syncing would be invoked by running <code>rsnapshot</code> with the <code>sync</code> command which we'll never do. Any other invocation of <code>rsnapshot</code> will simply rotate our files.</p>

<p>Finally the last four lines set the directories <code>rsnapshot</code> should work on. The very last line tells <code>rsnapshot</code> to use the <code>.sync</code> directory as a source and back it up to the current working dir, which is set by <code>snapshot_root</code>. So whichever files are in <code>/bkp/altair-root/files/.sync</code> will be used for rotation.</p>

<p><strong>Side note:</strong> you might also want to split the config file and put everything above <code>snapshot_root</code> in a separate file and reference it via an <code>include_conf</code> directive. This would enable you to write only a very minimal config file for each backup user and simply reuse the options which are the same for everyone.</p>

<p>To invoke a rotation for a specific backup level, use <code>rsnapshot &lt;level&gt;</code> which is exactly what we did in the <code>rsync</code> post-xfer hook script above for the <code>push</code> level.</p>

<p>All the other levels can be rotated by a cron script that is run every day. But be careful: since we rely on push backups, you should only perform a rotation via cron when there are enough increments of the preceding level. Otherwise you'd successively delete your older backups. For instance, if we take the configuration above and keep two push increments and seven daily increments, the rotation cron would delete the seventh daily increment, rotate all preceding dailies by one number and then make the second push increment the new daily.0. No problem so far. But if you ran the rotation again without creating a new push increment first, <code>rsnapshot</code> would think that you had deleted all your files. The result would be that the last daily would be deleted again and all the earlier dailies would be rotated by one increment. But there would be no new daily.0 since there is no push.1, only a push.0. So after five more rotations you'd be left with only your push.0 and no dailies at all.</p>

<p>Therefore you should carefully check the number of existing increments of the preceding level first before doing a rotation. To do this you could put something like this in your cron script:</p>

<pre class="sourcecode"><code class="lang-zsh">config=$(cat "path/to/the/rsnapshot/config.conf")

# Get number of preceding increments
config=$(echo "${config}" | grep -P '^retain\t')
config=$(echo "${config}" | grep -oPz "retain\t+(\w+)\t+(\d+)\nretain\s+${1}\t+" | sed -n 1p)
preceding_name=$(echo "${config}" | awk '{ print $2 }')
preceding_number=$(($(echo "${config}" | awk ' { print $3 }') - 1))

if [ "${preceding_name}" != "" ] &amp;&amp;
   [ -d "path/to/backup/dir/${preceding_name}.${preceding_number}" ]; then
        # Perform rotation
fi</code></pre>
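
<p>The lookup of the preceding level can also be sketched with plain <code>awk</code>, which avoids the multi-line <code>grep</code> pattern. This is only an illustration: the function name <code>preceding_level</code> is made up for this example, and it assumes that the <code>retain</code> lines appear in the config file in rotation order, as in the configuration above.</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh

# Hypothetical helper: print "NAME COUNT" for the retain level that
# directly precedes the level given as $2 in the rsnapshot config
# file given as $1. Prints nothing if the level is the first one or
# is not defined at all.
preceding_level() {
    awk -v level="$2" '
        $1 == "retain" {
            if ($2 == level) {
                if (prev_name != "") print prev_name, prev_count
                exit
            }
            # Remember this level in case the next one is the match
            prev_name = $2
            prev_count = $3
        }
    ' "$1"
}</code></pre>

<p>Called as <code>preceding_level /bkp/altair-root/rsnapshot.conf daily</code> with the configuration shown earlier, it should print <code>push 2</code>, so the directory to check before a daily rotation would be <code>push.1</code>.</p>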

<p>Of course this also means that <em>“daily”</em> doesn't necessarily mean exactly <em>“daily”</em> anymore. Now it rather means something like <em>“backup from the day before the last push backup”</em>.</p>

<h2>The client side</h2>
<p>We have the server side, now we need the client side, which is simple. Very simple. It is exactly the same as described in the last part of the series, you only need to replace the <code>rdiff-backup</code> command with this:</p>

<pre class="sourcecode"><code class="lang-zsh">rsync \
    --rsh=ssh \
    --archive \
    --acls \
    --delete \
    --delete-excluded \
    --include-from="${home_dir}/.rsnapshot-backup-filelist" \
    --exclude="*" \
    / \
    "$(hostname)-${username}@${BACKUP_HOST}::push"</code></pre>

<p>You do nothing more than <code>rsync</code> your files to the server using the <code>push</code> module we defined in the <code>rsync.conf</code> on the server. Similarly, you can pull files from there with <code>rsync</code> in pretty much the same way using the <code>pull</code> module:</p>

<pre class="sourcecode"><code class="lang-zsh">rsync \
    --rsh=ssh \
    --archive \
    "$(hostname)-${username}@${BACKUP_HOST}::pull/file/on/the/server" \
    "./where/the/file/should/go/to"
</code></pre>

<p>Of course you can also use the short options <code>-e</code> and <code>-a</code> instead of <code>--rsh</code> and <code>--archive</code>.</p>

<p>One more thing to note: the file that contains the list of files which should be backed up (<code>~/.rsnapshot-backup-filelist</code> in this case) works a little differently. By default all directories are excluded which are not explicitly included. That is also true for parent directories. So to include, e.g., <code>/home/foo/bar/baz</code> and all directories below you'd have to write:</p>

<pre class="sourcecode"><code class="no-highlight">/home
/home/foo
/home/foo/bar
/home/foo/bar/baz/***</code></pre>

<p>Simply writing <code>/home/foo/bar/baz/***</code> wouldn't be enough since <code>/home</code> is not included explicitly.</p>

<p>Note the slashes: no slash at the end means: <em>“only the directory, not its contents”</em>. A slash at the end, though, means: <em>“the contents of this directory”</em>. The version I used with the three asterisks means <em>“this directory, the files inside it and inside all subdirectories”</em>. So mind the little differences or otherwise your backup might be different from what you intended. For more information about <code>rsync</code> globbing patterns consult the <code>FILTER RULES</code> section in the <code>rsync(1)</code> man page.</p>
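
<p>Writing out all the parent directories by hand is easy to get wrong, so the lines for one directory could also be generated. The following is a minimal sketch in plain <code>sh</code>; the function name <code>include_lines</code> is made up for this example and it expects an absolute path.</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh

# Hypothetical helper: print the include lines needed for one
# directory. Every parent directory gets its own line, followed by
# the directory itself with the /*** suffix.
include_lines() {
    target=${1%/}        # drop a trailing slash, if any
    rest=${target#/}     # remaining path components, no leading slash
    current=""
    # As long as there is a slash left, the first component is a parent
    while [ "${rest#*/}" != "${rest}" ]; do
        current="${current}/${rest%%/*}"
        echo "${current}"
        rest=${rest#*/}
    done
    echo "${target}/***"
}</code></pre>

<p>For <code>include_lines /home/foo/bar/baz</code> this prints the same four lines as in the example above, which can then be appended to the file list.</p>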

<p>For reference: my file currently looks roughly like this:</p>

<pre class="sourcecode"><code class="no-highlight">- *.swp
- *.tmp
- .directory
- Thumbs.db
- desktop.ini
- Desktop.ini
- .DS_Store
- *~
- .Trash/***

- /home/janek/.Xauthority
- /home/janek/.xsession-errors
- /home/janek/.cache
- /home/janek/.dbus
- /home/janek/.codeintel
- /home/janek/.zsh_history
- /home/janek/.bash_history
- /home/janek/.recently-used
- /home/janek/.pulse-cookie
- /home/janek/.config/pulse/***
- /home/janek/.local/tmp/***
- /home/janek/.dropbox/***
- /home-accel/janek/.kde4/cache-*/***
- /home-accel/janek/.kde4/socket-*/***
- /home-accel/janek/.kde4/tmp-*/***
- /home-accel/janek/.kde4/share/apps/nepomuk/***
- /home-accel/janek/.mozilla/firefox/*/Cache/***
- /home-accel/janek/.thunderbird/*/Cache/***

/home
/home/janek/***
/home-accel
/home-accel/janek/***
/srv
/srv/http
/srv/http/virtual
/srv/http/virtual/janek/***</code></pre>

<p>You see: as with <code>rdiff-backup</code> you can also prefix lines with a <code>-</code> to specify explicit excludes. Lines starting with a <code>+</code> or no sign at all are treated as includes.</p>

<h2>Some more polishing</h2>
<p>That's pretty much it, but we can still tweak one thing or two.</p>

<h3>Providing read-only SFTP access</h3>
<p>Currently we can only restore files from the backup server using <code>rsync</code>. This is okay for the restoring process itself, but it makes it hard to browse your backups. Wouldn't it be convenient if we could also mount the backup folder from the server using SFTP/SSHFS? That's very much possible.</p>

<p>You only need to do two things: first of all modify the <code>authorized_keys</code> file on the server like this:</p>

<pre class="sourcecode"><code class="lang-zsh">command="/usr/bin/rs-run-ssh-cmd '/bkp/altair-root'" ssh-rsa AAAAB3NzaC1yc...</code></pre>

<p>and then create the script <code>/usr/bin/rs-run-ssh-cmd</code>:</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh

home_dir=$1

if [ "${SSH_ORIGINAL_COMMAND}" = "internal-sftp" ] || [ "${SSH_ORIGINAL_COMMAND}" = "/usr/lib/ssh/sftp-server" ]; then
    cd "${home_dir}/files"
    exec /usr/lib/ssh/sftp-server -R
else
    exec /usr/bin/rsync --server --daemon --config="${home_dir}/rsync.conf" .
fi

echo "Session failed." >&amp;2

exit 1</code></pre>

<p>This will start an SFTP server when needed, otherwise simply the <code>rsync</code> daemon as before.</p>

<h3>Chroot users into <code>/bkp</code></h3>
<p>If you want a little more security, you can also chroot all backup users into <code>/bkp</code>. For this to work you need to add these lines to your <code>/etc/ssh/sshd_config</code> on the server:</p>

<pre class="sourcecode"><code class="no-highlight">Match Group backup 
        ChrootDirectory /bkp/</code></pre>

<p>After you've done that, all files outside the directory that the services need in order to operate must be made available inside the chroot environment. That means you either need to copy them to <code>/bkp</code> or use bind mounts. Copying has the advantage that you can copy only those files that are really needed, but it also creates a lot of duplicate files. I prefer bind mounts most of the time. Hardlinks are usually not a very good idea.</p>

<p>For <code>rsnapshot</code> you need to bind mount at least <code>/bin</code>,  <code>/usr/bin</code>, <code>/lib</code>, <code>/usr/lib</code> and <code>/usr/share/perl5</code>:</p>

<pre class="sourcecode"><code class="lang-zsh">mkdir -p "/bkp/"{"bin","usr/bin","lib","usr/lib","usr/share/perl5"}
mount -o bind "/bin" "/bkp/bin"
mount -o bind "/usr/bin" "/bkp/usr/bin"
mount -o bind "/lib" "/bkp/lib"
mount -o bind "/usr/lib" "/bkp/usr/lib"
mount -o bind "/usr/share/perl5" "/bkp/usr/share/perl5"</code></pre>
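
<p>Bind mounts created with <code>mount</code> do not survive a reboot. One way to make them persistent, assuming a standard <code>/etc/fstab</code> on the server, is a set of entries like the following. Treat this as a sketch and adjust it to your system:</p>

<pre class="sourcecode"><code class="no-highlight">/bin               /bkp/bin               none  bind  0  0
/usr/bin           /bkp/usr/bin           none  bind  0  0
/lib               /bkp/lib               none  bind  0  0
/usr/lib           /bkp/usr/lib           none  bind  0  0
/usr/share/perl5   /bkp/usr/share/perl5   none  bind  0  0</code></pre>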

<p>If you're using the SFTP server, you also need <code>/dev</code>. Additionally a copy of the <code>/etc/passwd</code> file is necessary for the UID mapping, but you only need to keep those users which should be able to log into the chroot. So for your two backup users <code>altair-root</code> and <code>altair-johndoe</code> the following minimal <code>/bkp/etc/passwd</code> file would be enough:</p>

<pre class="sourcecode"><code class="no-highlight">altair-root:x:1001:1001::/bkp/altair-root:/bin/sh
altair-johndoe:x:1002:1002::/bkp/altair-johndoe:/bin/sh</code></pre>

<p>(the UIDs and GIDs should of course correspond to the real UIDs and GIDs of the users)</p>
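
<p>Instead of typing these lines by hand, they could be filtered out of the real <code>/etc/passwd</code>. The following sketch assumes that all backup users share a common name prefix such as <code>altair-</code>; the function name <code>backup_passwd_entries</code> is made up for this example.</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh

# Hypothetical helper: print only those entries of the passwd file
# given as $2 whose user name starts with the prefix given as $1.
backup_passwd_entries() {
    awk -F: -v prefix="$1" 'index($1, prefix) == 1' "$2"
}</code></pre>

<p>The output of <code>backup_passwd_entries altair- /etc/passwd</code> could then be saved as <code>/bkp/etc/passwd</code>, which keeps the UIDs and GIDs in sync with the real accounts.</p>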

<p><strong>Side note:</strong> OpenSSH provides an <code>internal-sftp</code> subsystem that works without any bind mounts and additional <code>passwd</code> files. But unfortunately, it is not possible to trigger it from a shell script. So in order to use it you'd either have to remove the command restriction from the <code>authorized_keys</code> file or use a completely different user for SFTP logins.</p>

<h2 id="SecTheGitHubProjectPage">And finally: The GitHub project page</h2>
<p>Because <code>rsnapshot</code> is what I finally settled upon, I needed a very flexible yet reliable backup system that is easy to maintain. So I created a bunch of large shell scripts which do all the work I described above and a lot more in a pretty convenient way.</p>

<p>I uploaded everything to GitHub where you can download it: <a href="https://github.com/Manko10/rs-backup-suite">rs-backup-suite on GitHub</a></p>

<p>Feel free to test it, modify it and redistribute it if you like.</p>
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>Backup solutions, Userland</dc:subject>
    <dc:date>2013-04-03T15:04:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=66</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=66</wfw:commentRss>
    
    <dc:subject>backup</dc:subject>
<dc:subject>howto</dc:subject>
<dc:subject>nas</dc:subject>
<dc:subject>rdiff-backup</dc:subject>
<dc:subject>rsnapshot</dc:subject>
<dc:subject>server</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/65/guid/">
    <title>Performing push backups – Part 1: rdiff-backup</title>
    <link>http://www.refining-linux.org/archives/65/Performing-push-backups-Part-1-rdiff-backup/</link>
    <description>
    <![CDATA[
      <p>Backups are a vital part of every computer system, be it a corporate PC network or simply your local workstation. Unfortunately, they are often neglected, although everyone knows how important they are. The “I haven't had any bad incidents yet, but I know I really should… guess what… I'll do it next week” attitude is only too well known to everybody, including myself.</p>

<p>Performing backups is a tedious process if done wrong. Thus backups need to be done automatically in the background without any user intervention. As soon as someone feels the need to do something in order to get his stuff backed up, he will ultimately end up with no backup at all (and probably a bad conscience he forgets only too fast).</p>

<p>Not surprisingly, there are a bunch of tools you can use to create your backups, but they are seldom just perfect. Often they are simply what I called them: “tools”. You have to know how to use them in order to build a working backup infrastructure on them. There are some plug'n'play backup programs out there such as Apple's Time Machine, but if you're like me and want to have a little more control over your backups then you have to think a little more about it.</p>

<p>In this little two-part article series (<a href="/archives/66/Performing-push-backups-Part-2-rsnapshot/">Part 2</a>) I will present two tools I've been playing around with a lot and I'll show you how you can use them to set up your own personal NAS with a spare piece of hardware such as a <a href="http://www.raspberrypi.org/">Raspberry Pi</a>. No need for any expensive special storage system.</p>

<p>The two tools are, in order: <code><a href="http://rdiff-backup.nongnu.org/">rdiff-backup</a></code> and <code><a href="http://www.rsnapshot.org/">rsnapshot</a></code>. Both have their strengths and weaknesses. Let me give you a list of their pros and cons.</p>

<h2>The two candidates</h2>

<h3>rdiff-backup:</h3>
<p><strong>Pros:</strong></p>
<ul class="compare-pros">
    <li>Very small backup sizes since it only stores reverse deltas</li>
    <li>Has a snapshot file system with <a href="http://code.google.com/p/rdiff-backup-fs/">rdiff-backup-fs</a> making it totally transparent to the user</li>
    <li>Saves all file permissions/attributes separately by default</li>
</ul>
<p><strong>Cons:</strong></p>
<ul class="compare-cons">
    <li>Very CPU demanding and therefore pretty slow</li>
    <li>Without the rdiff-backup-fs FUSE only the last increment is accessible directly via the file system</li>
    <li>Aborted backups can't be resumed (<code>rdiff-backup</code> will revert to the last successful backup)</li>
    <li>Increments are a little fragile and easily corrupted</li>
    <li><code>rdiff-backup</code> depends on the very same <code>rdiff-backup</code> version installed on the remote host</li>
</ul>

<h3>rsnapshot:</h3>
<p><strong>Pros:</strong></p>
<ul class="compare-pros">
    <li>Stores snapshots in different directories and hardlinks them, so you don't need any special tools to restore older increments</li>
    <li>Very robust, aborted or failed backups can be resumed without further consequences</li>
    <li>A lot faster with less resource consumption due to simpler calculations</li>
    <li>Actually uses <code><a href="http://rsync.samba.org/">rsync</a></code> making it very flexible</li>
</ul>
<p><strong>Cons:</strong></p>
<ul class="compare-cons">
    <li>Files that only changed slightly are copied completely resulting in larger backup sizes</li>
    <li>More complex to set up (okay, I admit, it's just <code>rdiff-backup</code> being even simpler)</li>
</ul>

<p>You see: <code>rdiff-backup</code> has quite a few drawbacks. In addition to those mentioned above, I should also say that <code>rdiff-backup</code> hasn't experienced any visible development since 2009 whereas <code>rsnapshot</code> is still being worked on, which may or may not be an advantage (although the latest official release is from 2008).</p>

<p>To say it directly: I don't use <code>rdiff-backup</code> anymore. I had my experiments with it, but finally I ended up using <code>rsnapshot</code> for reasons I'll explain in the next part. But still I think there are valid use cases where <code>rdiff-backup</code> is just the right tool for you. For instance, if disk space is a very crucial constraint, you might prefer it over <code>rsnapshot</code>. One thing I also quite liked is that you don't have just files but real timestamped increments. <code>rdiff-backup</code> enables you to restore files based on their backup date, not only the file modification date (which might be wrong). <code>rsnapshot</code>, however, doesn't store this information. The only thing there which lets you guess the backup date is the folder name of the increment.</p>

<h2>Push vs. pull backups</h2>
<p>In computer networks whose nodes are always on you usually see pull backup systems where the backup servers actively “pull” the backups over from the machines which are to be backed up. This has one major advantage: simplicity, and with it reliability and maintainability. You only have one machine (or a couple of machines) implementing the logic for running backups. All the other machines are simply passive in that regard and can concentrate their resources on their real tasks. That also makes changing things such as the backup destination easier since you don't need to change the settings on each and every machine but only on the backup storage system. It also gives the backup system the opportunity to coordinate its own resources because it can decide for itself when it has enough free capacity to process yet another backup in parallel.</p>

<p>But pull backups also have their disadvantages, especially when the machines that should be backed up are not always on. Here it's very difficult for the backup system to get proper backups since some nodes might simply be down when it tries to run a backup. Therefore most consumer-targeted backup systems are “push” backup systems.</p>

<p>Another consideration in favor of push backups is privilege separation. For performing pull backups of non-publicly accessible files and folders the remote backup system needs to log into my machine as root. Although I can still restrict the commands it can run, I don't particularly like the idea of another machine logging into my PC as root in an automatic fashion. When doing push backups, however, I don't need to log in as root anywhere. I still need root privileges on my own machine, but on the remote side I can simply log in as an unprivileged user. Call me paranoid if you don't agree.</p>

<p>To put it in a nutshell: I will concentrate on push backups throughout this series for both <code>rdiff-backup</code> and <code>rsnapshot</code> (which is a little more fiddly in the beginning, but it can be done pretty well).</p>

<h2>Getting your hands dirty</h2>
<p>Let's start with the basics of how <code>rdiff-backup</code> works. Similar to <code>rsync</code>, it connects to a remote machine and starts a server process there (both can of course also perform local backups, but that's not within the scope of this article). The client then sends the file signatures to the server process, which compares them to its local files. If differences are found that can't be reconstructed from the data already available in the local file, they are transferred over the wire. This requires <code>rdiff-backup</code> to be installed on both systems because it needs to operate on the native file system without an additional wrapper such as SFTP in between (NFS would work, though). That being said, we can divide our backup model into two parts: the client part (the machine to be backed up) and the server part (the backup storage). Let's start with the server part since the client won't work without it.</p>

<p><em>From now on I will call the client <code>altair</code> and the server <code>bellatrix</code> (which are simply the hostnames I use in my local network for my main workstation and the NAS).</em></p>

<h3>The server part</h3>
<p>I'm using a Raspberry Pi as the NAS running Arch Linux for ARM on it, but you can also use any other piece of hardware that can run <code>rdiff-backup</code> and preferably a Linux system.</p>

<p>We could now follow two different approaches: we could either use one directory for each client to be backed up and let the local root push the whole system backup to that folder, or we could let each user back up his own files individually. Although the second approach might look more error-prone and less reliable, it's the way I decided to go because it enables me to keep the global system backup and the backup of my personal data separate. That lets me restore my own data without needing root access, and I can also run manual backups myself, e.g. when I have worked on something important and don't want to wait until the cron daemon starts the next global backup.</p>

<p>That said, we need a separate backup account on the server for each user on each client. Therefore I decided to use usernames consisting of the client hostname and the client username. For instance, the backup user for the user <code>janek</code> on the client host <code>altair</code> would be <code>altair-janek</code>. Similarly, the account for the global system backup of configuration files and shared resources would be <code>altair-root</code> (both being plain unprivileged accounts, of course).</p>
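
<p>The naming scheme can be captured in a tiny helper. This is only an illustrative sketch: the function name <code>backup_target</code> is made up, and the defaults <code>bellatrix</code> and <code>/bkp</code> are simply the host and path used throughout this article. It prints the remote destination in the <code>user@host::path</code> form that <code>rdiff-backup</code> expects:</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh
# Hypothetical helper: build the rdiff-backup destination for one client user.
# Arguments: client hostname, client username, [backup host], [backup root]
backup_target() {
        client_host=$1
        client_user=$2
        backup_host=${3:-bellatrix}   # assumption: the NAS used in this article
        backup_root=${4:-/bkp}        # assumption: home directories live in /bkp
        account="${client_host}-${client_user}"
        echo "${account}@${backup_host}::${backup_root}/${account}/files"
}</code></pre>

<p>Calling <code>backup_target altair janek</code> yields <code>altair-janek@bellatrix::/bkp/altair-janek/files</code>.</p>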

<p>The storage (i.e. the backup hard drive or maybe even a RAID and/or LVM system) is mounted at <code>/mnt/storage</code> and contains a folder <code>bkp</code> which is bind mounted to <code>/bkp</code> (<code>mount -o bind /mnt/storage/bkp /bkp</code>). This is where the home directories of the backup users will go.</p>
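
<p>To make the bind mount survive a reboot, an entry along these lines in <code>/etc/fstab</code> should do. Treat it as a sketch that assumes the paths from above and that <code>/mnt/storage</code> itself is mounted by an earlier entry:</p>

<pre class="sourcecode"><code class="no-highlight"># source dir        mount point   type   options   dump   pass
/mnt/storage/bkp    /bkp          none   bind      0      0</code></pre>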

<p>First of all let's create the backup user (we're working on <code>bellatrix</code>):</p>

<pre class="sourcecode"><code class="lang-zsh">useradd -b /bkp -m -p '*' -s /bin/sh altair-janek</code></pre>

<p>Next we need to add an SSH key to allow passwordless login to that account (login <em>with</em> a password has been disabled by setting the password hash to <code>*</code> in the previous step, but you might also want to disable it entirely in your <code>/etc/ssh/sshd_config</code>). After you have created an SSH key on your client, transfer the public part to the server and add it to <code>/bkp/altair-janek/.ssh/authorized_keys</code>.</p>

<p><strong>A word of warning:</strong> Don't use your normal SSH key! Create a new one that you only use for the backup, because it must not be protected by a passphrase (otherwise the backup couldn't run unattended). Then add a Host entry in <code>~/.ssh/config</code> to use the newly generated key for your backup host.</p>
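
<p>A minimal sketch of what that could look like, assuming the key was generated without a passphrase via <code>ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa_backup</code> (the file name <code>id_rsa_backup</code> is just an example, pick whatever you like). The Host entry in <code>~/.ssh/config</code> on the client would then be:</p>

<pre class="sourcecode"><code class="no-highlight"># Use the dedicated backup key (example file name) for the backup host only
Host bellatrix
    User altair-janek
    IdentityFile ~/.ssh/id_rsa_backup
    IdentitiesOnly yes</code></pre>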

<p>This already enables our client to log in, but we don't want to give it full shell access, so we need to restrict the commands it can run. It should be able to log in via SSH, but then only push its backup using <code>rdiff-backup</code>. To make this work, modify the public key entry in <code>/bkp/altair-janek/.ssh/authorized_keys</code> as follows:</p>

<pre class="sourcecode"><code class="lang-zsh">command="rdiff-backup --server --restrict '/bkp/altair-janek/files'" ssh-rsa AAAAB3NzaC1yc...</code></pre>

<p>This will allow the client to start only the <code>rdiff-backup</code> server process, which in turn is restricted to pushing files to and pulling them from <code>/bkp/altair-janek/files</code> (that folder needs to exist, of course). Additionally you might want to chroot the user into his home directory or at least into <code>/bkp</code>.</p>

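<p>The last thing we need to do is to protect the SSH configuration against changes by the backup user by assigning ownership to root (the directory and the <code>authorized_keys</code> file must stay readable for the backup user, though, because sshd reads the file with that user's privileges):</p>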

<pre class="sourcecode"><code class="lang-zsh">chown -R root:root /bkp/altair-janek/.ssh</code></pre>

<p>Do all this for each client user you want to enable backups for.</p>

<h3>The client part</h3>
<p>Now that we have set up our server, we need to perform the actual backup, which is quite simple. The client doesn't need to do anything other than run the proper <code>rdiff-backup</code> command. But we want to do it in a slightly more sophisticated way. We need a script that can be run either as root (in which case it performs a backup of the whole system, including the home directories) or as an unprivileged user, in which case only that user's own home directory is backed up. Additionally, when run as root, the backup of a home directory should be performed under the account of that user (otherwise we would store the backup twice on the server, and in a location not accessible to him).</p>

<p>To specify which files should be included in a backup I'll use files containing file matching patterns. All files and directories that are not specified inside those files will not be backed up. For the global system backup I use the file <code>/etc/default/rdiff-backup-filelist</code> which contains something like this:</p>

<pre class="sourcecode"><code class="no-highlight">/etc
/usr/etc
/usr/local
- /srv/http/virtual/**
/srv/http
/root</code></pre>

<p>This will include all files in <code>/etc</code>, <code>/usr/etc</code>, <code>/usr/local</code>, <code>/srv/http</code> and <code>/root</code>, but explicitly excludes all files below <code>/srv/http/virtual</code>. For a more detailed description of the format of this file have a look at the <a href="http://rdiff-backup.nongnu.org/examples.html#exclude">rdiff-backup examples page</a>.</p>

<p>Additionally each user has a <code>.rdiff-backup-filelist</code> file inside his home directory which specifies the files which should be backed up under this user. That file could look like this, for instance:</p>

<pre class="sourcecode"><code class="no-highlight">- **.tmp
- **.swp
- **/.directory
- /home/janek/.cache
/home/janek
/srv/http/virtual/janek</code></pre>

<p>Both files will be read from top to bottom and are “short-circuited”, i.e. the first match that is encountered will be used. Therefore <code>/home/janek</code> will be backed up, but without <code>/home/janek/.cache</code>.</p>

<p>Now we need to actually use these files. I wrote three shell functions for this:</p>

<pre class="sourcecode"><code class="lang-zsh">BACKUP_HOST="bellatrix"
BACKUP_ROOT="/bkp"

# Back up selected system files
backup_system() {
        cd /root
        rdiff-backup \
                --exclude-other-filesystems \
                --include-symbolic-links \
                --exclude-special-files \
                --create-full-path \
                --ssh-no-compression \
                --include-globbing-filelist /etc/default/rdiff-backup-filelist \
                --exclude '**' \
                / \
                "$(hostname)-root@${BACKUP_HOST}::${BACKUP_ROOT}/$(hostname)-root/files"
}

# Back up single home directory
# Expects the home directory as parameter
backup_single_home_dir() {
        local home_dir=$1
        local passwd_entry
        local username
        local backup_cmd
        
        # Don't create a backup if home directory doesn't belong to a "real" user
        passwd_entry=$(grep ":${home_dir}:[^:]*$" /etc/passwd)
        if [ -z "$passwd_entry" ]; then
                return
        fi
        
        username=$(echo "${passwd_entry}" | cut -d ':' -f 1)
        
        # Don't back up home directory either, if no files are marked for backup
        if [ ! -e "${home_dir}/.rdiff-backup-filelist" ]; then
                return
        fi
        
        # Also don't create a backup if no SSH key exists
        if [ ! -e "${home_dir}/.ssh/id_rsa" ] &amp;&amp; [ ! -e "${home_dir}/.ssh/config" ]; then
                return
        fi
        
        cd "${home_dir}"
        
        backup_cmd="rdiff-backup \
                --create-full-path \
                --ssh-no-compression \
                --include-globbing-filelist \"${home_dir}/.rdiff-backup-filelist\" \
                --exclude '**' \
                / \
                \"$(hostname)-${username}@${BACKUP_HOST}::${BACKUP_ROOT}/$(hostname)-${username}/files\""

        # Lower privileges if running as root
        if [ $(id -u) -eq 0 ]; then
                su - "${username}" -c "${backup_cmd}"
        elif [ "$(id -u "${username}")" = "$(id -u)" ]; then
                sh -c "${backup_cmd}"
        fi
}

# Back up all home dirs
backup_home_dirs() {
        local home_dir
        
        for home_dir in /home/*; do
                backup_single_home_dir "${home_dir}"
        done
}</code></pre>

<p>All we still need to do is to trigger the various backup functions depending on whether the user is root or not:</p>

<pre class="sourcecode"><code class="lang-zsh">if [ $(id -u) -eq 0 ]; then
        backup_system
        backup_home_dirs
else
        if [ "${HOME}" != "" ]; then
                backup_single_home_dir "${HOME}"
        fi
fi</code></pre>

<p>Now the script backs up all files and folders that are specified in <code>~/.rdiff-backup-filelist</code> to the corresponding account on the backup host if run as a normal user. If the script is invoked as root, it will first back up all specified system files and then run backups for the home directories under the appropriate user accounts if a <code>~/.rdiff-backup-filelist</code> file exists.</p>

<p>I have created a tar archive with a little more polished version of the script above (download link at the end of the article). Feel free to use and modify it. The archive also contains a <code>server</code> directory containing scripts to automate the creation of backup users. It also contains a script <code>rb-remove-old-increments</code> which you can run as a cron job on the server from time to time to clean up old increments. Basically it does nothing more than running</p>

<pre class="sourcecode"><code class="lang-zsh">rdiff-backup --force --remove-older-than &lt;time&gt; &lt;folder&gt;</code></pre>

<p>for each home directory. The server scripts also have a global configuration file to avoid hardcoding of path names etc. The config file is located in <code>/usr/local/etc/default/rb-server-config</code>. A sample file is included in the tarball.</p>
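
<p>Such a loop could look roughly like the following. Note that this is <em>not</em> the script from the tarball, just a sketch of the idea: the function name, the <code>RDIFF_BACKUP</code> override variable and the example age <code>4W</code> (four weeks) are assumptions made for illustration.</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh
# Hypothetical sketch: remove old increments from every backup repository.
# RDIFF_BACKUP can be overridden, e.g. with "echo" for a dry run.
RDIFF_BACKUP=${RDIFF_BACKUP:-rdiff-backup}

# Arguments: backup root (e.g. /bkp), maximum age (e.g. 4W)
remove_old_increments() {
        backup_root=$1
        max_age=$2
        for repo in "$backup_root"/*/files; do
                # Skip accounts that haven't received a backup yet
                [ -d "$repo/rdiff-backup-data" ] || continue
                "$RDIFF_BACKUP" --force --remove-older-than "$max_age" "$repo"
        done
}</code></pre>

<p>A cron job on the server would then call something like <code>remove_old_increments /bkp 4W</code>. Setting <code>RDIFF_BACKUP=echo</code> beforehand only prints the commands instead of running them.</p>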

<h2>Last adjustments: spinning down the hard drive</h2>

<p>The backup system is basically set up, but there is one thing left you might want to do. Especially when you're backing up your local system (which I assume you are), you'll probably have the hard drive somewhere near you or in another room you use often. Unfortunately, hard drives make noise and consume power, yet the drive is only needed a few times a day. So it doesn't need to be spinning all the time.</p>

<p>Some hard drives have an automatic spin-down timeout configured inside the firmware. If not, you can configure one with <code>hdparm -S timeout /dev/device</code>, but oftentimes that doesn't work. But you can use another tool: <code>sdparm</code> (probably not installed by default). You can use this little script and run it periodically via cron to spin down your hard drive after a given amount of time:</p>

<pre class="sourcecode"><code class="lang-zsh">#!/bin/sh
# Check if disk has been used since last check and spin it down if not

if [ -z "${1}" ]; then
        echo "Usage: $(basename "${0}") &lt;device name&gt;"
        exit 1
fi

last_state_file="/tmp/storage-state-${1}"

touch "$last_state_file"
chmod 600 "$last_state_file"

# Match the device name as a whole field so that "sda" doesn't also match "sda1"
new_storage_state=$(grep " ${1} " /proc/diskstats)
old_storage_state=$(cat "$last_state_file")

if [ "$new_storage_state" = "$old_storage_state" ]; then
        sync
        # Redirect stdout first, then stderr, so that both are silenced
        sdparm --flexible --readonly --command=stop "/dev/${1}" > /dev/null 2>&amp;1
fi

echo "$new_storage_state" > "$last_state_file"</code></pre>

<p>Use it like this: <code>spin-down-storage sda</code>. It will spin down the hard drive if it has been idle since you last ran the script.</p>

<p>Another option (which I tend to prefer) is to use the <code><a href="http://hd-idle.sourceforge.net/">hd-idle</a></code> tool, which does basically the same, only more elegantly (a short note to my fellow Arch users: I had to modify the systemd unit file as described in the comments on the <a href="https://aur.archlinux.org/packages/hd-idle/">AUR package page</a> to get it to work). One thing I noticed though: after switching the hard drive off and on again by hand, you sometimes seem to have to restart <code>hd-idle</code>.</p>

<h2>Downloads:</h2>
<ul>
    <li><a href="/assets/shell-scripts/rdiff-backup-scripts.tar.bz2">rdiff-backup-scripts</a></li>
</ul>
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    Backup solutions, Userland, </dc:subject>
    <dc:date>2013-03-26T15:14:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=65</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=65</wfw:commentRss>
    
    <dc:subject>backup</dc:subject>
<dc:subject>nas</dc:subject>
<dc:subject>rdiff-backup</dc:subject>
<dc:subject>rsnapshot</dc:subject>
<dc:subject>server</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/64/guid/">
    <title>Programmatically limit CPU usage of certain processes</title>
    <link>http://www.refining-linux.org/archives/64/Programmatically-limit-CPU-usage-of-certain-processes/</link>
    <description>
    <![CDATA[
      <p>As some of my readers might know, I'm a committed KDE user. I love the freedom this desktop environment provides me with and its sheer versatility. Other people might think differently about that matter, but that's my opinion.

<p>Even more of you might know that KDE has a semantic desktop implementation called <a href="http://nepomuk.kde.org/">Nepomuk</a> and some might agree that, although it's a great thing in general, it has very often caused a lot of issues in the past. Especially the whole file indexing engine was very unstable and even unusable for many people, including me.

<p>Now with KDE 4.10 the file indexer has undergone some major changes which made it pretty usable so I decided to switch it on again. It turned out that the first stage indexing works exceptionally well. It indexed about 60,000 files in my home directory in the blink of an eye.

<p>Unfortunately, I had to realize that the second level indexing does not work so well. I remember Virtuoso often eating up all my CPU in the past. Now Virtuoso keeps quiet, but <code>nepomukindexer</code> lets my workstation take off. It only starts indexing when my PC is idle, but for bigger files it keeps the CPU busy at 100%, which is a pretty bad thing. There is already a <a href="https://bugs.kde.org/show_bug.cgi?id=316075">bug report</a> about <code>nepomukindexer</code> consuming too much CPU time on larger files, but I didn't want to wait for a fix.

<p>Long story short: I thought of ways to automatically limit the CPU usage of certain processes (not necessarily only Nepomuk).

<p>To accomplish this there is actually a pretty neat tool out there called <a href="https://github.com/opsengine/cpulimit">cpulimit</a>. It does exactly what I need. I give it a PID and a percentage value and it limits the CPU time available for that process. Unfortunately, it only works for processes that are currently running on the system and Nepomuk spawns a new process every time it starts the indexing process.

<p>Therefore I needed a script that runs permanently in the background and automatically limits the CPU usage if necessary. After googling for some time I found a few, but they were more or less overkill for what I needed. So I decided to write my own. It's a dirty hack, but it works.

<pre class="sourcecode"><code class="lang-zsh">#!/bin/zsh

MAXIMUM_ALLOWED_CPU=10
LIMIT_THRESHOLD=80
INTERVAL=10
PROCESSES_TO_WATCH=(nepomukindexer)

while sleep $INTERVAL; do
    for process_name in $PROCESSES_TO_WATCH; do
        pid_list=($(pgrep $process_name))
        
        # Don't do anything if no running process found
        [ ${#pid_list[@]} -le 0 ] && continue
        
        for pid in $pid_list; do
            process_info=$(top -b -n 1 -p $pid | tail -n 1)
            
            # Prevent race condition
            echo $process_info | grep -q "^\s*${pid} " || continue
            
            typeset -i cpu_usage=$(echo $process_info | sed 's/ \+/ /g' | sed 's/^ \+//' | cut -d ' ' -f 9)
            
            if [ $cpu_usage -gt $LIMIT_THRESHOLD ]; then
                echo "Limiting CPU usage of process $pid ($process_name)..."
                cpulimit -p $pid -l $MAXIMUM_ALLOWED_CPU &amp;
            fi
        done
    done
done</code></pre>

<p>I thought I'd share it here; maybe some of you will find it useful too. The variable <code>$PROCESSES_TO_WATCH</code> is an array of all program names this script should watch.

<p>Once started, the script checks every <code>$INTERVAL</code> seconds whether any of the watched processes is using more than <code>$LIMIT_THRESHOLD</code> percent of the CPU and, if so, limits it to <code>$MAXIMUM_ALLOWED_CPU</code> percent.

<p>I start this script automatically when I log into KDE to bring Nepomuk to terms when it starts turning my PC into a central heating system again.

<h2>Update Mar 5th, 7:10 P.M. UTC+1</h2>
<p>I was asked why I don't use <a href="https://www.kernel.org/doc/Documentation/cgroups/">Control Groups</a>, which are in fact the proper way of limiting resources for specific tasks on Linux. Well, the elegant answer would be that cgroups only work on Linux, but not on other *nix systems like, e.g., BSD. But okay, that doesn't really count since this is a Linux blog.

<p>The real answer is that cpulimit is just dead-simple and every user can run the above script. Cgroups, however, require root privileges and probably some deeper understanding of how Linux processes work in general. The cpulimit script can easily be started automatically with your desktop session and all the stuff that comes with it is strictly local. The price is a larger overhead, but as long as you don't use it for too many processes and with too short sleep intervals, you shouldn't notice it. If you need a general solution for coordinating resource usage, though, you should of course use cgroups instead.

<h2>Update Mar 7th, 5:00 P.M. UTC+1</h2>
<p>I updated the script slightly to prevent race conditions which lead to script termination.
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    KDE, Tools, Userland, </dc:subject>
    <dc:date>2013-03-05T14:27:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=64</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=64</wfw:commentRss>
    
    <dc:subject>cpulimit</dc:subject>
<dc:subject>kde</dc:subject>
<dc:subject>process</dc:subject>
<dc:subject>shell</dc:subject>
<dc:subject>tool</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/63/guid/">
    <title>Goodbye Feedburner - please check your subscription</title>
    <link>http://www.refining-linux.org/archives/63/Goodbye-Feedburner-please-check-your-subscription/</link>
    <description>
    <![CDATA[
      <p>Google is <a href="https://developers.google.com/feedburner/">shutting down their Feedburner API</a> and I am shutting down Feedburner and will continue with my own self-hosted feed from now on.</p>

<p>Those of you who have been using the RSS feed don't have to do anything, but as an Atom feed subscriber this means you should check your subscription (most people use the Atom feed). Please make sure the feed URL you have saved to your feed reader is <a href="http://www.refining-linux.org/feeds/atom10.xml/">http://www.refining-linux.org/feeds/atom10.xml/</a> (you may also use HTTPS). Until now that URL was redirecting to <em>http://feeds.feedburner.com/RefiningLinux</em> which has now been deprecated.

<p>The old Feedburner URL will redirect to my Atom feed for the next 30 days and then deliver an error 404, so please make sure you update your subscription within that time.

<p>Thank you and please accept my apologies for the trouble.
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    </dc:subject>
    <dc:date>2012-09-23T14:34:13Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=63</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=63</wfw:commentRss>
    
    
</item>
<item rdf:about="http://www.refining-linux.org/archives/62/guid/">
    <title>Hacking embedded systems</title>
    <link>http://www.refining-linux.org/archives/62/Hacking-embedded-systems/</link>
    <description>
    <![CDATA[
      <p>Linux is everywhere, not just on desktops. It's on phones, ebook readers, public terminals, routers, electricity meters and many more devices. The key to Linux's success is its diversity. It is possible to run Linux on nearly every technical device that has a CPU. Many of these are closed systems, so often you don't even notice that Linux is running on that particular device, but there is always a way to gain access to its internals and modify it the way you want. The problem is that heavy modifications might void your warranty or make updates to a more recent firmware version impossible. In this article I want to show you a simple but powerful way to modify such systems in a non-destructive way.

<p>The device I'll be using for reference throughout this article is my wireless network router (a FRITZ!Box 7390), but the principles apply to nearly any embedded Linux system. I assume you already have gained root shell access to your device. If you haven't, search for ways to get a shell before you continue. My FRITZ!Box has a built-in telnet server which I can use for that, but your system may be different.

<h2>Analysis: what have we here?</h2>
<p>The first thing we need to do is to find out some information about the system architecture, the Linux version and the file system structure. Every system is different in this respect, so an analysis is mandatory before continuing with any modification.

<h3>Find out the kernel version and architecture</h3>
<p>In order to be able to compile programs for our embedded device we need to find out the processor architecture. Only very few embedded devices use a normal x86 or AMD64 architecture. Instead most systems rely on more power-efficient but less powerful architectures such as ARM or MIPS. Finding out what architecture we have is easy. Just run the command

<pre class="sourcecode"><code class="language-bash">uname -nsrm</code></pre>

<p>This prints the kernel name, the hostname, the kernel release and the machine architecture. In my case the kernel version is 2.6.28.10 and the architecture MIPS.

<h3>Determine system endianness</h3>
<p>The next thing we need to find out is the byte order of the system, i.e. the system endianness. A system is either big or little endian, which means that the most significant byte of a multi-byte value is stored either first or last. I always found endianness a little hard to understand, but it is actually quite simple because there is a close analogy to human language. When we write and read numbers in the English language, we use a big endian format, i.e. the most significant digit goes first. That means the number 356 is spoken "three hundred and fifty six", but if English were little endian, it would instead be "six hundred and fifty three" because the <em>least</em> significant digit comes first. You see: knowing the endianness is very important. Binaries compiled for the wrong endianness won't run and typically just produce weird error messages about unexpected symbols or characters.
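
<p>On a real machine the unit is the byte rather than the decimal digit. The following sketch (the function name <code>byte_order_demo</code> is made up for this illustration) splits a 16-bit number into its two bytes and prints them in both orders as hexadecimal values:

<pre class="sourcecode"><code class="language-bash">#!/bin/sh
# Illustration only: show how a 16-bit number is laid out in memory
byte_order_demo() {
    n=$1
    high=$(( (n / 256) % 256 ))   # most significant byte
    low=$(( n % 256 ))            # least significant byte
    printf 'big endian: %02x %02x\n' "$high" "$low"
    printf 'little endian: %02x %02x\n' "$low" "$high"
}</code></pre>

<p>For 356 (hexadecimal 0x0164), <code>byte_order_demo 356</code> prints <code>01 64</code> for big endian and <code>64 01</code> for little endian.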

<p>But how do we test which endianness we have? Well, it's not that simple. You won't find the information somewhere in a /proc file and also not (necessarily) in the <code>uname</code> output (however, if you see something like "mips<em>el</em>" in the OS name, you have a little endian MIPS architecture). Therefore we must find another way. A clever test I found on <a href="http://serverfault.com/questions/163487/linux-how-to-tell-if-system-is-big-endian-or-little-endian">serverfault.com</a> is this:

<pre class="sourcecode"><code class="language-bash">echo -n I | od -to2 | head -n1 | cut -f2 -d" " | cut -c6</code></pre>

<p>It feeds a single byte (the letter I, 0x49) to <code>od</code>, which pads it to a two-byte word and prints that word as an octal number. The last digit reveals which byte came first: 0 means big endian, 1 means little endian. When you run it on your desktop computer you will most probably get a 1, since the Intel x86 architecture as well as the AMD64 architecture are both little endian. But many embedded systems are big endian. Network byte order (as used in TCP/IP headers) is big endian as well, so that might be one reason for manufacturers of embedded systems to choose big endian. The problem with the snippet above is that it might not run on your system, because often there is only a very simple Busybox environment with a very simple Ash shell which lacks the commands <code>od</code> or <code>head</code>. Another way would be to compile a little C program which does some int/char conversion to find out the endianness (you find many code examples on the Internet), but because you would most probably need to cross compile it anyway (and therefore already know the endianness) I would suggest a third method: just try. Follow the next section assuming you have a big endian system. If that doesn't work, it should be little endian.
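
<p>If you would rather not guess, one more check that needs neither <code>od</code> nor a compiler is to look at a binary that already runs on the device. Every ELF file records its byte order in the sixth byte of its header (the <code>EI_DATA</code> field: 1 for little endian, 2 for big endian). The sketch below assumes your Busybox build provides <code>dd</code> and <code>tr</code>, which is common but not guaranteed. The function name is made up and <code>/bin/busybox</code> is only an example path.

<pre class="sourcecode"><code class="language-bash">#!/bin/sh
# Print the byte order recorded in the header of an ELF binary.
# Argument: path to any ELF executable built for the device
elf_endianness() {
    # Read the byte at offset 5 (EI_DATA) and map 0x01/0x02 to the letters L/B
    flag=$(dd if="$1" bs=1 skip=5 count=1 2>/dev/null | tr '\001\002' 'LB')
    case "$flag" in
        L) echo "little endian" ;;
        B) echo "big endian" ;;
        *) echo "unknown" ;;
    esac
}</code></pre>

<p>Run it as <code>elf_endianness /bin/busybox</code> on the device. Since the binary was built for that device, its header tells you the byte order you have to select for cross compiling.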

<h2>Building the binaries</h2>
<p>Now comes the fun part: compiling the stuff you want to have on your embedded device. Normally you would need to build all the binaries on the target machine, but those are mostly too weak and hardly ever have a working build environment. So we need to cross compile our stuff. But because building a working cross compiling environment is a tough job, there are some ready-to-use tools for it. One of them is BuildRoot. BuildRoot is a complete cross compiling environment with lots of frequently used applications already included. In most cases you only need to select which package you want to build and then run <code>make</code>. To use BuildRoot, first <a href="http://buildroot.uclibc.org/download.html">download it</a> and then extract it somewhere on the hard drive of your working machine (not the embedded system).

<p>Next find out the uClibc version of your embedded system (uClibc is a lightweight replacement for glibc) by running

<pre class="sourcecode"><code class="language-bash">ls /lib/libuClibc-*</code></pre>

<p>on that machine. Mine is <em>0.9.32</em> (in case the system does not use uClibc, you need to <a href="http://buildroot.uclibc.org/buildroot.html#external_toolchain">use an external toolchain</a> in order to successfully build binaries for your target platform).

<p>Once you have extracted everything, grab a terminal and navigate to that directory. Then run <code>make menuconfig</code>. In <code>Target Architecture</code> set the architecture. For MIPS with big endian select <code>mips</code>, for MIPS with little endian select <code>mipsel</code> (arm, i386, x86_64, powerpc etc. should be self-explanatory). If you have a special architecture variant, set it in <code>Target Architecture Variant</code>. For my system <code>mips 32r2</code> is just fine.

<p>The next thing you need to do is to set the toolchain. In my case I set <code>Toolchain ---&gt; uClibc C library</code> version to <code>uClibc 0.9.32.x</code>. If your uClibc version differs, change the value there. In case you use an external toolchain, you have to set the correct version of your particular C library. Also set the kernel headers to a version appropriate to your target device's kernel version (it doesn't have to be exact, though; often the newest kernel headers also work just fine for older kernels).

<p>If you want to tweak the settings a bit further, you can do that. Otherwise just select the packages you want to build under <code>Package Selection for the target</code> and then exit and save. Next thing to do is to run <code>make</code> and grab some coffee.

<p><em>Important!</em> If you want to speed up the build process by using multiple make jobs, you cannot do that by using the <code>-j</code> parameter. Instead you have to change that setting in <code>Build options ---&gt; Number of jobs to run simultaneously</code>.

<p>Once it's all built, you find the binaries in <code>output/target</code>.

<h2>Setting up the overlay</h2>
<p>Now that we have built the binaries we need to get them onto our embedded system. But that is not as easy as it seems. You could of course just put them on some connected hard drive or some internal writeable memory and start them from there, but often you need to integrate things into the system's root file system and that's what we're going to do now.

<p>The method we're using is inspired by a great blog post about <a href="http://www.64k-tec.de/2011/07/fritzbox-tuning-part-4-cross-building-and-installing-additional-applications/">installing additional software on a FRITZ!Box</a>. I developed the method a little further to automate it and made a huge script out of it which you can download at the end of this article.

<p>These are the things we need to do:

<ol>
<li>Order the binaries we want to integrate in a file system structure similar to the root file system.</li>
<li>Bind a second instance of the root file system to a writeable directory.</li>
<li>Create a second directory on the same writeable partition and mount a tmpfs to it.</li>
<li>Recursively symlink chosen directories from the alternate rootfs we created in step 2 and our files from step 1 to that tmpfs. We have to be careful with files that already exist in the alternate rootfs and need to choose which version we want to use: ours or the original one.</li>
<li>Bind the directories from the symlink overlay to the respective directories in <code>/</code>.</li>
<li>Enjoy the modified system.</li>
</ol>

<h3>Step 1: Create a directory structure similar to the rootfs</h3>
<p>This step is more or less optional, but it helps automate the whole process. The idea is to put all the binaries (and other files) we want to install into directories which mirror the structure of the root file system. For instance, to integrate Dropbear SSH into <code>/usr/sbin</code>, we put the executable in a directory <code>usr/sbin/</code> somewhere on a drive readable by the embedded system. So in the end we have the same structure as the rootfs, but only with the files we want to integrate.
<p>For my FRITZ!Box I put the files in a directory on the internal dedicated data partition that is accessible via FTP. This partition is mounted to <code>/var/media/ftp</code>. The directory containing my file system structure I called <code>/var/media/ftp/fboxmod/rootfs</code>.
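<p>If you prefer to script this step, a small helper can do the copying. This is only a sketch: the function name <code>stage_file</code> is made up for illustration, and the Dropbear source path in the usage comment is an assumption about where your build put the binary.

```shell
# Hypothetical helper: copy a file into a directory tree that mirrors the rootfs.
# Usage: stage_file <source file> <absolute target path> <mod root>
stage_file() {
    local src="$1"
    local target="$2"
    local mod_root="$3"

    # Strip the leading slash so the target nests below the mod root
    local rel="${target#/}"

    # Create the mirrored parent directory, then copy and keep permissions
    mkdir -p "${mod_root}/$(dirname "${rel}")" || return 1
    cp -p "${src}" "${mod_root}/${rel}"
}

# Example (paths are assumptions, adjust them to your setup):
# stage_file output/target/usr/sbin/dropbear /usr/sbin/dropbear /var/media/ftp/fboxmod/rootfs
```

<p>Taking the target as an absolute path keeps the call readable: you write down where the file should appear on the device and the helper derives the mirrored location.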

<h3>Step 2: Bind a second instance of the rootfs to some other location</h3>
<p>This step is easy. We only need to create a directory somewhere on a readable and writeable file system and bind <code>/</code> to there. On my FRITZ!Box I'm using <code>/var/_altrootfs</code> as the location for this alternate rootfs where <code>/var</code> is a temporary writeable file system (tmpfs). I suggest you also prefer a temporary file system over a persistent one since it makes no actual modification to the system. Alternatively you can also use an external hard drive that is connected to your device.

<p>To bind the rootfs to the new location run:

<pre class="sourcecode"><code class="language-bash">mkdir /var/_altrootfs
mount -o bind / /var/_altrootfs</code></pre>

<p>(of course you must change the paths as needed)
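<p>When you run these commands repeatedly (for instance from a boot script), it can help to check first whether the bind mount already exists. The following is a minimal sketch, assuming your system provides <code>/proc/mounts</code>; the function name <code>is_mount_point</code> is hypothetical and the second parameter only exists so the function can be tried on a sample file.

```shell
# Hypothetical helper: succeeds if the given directory is listed as a mount point.
# The mounts table defaults to /proc/mounts.
is_mount_point() {
    local dir="$1"
    local table="${2:-/proc/mounts}"
    local _dev mnt _rest

    # Field 2 of each line in the mounts table is the mount point
    while read -r _dev mnt _rest; do
        [ "${mnt}" = "${dir}" ] && return 0
    done < "${table}"
    return 1
}

# Usage (needs root, so it is only shown as a comment here):
# if ! is_mount_point /var/_altrootfs; then
#     mkdir -p /var/_altrootfs
#     mount -o bind / /var/_altrootfs
# fi
```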

<h3>Step 3: Create the tmpfs for the overlay</h3>
<p>For the actual overlay we create a second directory on the same file system we used for binding the alternate rootfs. I called mine <code>/var/_fboxmod-overlay</code>:

<pre class="sourcecode"><code class="language-bash">mkdir /var/_fboxmod-overlay
mount -t tmpfs tmpfs /var/_fboxmod-overlay</code></pre>

<h3>Step 4: Recursively symlink rootfs and custom files to the tmpfs</h3>
<p>This is probably the most complex step as we need to symlink each single file to our new overlay directory. What we basically do is to walk through all files recursively and check for each one whether it is an actual file or a directory. If it is a directory we create another one with the same name in our overlay directory. If it is an actual file we create a symlink instead. We need to do this for the directories we want to modify in our alternate rootfs and our custom files we want to integrate.

<p>To make this process a little easier I have written a little shell function for it:

<pre class="sourcecode"><code class="language-bash">symlink_dir_recursively() {
    local src_dir="${1}"
    local dest_dir="${2}"
    local dir_contents=$(ls -A "${src_dir}")
    
    mkdir -p "${dest_dir}"
    
    for i in ${dir_contents}; do
        # Check if file exists
        if [ -e "${dest_dir}/${i}" ] &amp;&amp; [ ! -d "${dest_dir}/${i}" ]; then
            if ! $FORCE_OVERWRITE; then
                continue
            fi
        fi
        
        if [ -h "${src_dir}/${i}" ]; then
            cp -d "${src_dir}/${i}" "${dest_dir}/${i}"
        elif [ -d "${src_dir}/${i}" ]; then
            symlink_dir_recursively "${src_dir}/${i}" "${dest_dir}/${i}"
        elif [ -f "${src_dir}/${i}" ]; then
            ln -sf "${src_dir}/${i}" "${dest_dir}/${i}"
        fi
    done
}</code></pre>

<p>With the variable <code>$FORCE_OVERWRITE</code> you can define whether you want existing files to be overwritten or not. Run this function for the alternate rootfs and for all our custom files:

<pre class="sourcecode"><code class="language-bash"># The directories for which we want to create an overlay
dirs_to_overlay="bin etc lib sbin usr/bin usr/sbin usr/lib usr/share var/tmp/root"

# Symlink chosen dirs from alternate root file system to overlay
for i in ${dirs_to_overlay}; do
    if [ -e "/var/_altrootfs/${i}" ]; then
        symlink_dir_recursively "/var/_altrootfs/${i}" "/var/_fboxmod-overlay/${i}"
    fi
done

# Symlink mod file system to overlay
overlay_dirs=$(ls -A "/var/media/ftp/fboxmod/rootfs")
for i in $overlay_dirs; do
    symlink_dir_recursively "/var/media/ftp/fboxmod/rootfs/${i}" "/var/_fboxmod-overlay/${i}"
done</code></pre>
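<p>To get a feeling for which version of a file wins, you can replay the two passes in a scratch directory before touching the real system. The sketch below uses a glob-based variant of the linking function (the name <code>link_tree</code> is hypothetical) so that file names with spaces survive. Unlike the function above, it applies the overwrite check to regular files only.

```shell
# Hypothetical glob-based variant of the recursive linking function
link_tree() {
    local src_dir="$1"
    local dest_dir="$2"
    local path name

    mkdir -p "${dest_dir}"

    for path in "${src_dir}"/* "${src_dir}"/.[!.]* "${src_dir}"/..?*; do
        # Unmatched glob patterns stay literal, so skip anything that does not exist
        [ -e "${path}" ] || [ -h "${path}" ] || continue
        name="${path##*/}"

        if [ -h "${path}" ]; then
            # Copy symlinks as symlinks
            cp -d "${path}" "${dest_dir}/${name}"
        elif [ -d "${path}" ]; then
            link_tree "${path}" "${dest_dir}/${name}"
        elif [ -f "${path}" ]; then
            # Keep an existing entry unless overwriting was requested
            if [ -e "${dest_dir}/${name}" ] && [ "${FORCE_OVERWRITE:-false}" != "true" ]; then
                continue
            fi
            ln -sf "${path}" "${dest_dir}/${name}"
        fi
    done
}

# Scratch setup: one file exists in both trees, one only in our mod tree
demo=$(mktemp -d)
mkdir -p "${demo}/altrootfs/usr/sbin" "${demo}/mod/usr/sbin"
echo original > "${demo}/altrootfs/usr/sbin/tool"
echo custom > "${demo}/mod/usr/sbin/tool"
echo new > "${demo}/mod/usr/sbin/my daemon"

# First the original files, then ours on top
FORCE_OVERWRITE=true
link_tree "${demo}/altrootfs/usr" "${demo}/overlay/usr"
link_tree "${demo}/mod/usr" "${demo}/overlay/usr"

cat "${demo}/overlay/usr/sbin/tool"    # prints "custom"
```

<p>With <code>FORCE_OVERWRITE=false</code> the second pass leaves the existing link alone and the same <code>cat</code> shows the original file instead.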

<p>The reason why we don't just symlink <code>/</code> and <code>/var/media/ftp/fboxmod/rootfs</code> as a whole is that we can't bind <code>/var/_fboxmod-overlay</code> to <code>/</code> in the next step without breaking the system (we would then need to pull the plug since the shell would be completely dead, saying "Too many levels of symbolic links"). Therefore we don't need to symlink <em>everything</em> either. Better to focus on the directories we really need to modify and leave the rest alone. You can even shrink the <code>$dirs_to_overlay</code> list above to only the directories you really want to overlay.

<h3>Step 5: Bind the overlay directories to the respective directories in /</h3>
<p>The last thing we need to do is to apply the overlay. We now have a symlink directory structure in <code>/var/_fboxmod-overlay</code> consisting of the original files from <code>/var/_altrootfs</code> and our own files from <code>/var/media/ftp/fboxmod/rootfs</code>. Let's apply those to the rootfs.

<p>In order to do this we are again using the <code>bind</code> option of the <code>mount</code> command. We simply do a <code>mount -o bind</code> for each top-level directory in our overlay to the respective directory in <code>/</code>. Again: <strong>do NOT just bind <code>/var/_fboxmod-overlay</code> to <code>/</code>!</strong> It will kill the whole system.

<pre class="sourcecode"><code class="language-bash"># Do this for all directories we defined before
for i in ${dirs_to_overlay}; do
	# Skip non-existing mount points in the rootfs
	# Replace this with the part I commented out below if you
	# want to create them instead (only recommended on non-persistent file systems!)
	if [ ! -e "/${i}" ]; then
		continue
	fi
	#if [ ! -e "/${i}" ]; then
	#	mkdir "/${i}" 2> /dev/null || { echo "Creation failed, skipping"; continue; }
	#fi
	
	mount -o bind "/var/_fboxmod-overlay/${i}" "/${i}"
done</code></pre>
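<p>Since a wrong entry in the list can end up binding something onto <code>/</code> itself, a cheap sanity check before the loop does not hurt. This is a sketch under my own assumptions about what counts as suspicious (empty entries, absolute paths, trailing slashes, <code>.</code> and <code>..</code> components); the function name <code>check_overlay_dirs</code> is hypothetical.

```shell
# Hypothetical sanity check: refuse entries that could resolve to / or escape it
check_overlay_dirs() {
    local d
    for d in "$@"; do
        case "${d}" in
            ""|.|..|/*|*/|./*|../*|*/./*|*/../*|*/.|*/..)
                echo "Refusing suspicious overlay entry: '${d}'" >&2
                return 1
                ;;
        esac
    done
    return 0
}

dirs_to_overlay="bin etc lib sbin usr/bin usr/sbin usr/lib usr/share var/tmp/root"

# Word splitting is intended here: the list is space-separated
if check_overlay_dirs ${dirs_to_overlay}; then
    echo "overlay list looks sane"
fi
```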

<h3>Step 6: Enjoy!</h3>
<p>That's it, we're done. You have modified your embedded system in a completely non-destructive way. Once you reboot it, everything is gone and the system is the same as before (except maybe some mount points you created on non-temporary file systems). And the best thing: this method also works for read-only root file systems.

<p>I used this method to install an SSH daemon and some custom scripts on my FRITZ!Box, but since it is quite a lot of work to do and I don't want to spend so much time on it each time I need to reboot the router (happens not that often, but it happens), I wrote a script for it. You can download it from <a href="/assets/shell-scripts/fbox-hacking/init.sh">here</a>. The script is quite complex but also quite flexible. What you basically need to change are the paths defined in the beginning:

<pre class="sourcecode"><code class="language-bash">OVERLAY_DIR="/var/_fboxmod-overlay"
ALTERNATE_ROOT_DIR="/var/_altrootfs"
MOD_DIR="/var/media/ftp/fboxmod/rootfs"
DIRS_TO_OVERLAY="bin etc lib sbin usr/bin usr/sbin usr/lib usr/share var/tmp/root"</code></pre>

<p>You can also change all the other config variables following them, but you don't need to. You can set them dynamically using command line arguments. To view a list of these either look into the source code or run

<pre class="sourcecode"><code class="language-bash">./init.sh --help</code></pre>

<p>The script is also able to revert the changes it made at runtime without the need to reboot. Simply run

<pre class="sourcecode"><code class="language-bash">./init.sh --revert</code></pre>

<p>and your system is clean again.
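<p>If you applied the steps by hand, reverting comes down to unmounting everything again. The sketch below only prints the commands so you can review them first. It is an assumption about a reasonable order (bind mounts in reverse, then the overlay tmpfs, then the alternate rootfs) and not necessarily what <code>init.sh</code> does internally; the function name <code>revert_commands</code> is made up.

```shell
# Hypothetical dry run: print the umount commands in reverse mount order.
# Usage: revert_commands <overlay dir> <alternate rootfs> <overlaid dirs...>
revert_commands() {
    local overlay_dir="$1"
    local alt_root="$2"
    shift 2

    # Build the reversed list of overlaid directories
    local reversed="" d
    for d in "$@"; do
        reversed="${d} ${reversed}"
    done

    for d in ${reversed}; do
        echo "umount /${d}"
    done
    echo "umount ${overlay_dir}"
    echo "umount ${alt_root}"
}

revert_commands /var/_fboxmod-overlay /var/_altrootfs bin etc usr/bin
```

<p>Pipe the output to <code>sh</code> once it looks right. A mount that is still in use may refuse to unmount, in which case rebooting remains the simplest way back.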

<p>I hope you found this long blog post useful and, as said above: enjoy! <img src="/templates/reflinux-2012/img/emoticons/smile.png" alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" />
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    Embedded systems, Shell tricks, Userland, </dc:subject>
    <dc:date>2012-05-30T18:16:25Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=62</wfw:comment>
        <slash:comments>1</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=62</wfw:commentRss>
    
    <dc:subject>embedded system</dc:subject>
<dc:subject>file system</dc:subject>
<dc:subject>mips</dc:subject>
<dc:subject>shell</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/61/guid/">
    <title>A new appearance</title>
    <link>http://www.refining-linux.org/archives/61/A-new-appearance/</link>
    <description>
    <![CDATA[
      <p>Yay, Refining Linux got a face-lift!

<p>This blog has now been up for a good year and a half and not much has changed since it started. Now it's time to give it a redesign (if you ask me, this was long overdue). While the main appearance stays the same, the details have changed significantly. Let me walk you through the new goodies.

<h2>Responsive design</h2>
<p>The most awesome feature first: the whole design is now completely responsive and can be viewed at any size. The old desktop-only design has been replaced by a brand-new flexible design for any kind of device, be it a small smartphone, a bigger tablet or a giant 30" display. Refining Linux has it all.

<p>If you like, you can test this. Just resize your browser window and see how well the new design adapts to the new width. The minimum width at which I consider it to look good is about 240 pixels, but there is no upper limit. It just might start looking a bit ridiculous if you project a tiny page onto your 5k display wall. <img src="/templates/reflinux-2012/img/emoticons/wink.png" alt=";-)" style="display: inline; vertical-align: bottom;" class="emoticon" />

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-00-responsive-design-fs8.png"><img src="../../../uploads/content-img/redesign-00-responsive-design-fs8.thumb.png" width="580" height="373" alt="The new layout looks great on both large and very small screens such as mobile browsers."></a>
</figure>

<h2>Nicer typography</h2>
<p>Although Georgia and Verdana are not the worst choices I made in my life, they don't look very pretty. They're old veterans of the web-safe fonts battalion and have gotten a bit long in the tooth. Nothing against these fonts in general, but it was time for something different. I mean, we live in the age of web fonts, don't we?

<p>The new choices are <a href="http://new.myfonts.com/fonts/open-window/clarendon-paint/"><i>Clarendon Paint</i></a> and <a href="http://www.fontsquirrel.com/fonts/Aller"><i>Aller</i></a>, two fresh fonts for the new look and feel of Refining Linux. I like both and think they support the overall painted appearance.

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-01-typography-fs8.png"><img src="../../../uploads/content-img/redesign-01-typography-fs8.thumb.png" width="580" height="373" alt="The Refining Linux typography has changed to a heading font with a painted look and a cleaner body text font."></a>
</figure>

<h2>Improved Header</h2>
<p>Not only has the body been refurbished, the header has been pimped as well. It was actually the last thing I took care of, but that doesn't mean it's not important. It is. Our little painting Tux has become a bit more lifelike, the background painting a bit more realistic. I also took care of the main menu. It got some more detail, fewer straight edges and some subtle CSS gradients for the hover effect.

<p>In general I haven't modified much in the header, but again, the details have changed. It's perfectly possible that some things in the header might still be tweaked from time to time, but for now I'll leave it at that.

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-02-revamped-header-fs8.png"><img src="../../../uploads/content-img/redesign-02-revamped-header-fs8.thumb.png" width="580" height="373" alt="The Tux header image has got some more fine detail as well as the navigation links and the header background."></a>
</figure>

<h2>Redesigned comments section</h2>
<p>The comments section has especially been taken care of. It looks very different now and much better in my opinion. The speech bubbles make a lot more sense this way and the overall look is much cleaner and more pleasing to the eye. I hope you like it as well.

<p>The amount of indentation has grown a bit with the redesign, so I have limited the maximum number of indents to four (plus the root level, i.e. five levels in total). Comments already submitted at a deeper nesting level retain their position in the database but their display is linearized at the fifth level.

<p>Also the comments form (and the contact form as well) has been redesigned slightly. It also looks a lot cleaner now and I put in some nice HTML5 form validation features there (more on that in a second).

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-03-improved-comments-fs8.png"><img src="../../../uploads/content-img/redesign-03-improved-comments-fs8.thumb.png" width="580" height="373" alt="The comments section has been redesigned completely, avatars are now next to the comments with speech bubble beaks pointing to them. Also little shadows have been added to give the comments more depth."></a>
</figure>

<h2>Better syntax highlighting</h2>
<p>The syntax highlighting has been much improved. Instead of <a href="http://shjs.sourceforge.net/"><i>SH_JS</i></a> I'm now using <a href="http://softwaremaniacs.org/soft/highlight/en/">Highlight.js</a> which gives me way more accurate highlighting and particularly more control. For the last Advent series I created a special ZSH highlighting scheme. I couldn't really do that in <i>SH_JS</i> without having to learn <a href="http://www.gnu.org/s/src-highlite/"><i>GNU Source Highlight</i></a> (I'm too lazy for that).

<p>The new color scheme for the syntax highlighting is based on Ethan Schoonover's famous <a href="http://ethanschoonover.com/solarized">Solarized theme</a>. Personally I don't like Solarized too much for my editor (the colors are too muted for my taste), but for the website I believe the bright theme is great.

<p>Yes, there are still some glitches here and there in the highlighting, but I guess I will be able to fix them over time. The highlighting as it is now is already tremendously better than it ever was before.

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-04-better-highlighting-fs8.png"><img src="../../../uploads/content-img/redesign-04-better-highlighting-fs8.thumb.png" width="580" height="373" alt="The syntax highlighting has got a more subtle and muted background color, a nicer color scheme as such and much more precise highlighting rules."></a>
</figure>

<h2>HTML5 rewrite</h2>
<p>The whole page has been rewritten in HTML5. This wasn't necessary, but I did it in the process of making the whole thing responsive. It now uses semantic HTML5 markup and of course also goodies like HTML5 input type validation for the comments form, a new Canvas tag cloud (replacing the old and ugly Flash tag cloud) and much more.

<p>The tag cloud, by the way, should be completely accessible as it is based on a normal list of links. Users with very small screens, touch based devices or disabled JavaScript will get that version instead. Also screen readers should be able to access the fallback version.

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-05-html5-rewrite-fs8.png"><img src="../../../uploads/content-img/redesign-05-html5-rewrite-fs8.thumb.png" width="580" height="373" alt="The HTML code now consists of semantic HTML5 elements instead of a large div soup."></a>
</figure>

<h2>Better legacy IE support</h2>
<p>Although I don't care much, a nice side-effect of the responsive design and its “mobile-first” approach is that the support for IE6, 7 and 8 is much better now. IE9 renders everything fine and just lacks support for some minor things such as gradients without SVG. IE8 and older, however, do horrible things to the layout, especially with the new HTML5 elements which are unknown to them and which I can only get to work with JavaScript. But with the mobile fallback and a few minor fixes I got a more or less decent but at least working version for old IEs. With JavaScript enabled it looks a little less horrible, with JavaScript turned off a little more. But overall it's all working and the two IE users I have will get a very rough but usable basic version of the website (actually, it's a bit more: about 7-8% of the visitors here use IE, of which about a quarter use IE 9 or above. In December the IE rate was even as low as 5.7%).

<figure class="thumbnail">
<a href="../../../uploads/content-img/redesign-06-better-ie-support-fs8.png"><img src="../../../uploads/content-img/redesign-06-better-ie-support-fs8.thumb.png" width="580" height="373" alt="Users of old IEs still get a big warning that their browser is outdated, but the overall visual experience has improved a little in those browsers."></a>
</figure>

<h2>Enjoy!</h2>
<p>I hope you enjoy the redesign as much as I do. If you like, you can help me a little by doing some browser testing. Although I have tested this on Linux and Windows in many browsers, there might still be some glitches here and there. If you find some, please let me know.

<p>Have fun! <img src="/templates/reflinux-2012/img/emoticons/smile.png" alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" />
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    </dc:subject>
    <dc:date>2012-01-20T01:08:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=61</wfw:comment>
        <slash:comments>2</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=61</wfw:commentRss>
    
    
</item>
<item rdf:about="http://www.refining-linux.org/archives/60/guid/">
    <title>SOPA blackout</title>
    <link>http://www.refining-linux.org/archives/60/SOPA-blackout/</link>
    <description>
    <![CDATA[
      <p>Tomorrow this blog will be blacked out for 12 hours starting at 1400 CET (1300 UTC or 8 AM EST).
<p>With this initiative Refining Linux is following the <a href="http://americancensorship.org/">protests</a> against the <b>Stop Online Piracy Act</b> (SOPA) and the <b>PROTECT IP Act</b> (PIPA) proposed by US legislators and the media industry. Many huge Internet companies and organizations such as <a href="http://blog.reddit.com/2012/01/stopped-they-must-be-on-this-all.html">Reddit</a> and <a href="http://wikimediafoundation.org/wiki/English_Wikipedia_anti-SOPA_blackout">Wikipedia</a>  participate in these protests. Also companies such as Google, Amazon, Facebook and of course non-profit organizations such as Mozilla and many smaller groups support the protests against SOPA and PIPA.

<h2>Why is this so important?</h2>
<p>These two bills have the goal to give law enforcement agencies more power to fight against “rogue websites” and copyright infringements in a way that highly endangers free speech and open communication infrastructure. As supporters of open source initiatives and democratic processes we have to intervene and prevent these bills from becoming applicable law.

<p>Fortunately, at least the DNS filtering parts of both bills <a href="http://www.eweek.com/c/a/Security/White-House-Opposes-DNS-Blocking-in-SOPA-314876/">have been suspended for now</a> due to massive protest from all over the country and around the world, but many equally dangerous parts persist and the idea behind the whole proposed law isn't dead at all. In fact, it is very much alive and similar attempts are being made behind closed doors in other countries as well. One example is the <a href="http://www.stopp-acta.info/english">Anti-Counterfeiting Trade Agreement</a> (ACTA) which is currently being negotiated by many countries in the European Union.

<p>Therefore this is not just a matter for US citizens, it is a matter for people around the world.

<h2>Why are so many big companies against SOPA and PIPA?</h2>
<p>The simple answer is: because the Internet is their business. Unlike the entertainment industry, which mainly supports SOPA, these companies depend on the Internet. Should these bills become law, they could be held responsible for any third party content appearing on their websites. That means in an extreme case a single search result pointing to an illegal website could be reason enough to sanction Google. The same applies to any other website with user contributed content such as Wikipedia, Reddit, Facebook, any online community, even your private web forum. All posts made to these websites would have to be checked and approved manually before they appear publicly. That is impossible to do and a massive restraint on free speech and open communication.

<p>The same applies to ACTA and any similar proposed agreement that abuses legal force to destroy people's freedom in the name of fighting online offenses.

<h2>You should participate as well</h2>
<p>Refining Linux protests for a free Internet and so should you. As Sue Gardner wrote in Wikimedia's announcement of the Wikipedia blackout (which I linked above):

<blockquote>“The reality is that we don’t think SOPA is going away, and PIPA is still quite active. Moreover, SOPA and PIPA are just indicators of a much broader problem. All around the world, we're seeing the development of legislation intended to fight online piracy, and regulate the Internet in other ways, that hurt online freedoms. Our concern extends beyond SOPA and PIPA: they are just part of the problem. We want the Internet to remain free and open, everywhere, for everyone.”</blockquote>

<p>This is absolutely true. Even if you're not a US citizen (like me) this affects you. Bills like SOPA, PIPA or ACTA must not become law. They destroy open communication around the world and change the Internet into something we really don't want it to be.

<p>So please sign the petitions against <a href="http://americancensorship.org/">SOPA</a> or <a href="https://www.accessnow.org/page/s/just-say-no-to-acta">ACTA</a> or call your representatives.

<p>Thank you<br>
Janek


<h2>Update 01/19/2012 0100 UTC:</h2>
<p>Refining Linux is back. Thanks to all who showed their support.
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    </dc:subject>
    <dc:date>2012-01-17T13:12:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=60</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=60</wfw:commentRss>
    
    
</item>
<item rdf:about="http://www.refining-linux.org/archives/59/guid/">
    <title>ZSH Gem #24: ZSH frameworks</title>
    <link>http://www.refining-linux.org/archives/59/ZSH-Gem-24-ZSH-frameworks/</link>
    <description>
    <![CDATA[
      <aside class="advent">This article is part of the 2011 Advent calendar series “24 Outstanding ZSH Gems”. Each day between December 1st and December 24th an article will be published as part of this series showing one awesome feature of the Z Shell. Some of the features can of course also be found in other shells such as Bash, but the ZSH implementation is often superior.</aside>

<p>I have shown you many things about ZSH throughout this series, but there is much more you can do with it than I could cover here. And of course there is also much more to configure, many more options I couldn't tell you about, many more tips and tricks, tweaks and optimizations.</p>

<p>Generally, it's a long way to go before you have your shell set up as you like. ZSH in particular needs a lot of configuration before it becomes very user-friendly. You can do all this configuration by hand or you can use a framework for that. Yes, there are frameworks for ZSH (and for Bash as well, in case you didn't know) and to conclude this Advent series I'll show you two of them.</p>

<h2>oh-my-zsh</h2>
<p>Probably the most advanced and mighty ZSH framework is <a href="https://github.com/robbyrussell/oh-my-zsh"><em>oh-my-zsh</em></a>. oh-my-zsh provides lots of extra features, special plugins with additional functions and completion definitions for certain command line tools and many, many themes for modifying the appearance of your shell, particularly the prompt.</p>

<p>To use oh-my-zsh, download it to a directory of your choice (e.g. <code>~/.oh-my-zsh</code>) and load it from within your <code>.zshrc</code>:</p>

<pre class="sourcecode"><code class="language-zsh"># Path to your oh-my-zsh installation (the framework won't work without this setting)
ZSH=$HOME/.oh-my-zsh

# Load oh-my-zsh
source $ZSH/oh-my-zsh.sh
</code></pre>

<p>That's basically it. You've successfully loaded the framework. So far it doesn't do much except changing the default prompt to something a bit more meaningful and configuring some basic stuff such as enabling <code>ls</code> colors. But you can customize the framework. For example, to disable <code>ls</code> colors again, write the following in your <code>.zshrc</code> before the line where the framework is loaded:</p>

<pre class="sourcecode"><code class="language-zsh">DISABLE_LS_COLORS="true"</code></pre>

<p>To set another theme modify the parameter <code>$ZSH_THEME</code>:</p>

<pre class="sourcecode"><code class="language-zsh">ZSH_THEME="candy"</code></pre>

<p>Now when you re-source your current shell instance you have the <code>candy</code> theme activated. For a full list of available themes have a look at the <code>themes</code> folder inside your oh-my-zsh folder. An even better overview can be found in the <a href="https://github.com/robbyrussell/oh-my-zsh/wiki/Themes">oh-my-zsh wiki</a> on GitHub.</p>
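<p>If you just want the theme names without browsing the folder, a few lines of shell will do. This is a minimal sketch written in plain sh syntax so it also runs in Bash; the function name <code>list_omz_themes</code> is made up, and it relies on the theme files carrying the <code>.zsh-theme</code> extension.</p>

```shell
# Hypothetical helper: list theme names found in an oh-my-zsh themes folder.
# The directory defaults to $ZSH/themes, following the setup above.
list_omz_themes() {
    local themes_dir="${1:-${ZSH}/themes}"
    local f name

    for f in "${themes_dir}"/*.zsh-theme; do
        # In sh and Bash an unmatched pattern stays literal, so skip it
        [ -e "${f}" ] || continue
        name="${f##*/}"
        # Strip the extension to get the value for ZSH_THEME
        echo "${name%.zsh-theme}"
    done
}

# Usage:
# list_omz_themes                      # uses $ZSH/themes
# list_omz_themes ~/.oh-my-zsh/themes  # explicit folder
```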

<p>Next let's come to the plugins. By default, no plugins are loaded, but that can be changed. Similar to ZSH's <code>$fpath</code> array, oh-my-zsh has a special <code>$plugins</code> array which contains all the names of all the plugins to load when initializing the framework. For instance:</p>

<pre class="sourcecode"><code class="language-zsh">plugins=(git github perl svn)</code></pre>

<p>This loads the plugins <code>git</code>, <code>github</code>, <code>perl</code> and <code>svn</code> which provide extra functions and completion features for the corresponding applications. For instance, the <code>git</code> plugin adds many advanced completion features for working with Git repositories and the shell function <code>current_branch</code> as a shorthand for displaying the branch you're currently working on without any additional meta data or formatting stuff.</p>

<p>oh-my-zsh is pretty powerful and you can have a lot of fun with it. Just be aware that even though ZSH provides an <a href="/archives/46/ZSH-Gem-12-Autoloading-functions/">autoloading mechanism</a> for functions, loading too many plugins can slow down your shell. It can be very annoying when you always have to wait 10 seconds after opening a new shell instance.</p>
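<p>To see whether a plugin is worth its startup cost, you can measure how long an interactive shell takes to start and exit again, once before and once after enabling it. The sketch below is deliberately crude: the function name <code>time_shell_startup</code> is hypothetical and <code>date +%s</code> only has a resolution of one second, so it averages over several runs.</p>

```shell
# Hypothetical helper: start an interactive shell several times and report the total time.
# Usage: time_shell_startup [shell] [runs]
time_shell_startup() {
    local shell="${1:-zsh}"
    local runs="${2:-5}"
    local i=0 start end

    start=$(date +%s)
    while [ "${i}" -lt "${runs}" ]; do
        # -i loads the interactive startup files, -c exit quits right away
        "${shell}" -i -c exit </dev/null >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)

    echo "${runs} runs of ${shell} took $((end - start)) s"
}

# Usage:
# time_shell_startup zsh 10
```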

<p>If you're not quite sure how to start with oh-my-zsh, have a look at the example <code>.zshrc</code> in the <code>templates</code> folder.</p>

<h2>zshuery</h2>
<p><a href="https://github.com/myfreeweb/zshuery"><em>zshuery</em></a> is another framework for ZSH which is much smaller. It's more something like a micro-framework. The name is derived from <em>jQuery</em> and the idea behind it is to provide a simple and flexible yet fast framework. zshuery doesn't have lots of plugins. It's just one single file which provides some extra functionality and does the basic ZSH configuration for you. That's it. Compared to oh-my-zsh, zshuery is a stub but that's why I like this framework and prefer it over the more powerful oh-my-zsh. It's just what it wants to be: a simple framework which makes working in the shell easier without blowing you away with features you probably never need. With some extra configuration in my <code>.zshrc</code> this framework is exactly what I need.</p>

<p>But first things first. Loading zshuery is pretty much straightforward. Just download the framework to an arbitrary folder and reference it in your <code>.zshrc</code>:</p>

<pre class="sourcecode"><code class="language-zsh"># Load zshuery
source ${HOME}/.zshuery/zshuery.sh</code></pre>

<p>That loads the framework. Now to let zshuery set some default options, aliases and autocorrection for you, call the following functions:</p>

<pre class="sourcecode"><code class="language-zsh">load_defaults
load_aliases
load_correction</code></pre>
<p>I told you that zshuery doesn't come with plugins and that is true. However, it provides an easy way to load additional completion functions from the <a href="https://github.com/zsh-users/zsh-completions"><code>zsh-completions</code></a> Git repository. These are autocompletion functions which might not yet be in the official ZSH release. To include them, clone the repository (preferably to a directory somewhere inside your zshuery folder) and call the following zshuery function:</p>

<pre class="sourcecode"><code class="language-zsh">load_completion ${HOME}/.zshuery/completion</code></pre>

<p>(provided the path to your copy of the <code>zsh-completions</code> repository is <code>~/.zshuery/completion</code>)</p>

<h2>And finally my Christmas gift for you…</h2>
<p>zshuery is a great framework for pimping your Z Shell without overloading it. And because it's my favorite ZSH framework and because Christmas is just around the corner, I give you a commented version of my <code>.zshrc</code> in which I use this framework with some slight modifications:</p>
<pre class="sourcecode"><code class="language-zsh"># Load zshuery
source ${HOME}/.zshuery/zshuery.sh
load_defaults
load_aliases
load_correction

# Colorize ls output
alias ls="ls --color=auto"

# Redefine ls colors and load completion definitions
export LS_COLORS="no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:"
load_completion ${HOME}/.zshuery/completion

# Modify key bindings to make certain keys such as DEL, HOME, END, PAGE UP and PAGE DOWN work
eval "$(sed -n 's/^/bindkey /; s/: / /p' /etc/inputrc)" > /dev/null
bindkey "\e[5~" beginning-of-history    # Page Up
bindkey "\e[6~" end-of-history          # Page Down

# Colorize STDERR (enable if needed, might cause issues with password prompts or escape sequences)
#exec 2>>(while read line; do; print '\e[91m'${(q)line}'\e[0m' > /dev/tty; print -n $'\0'; done)

# Enable menu select
zstyle ':completion:*' menu select

# Enable tree view for kill completion
zstyle ':completion:*:*:kill:*:processes' command 'ps --forest -e -o pid,user,tty,cmd'

# Modify zshuery correction prompt
SPROMPT="Correct $fg[red]%R$reset_color to $fg[green]%r?$reset_color (Yes, No, Abort, Edit) "

# Load fancy prompt (works on Gentoo Linux only)
autoload -U promptinit
promptinit
prompt gentoo

# Non-Gentoo users might configure their prompt manually.
# The default Gentoo prompt from above would look like this:
#PROMPT="%B%{$fg[green]%}%n@%m%k%{$reset_color%} %B%{$fg[blue]%}%1~ %# %b%f%k%{$reset_color%}"

# Set options
setopt complete_in_word
setopt path_dirs

# Load modules
zmodload zsh/regex
zmodload zsh/pcre

# Update terminal CWD once and then on every CWD change
update_terminal_cwd
function chpwd() {
        update_terminal_cwd
}

# Extend $PATH
PATH="${HOME}/.local/bin:${HOME}/.local/opt/bin:${PATH}"</code></pre>

<p>Use it for whatever purpose you want. Modify it, extend it, republish it, whatever comes to your mind. You can also use it together with oh-my-zsh or with no framework at all if you like that better. Just alter the lines where the framework is loaded or used.</p>

<aside class="advent closing">
    <p>That concludes this year's Advent series. I hope you enjoyed the series as much as I did and that you learned a bit from it. Many thanks to all of you. The response was overwhelming, much more than I expected. Also many thanks to the people from <a href="http://linux.slashdot.org/story/11/12/01/2312241/linux-advent-calendar-24-outstanding-zsh-gems">Slashdot</a> who decided that this Advent series was worth a run. I have never seen so many people here. <img src="/templates/reflinux-2012/img/emoticons/smile.png" alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" /></p>
    <p>I wish you all a Merry Christmas and a Happy New Year and I hope to see you guys here again in 2012.</p>
    <p>Thanks for everything!<br>
    Janek</p>
</aside>
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    Advent calendar 2011, </dc:subject>
    <dc:date>2011-12-23T23:00:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=59</wfw:comment>
        <slash:comments>7</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=59</wfw:commentRss>
    
    <dc:subject>advent calendar</dc:subject>
<dc:subject>framework</dc:subject>
<dc:subject>shell</dc:subject>
<dc:subject>shell trick</dc:subject>
<dc:subject>zsh</dc:subject>
<dc:subject>zsh gem</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/58/guid/">
    <title>ZSH Gem #23: Working with extended regular expressions</title>
    <link>http://www.refining-linux.org/archives/58/ZSH-Gem-23-Working-with-extended-regular-expressions/</link>
    <description>
    <![CDATA[
      <aside class="advent">This article is part of the 2011 Advent calendar series “24 Outstanding ZSH Gems”. Each day between December 1st and December 24th an article will be published as part of this series showing one awesome feature of the Z Shell. Some of the features can of course also be found in other shells such as Bash, but the ZSH implementation is often superior.</aside>

<p>There are two ZSH modules which allow you to easily work with POSIX extended regular expressions (POSIX ERE) or with Perl compatible regular expressions (PCRE) which are even more advanced than POSIX ERE. These two modules are <code>zsh/regex</code> and <code>zsh/pcre</code>. You can use either one of them or both at the same time. That's entirely up to you. I'll show you both.</p>

<p>First let me illustrate <code>zsh/regex</code>, which is the simpler of the two. Once loaded, <code>zsh/regex</code> provides the new conditional expression <code>-regex-match</code>, which can be used in combination with the <code>[[</code> command (e.g. in <code>if</code> conditions or loops):</p>

<pre class="sourcecode"><code class="language-zsh"># Load module
zmodload zsh/regex

# Execute POSIX ERE
[[ "foobar_123" -regex-match "^([a-zA-Z0-9]+)_([0-9]+)$" ]] &amp;&amp; echo match || echo no match</code></pre>

<p>The condition returns true if the expression matches, otherwise false. If there is a match, the whole matching part is stored in the <code>$MATCH</code> parameter, and if the expression contains any parenthesized subexpressions, the parts they matched will be available in the array <code>$match</code> (in our example you'd have an array with two elements containing <code>foobar</code> and <code>123</code>).</p>
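
<p>To see both parameters in action, here is a minimal sketch that reuses the expression from above and prints what ended up in <code>$MATCH</code> and <code>$match</code>. It assumes the <code>BASH_REMATCH</code> option is not set (the default); the output labels are only illustrative:</p>

<pre class="sourcecode"><code class="language-zsh"># Load module
zmodload zsh/regex

# Run the POSIX ERE and inspect the match parameters
if [[ "foobar_123" -regex-match '^([a-zA-Z0-9]+)_([0-9]+)$' ]]; then
    print -r -- "whole match: $MATCH"       # foobar_123
    print -r -- "first group: ${match[1]}"  # foobar
    print -r -- "second group: ${match[2]}" # 123
fi</code></pre>

<p>Keep in mind that ZSH arrays are indexed starting at 1, so the first parenthesized part is <code>${match[1]}</code>.</p>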

<p>Now that I've shown you <code>zsh/regex</code> let's come to the more complex <code>zsh/pcre</code>. <code>zsh/pcre</code> also provides a new conditional expression called <code>-pcre-match</code> which works about the same as <code>-regex-match</code> except that it accepts Perl compatible regular expressions. So we could rewrite our example from above as follows:</p>

<pre class="sourcecode"><code class="language-zsh"># Load module
zmodload zsh/pcre

# Execute PCRE
[[ "foobar_123" -pcre-match "^(\w+)_(\d+)$" ]] &amp;&amp; echo match || echo no match
</code></pre>

<p>That's a little less to write. But <code>zsh/pcre</code> also provides a few new commands besides the conditional expression <code>-pcre-match</code>. The most important ones to know are <code>pcre_compile</code> and <code>pcre_match</code>. With the first one you compile a regular expression from a string and with the second one you use this compiled regular expression on other strings. That means you always need to use both in combination.</p>

<p>Both commands provide several flag parameters. The most important ones for <code>pcre_compile</code> are <code>-m</code>, which enables multi-line matching, <code>-s</code>, which makes the dot pattern (<code>.</code>) match newline characters as well, and <code>-i</code>, which makes the pattern case-insensitive.</p>
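
<p>As a small sketch of what <code>-s</code> changes: in PCRE the dot does not match a newline by default, so the same pattern behaves differently on a two-line string depending on whether the flag is given. The variable names <code>text</code>, <code>plain</code> and <code>dotall</code> are made up for this illustration:</p>

<pre class="sourcecode"><code class="language-zsh">zmodload zsh/pcre

# Two lines separated by a newline
text=$'foo\nbar'

# Without -s the dot refuses to match the newline
pcre_compile 'foo.bar'
if pcre_match "$text"; then plain="match"; else plain="no match"; fi

# With -s the dot matches the newline as well
pcre_compile -s 'foo.bar'
if pcre_match "$text"; then dotall="match"; else dotall="no match"; fi

print -r -- "without -s: $plain"   # without -s: no match
print -r -- "with -s: $dotall"     # with -s: match</code></pre>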

<p>The most important flags for <code>pcre_match</code> are <code>-v</code> and <code>-a</code> which let you set different names for the match variable containing the whole matching part and the match array containing all the substrings from enclosing parentheses (which are again <code>$MATCH</code> and <code>$match</code> by default).</p>

<p>Our example from above with <code>pcre_compile</code> and <code>pcre_match</code> would look like this:</p>

<pre class="sourcecode"><code class="language-zsh">pcre_compile "^(\w+)_(\d+)$"
pcre_match "foobar_123" &amp;&amp; echo match || echo no match</code></pre>
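
<p>The <code>-v</code> and <code>-a</code> flags of <code>pcre_match</code> can be tried with the same expression. This is only a sketch; the parameter names <code>whole</code> and <code>parts</code> are arbitrary choices for this illustration:</p>

<pre class="sourcecode"><code class="language-zsh">zmodload zsh/pcre

pcre_compile '^(\w+)_(\d+)$'

# Store the whole match in $whole and the parenthesized parts in $parts
if pcre_match -v whole -a parts "foobar_123"; then
    print -r -- "whole match: $whole"                # foobar_123
    print -r -- "parts: ${parts[1]} and ${parts[2]}" # foobar and 123
fi</code></pre>

<p>Using your own names is handy when you run several matches in a row and don't want a later match to overwrite the results of an earlier one.</p>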

<p>Sometimes it may be more to write, but it also gives you some more flexibility due to the arguments both commands can take. For example, the following simple case-insensitive regular expression</p>

<pre class="sourcecode"><code class="language-zsh">pcre_compile -i "^foobar\s+\d+$"
pcre_match "fOoBaR   123" &amp;&amp; echo match || echo no match</code></pre>

<p>would need such a monster expression if performed with <code>-pcre-match</code> alone, unless you fall back on PCRE's inline modifier <code>(?i)</code>:</p>

<pre class="sourcecode"><code class="language-zsh">[[ "fOoBaR   123" -pcre-match "^[fF][oO]{2}[bB][aA][rR]\s+\d+$" ]] &amp;&amp; echo match || echo no match</code></pre>

<p>In this case the second variant is not just more to type (yes, that's true, count the characters), the first one is also much easier to read and less error-prone so I'd prefer that one.</p>

<p>Whichever variant you take and whether you prefer POSIX regular expressions or PCRE always depends on the situation. But all of them give you the full power of regular expressions. So use them!</p>

<p>Read more about zsh/regex and zsh/pcre:</p>
<ul>
    <li><a href="http://zsh.sourceforge.net/Doc/Release/Zsh-Modules.html#The-zsh_002fregex-Module">zsh.sf.net: The zsh/regex Module</a></li>
    <li><a href="http://zsh.sourceforge.net/Doc/Release/Zsh-Modules.html#The-zsh_002fpcre-Module">zsh.sf.net: The zsh/pcre Module</a></li>
    <li><a href="http://zsh.sourceforge.net/Doc/Release/Conditional-Expressions.html">zsh.sf.net: Conditional Expressions</a></li>
    <li><a href="http://en.wikipedia.org/wiki/Regular_expression">Wikipedia.org: Regular expression</a></li>
</ul>
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    Advent calendar 2011, </dc:subject>
    <dc:date>2011-12-22T23:00:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=58</wfw:comment>
        <slash:comments>0</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=58</wfw:commentRss>
    
    <dc:subject>advent calendar</dc:subject>
<dc:subject>regular expression</dc:subject>
<dc:subject>shell</dc:subject>
<dc:subject>shell trick</dc:subject>
<dc:subject>zsh</dc:subject>
<dc:subject>zsh gem</dc:subject>

</item>
<item rdf:about="http://www.refining-linux.org/archives/57/guid/">
    <title>ZSH Gem #22: Accessing and editing files with mapfile</title>
    <link>http://www.refining-linux.org/archives/57/ZSH-Gem-22-Accessing-and-editing-files-with-mapfile/</link>
    <description>
    <![CDATA[
      <aside class="advent">This article is part of the 2011 Advent calendar series “24 Outstanding ZSH Gems”. Each day between December 1st and December 24th an article will be published as part of this series showing one awesome feature of the Z Shell. Some of the features can of course also be found in other shells such as Bash, but the ZSH implementation is often superior.</aside>

<p>Working on the shell is often working with files and sometimes you need to read or edit their contents. Normally you'd do that with the command line editor of your choice (e.g. nano, vi, vim or emacs), but sometimes you need to write the output of a command or a pipe to a file or feed programs with contents from the hard disk. That's usually done by using the input and output redirection operators, but ZSH gives you one more tool which can sometimes make things easier. This module is called <code>mapfile</code>.</p>

<p>Since <code>mapfile</code> is a ZSH module you need to enable it before you can use it:</p>

<pre class="sourcecode"><code class="language-zsh">zmodload zsh/mapfile</code></pre>

<p>Once the module is loaded, you get the magic associative array <code>$mapfile</code> which gives you direct access to any file when you specify its name as the name of the array key. For example, to <code>echo</code> the contents of the file <code>examplefile</code> run </p>

<pre class="sourcecode"><code class="language-zsh">echo $mapfile[examplefile]</code></pre>

<p>You can also write to files by assigning values to an entry of the array. The value will then be written to disk. If the file does not exist yet, it will be created. To write the string <em>Hello World</em> to our file, run </p>

<pre class="sourcecode"><code class="language-zsh">mapfile[examplefile]="Hello World"</code></pre>

<p>That's the basics and about everything <code>mapfile</code> can do. How do we make use of it? <code>mapfile</code> can sometimes be a nice thing when you need to work with contents of a file in a very simple way. By using <code>mapfile</code> you can avoid piping the contents through chains of commands or doing stuff like this:</p>

<pre class="sourcecode"><code class="language-zsh">filecontents=$(cat file)</code></pre>

<p><code>mapfile</code> also enables you to do some basic filtering and editing directly when accessing the file by using ZSH's parameter expansion operators. For example, if you need the contents of a file converted to all lowercase, the only thing you need to do is </p>

<pre class="sourcecode"><code class="language-zsh">echo ${(L)mapfile[file]}</code></pre>

<p>or if you need the contents of a file or a fallback value in case the file doesn't exist, <code>mapfile</code> should be the easiest way to go:</p>

<pre class="sourcecode"><code class="language-zsh">echo ${mapfile[file]:-Fallback value}</code></pre>

<p>or if you need to do some initial search and replace:</p>

<pre class="sourcecode"><code class="language-zsh">echo ${mapfile[file]//search/replace}</code></pre>

<p>Of course, you can also write the edited contents back to disk:</p>

<pre class="sourcecode"><code class="language-zsh">mapfile[file]=${mapfile[file]//search/replace}</code></pre>
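
<p>Put together, a complete round trip could look like the following sketch. It writes a string to a scratch file, edits it in place and reads the result back. The names <code>tmpfile</code> and <code>result</code> are made up for this illustration, and the external <code>mktemp</code> command is assumed to be available:</p>

<pre class="sourcecode"><code class="language-zsh"># Load the module
zmodload zsh/mapfile

# Hypothetical scratch file, created with the external mktemp command
tmpfile=$(mktemp)

# Write to the file, then edit it in place via parameter expansion
mapfile[$tmpfile]="Hello World"
mapfile[$tmpfile]=${mapfile[$tmpfile]//World/ZSH}

# Read the edited contents back and clean up
result=${mapfile[$tmpfile]}
rm -f -- $tmpfile
print -r -- $result</code></pre>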

<p>No big deal. Another thing you could do is to get the length of a text file in characters (which equals the number of bytes only for single-byte encodings):</p>

<pre class="sourcecode"><code class="language-zsh">echo ${#mapfile[file]}</code></pre>

<p>Whatever you want! You can also use <code>mapfile</code> in combination with <code>vared</code> to let the user edit files interactively:</p>

<pre class="sourcecode"><code class="language-zsh">vared mapfile[file]</code></pre>

<p>Of course, you could also open some more advanced editor such as vim, but at least as a fallback, <code>mapfile</code> in combination with <code>vared</code> becomes a valuable tool because no external program is required.</p>

<p>You see, under some circumstances, <code>mapfile</code> can really save your neck. But you should also be aware that it may consume a lot of memory, particularly with large files. So use it with caution.</p>

<p>Read more about mapfile:</p>
<ul>
    <li><a href="http://zsh.sourceforge.net/Doc/Release/Zsh-Modules.html#The-zsh_002fmapfile-Module">zsh.sf.net: The zsh/mapfile Module</a></li>
    <li><a href="http://grml.org/zsh/zsh-lovers.html#_zsh_mapfile_require_zmodload_zsh_mapfile">ZSH-LOVERS: zsh/mapfile</a></li>
    <li><a href="http://zsh.sourceforge.net/Doc/Release/Expansion.html#Parameter-Expansion">zsh.sf.net: Parameter Expansion</a></li>
</ul>
    ]]>
    </description>

    <dc:publisher>Refining Linux</dc:publisher>
    <dc:creator>nospam@example.com (Janek Bevendorff)</dc:creator>
    <dc:subject>
    Advent calendar 2011, </dc:subject>
    <dc:date>2011-12-21T23:00:00Z</dc:date>
    <wfw:comment>http://www.refining-linux.org/wfwcomment.php?cid=57</wfw:comment>
        <slash:comments>1</slash:comments>
        <wfw:commentRss>http://www.refining-linux.org/rss.php?version=1.0&amp;type=comments&amp;cid=57</wfw:commentRss>
    
    <dc:subject>advent calendar</dc:subject>
<dc:subject>shell</dc:subject>
<dc:subject>shell trick</dc:subject>
<dc:subject>zsh</dc:subject>
<dc:subject>zsh gem</dc:subject>

</item>

</rdf:RDF>
