TheRosiek.com Random tech notes and tutorials

29Jan/102

Duplicity Install and Backup Samples

Duplicity is a backup tool that works off of rsync and rdiff libraries to copy only changes to a backup location. It can use compression and encryption tools on the data and also has the ability to save to Amazon's S3 service. More details can be found here.


Installation on OpenBSD 4.4

The 4.4 version was the most difficult to get working since the majority of the issues came from the given OpenBSD libraries. Even installing the Duplicity port from the packages didn't function right.

First we need to add a few packages. You can use pkg_add with any mirror to install the following. Some of them pull in dependencies, so the list of installed files will include additional packages:

  • python-2.5.2p4
  • py-boto-1.3
  • gpgme-1.1.5
  • librsync-0.9.7
  • ncftp-3.2.1

When the main Python package is installed, it will ask you to create a few symbolic links, so create those.

ln -sf /usr/local/bin/python2.5 /usr/local/bin/python
ln -sf /usr/local/bin/pydoc2.5  /usr/local/bin/pydoc

Version 4.4 needs a separate Python XML package to work properly. If it's not installed, you'll get a series of errors when trying to send data to S3; I believe the XML error is when it tries to read the response. Something like this will error out:

Traceback (most recent call last):
  File "/usr/local/bin/duplicity", line 482, in <module>
    with_tempdir(main)
  File "/usr/local/bin/duplicity", line 477, in with_tempdir
    fn()
  File "/usr/local/bin/duplicity", line 468, in main
    full_backup(col_stats)
  File "/usr/local/bin/duplicity", line 174, in full_backup
    col_stats.set_values(sig_chain_warning = None).cleanup_signatures()
  File "/usr/obj/ports/duplicity-0.4.12/fake-amd64/usr/local/lib/python2.5/site-packages/duplicity/collections.py", line 476, in set_values
  File "/usr/obj/ports/duplicity-0.4.12/fake-amd64/usr/local/lib/python2.5/site-packages/duplicity/backends.py", line 802, in list
  File "/usr/local/lib/python2.5/site-packages/boto/s3/bucketlistresultset.py", line 31, in bucket_lister
    delimiter=delimiter)
  File "/usr/local/lib/python2.5/site-packages/boto/s3/bucket.py", line 205, in get_all_keys
    xml.sax.parseString(body, h)
  File "/usr/local/lib/python2.5/xml/sax/__init__.py", line 43, in parseString
    parser = make_parser()
  File "/usr/local/lib/python2.5/xml/sax/__init__.py", line 93, in make_parser
    raise SAXReaderNotAvailable("No parsers found", None)
xml.sax._exceptions.SAXReaderNotAvailable: No parsers found

To avoid that, a separate Python XML package needs to be downloaded and installed:

cd /usr/src
wget http://downloads.sourceforge.net/project/pyxml/pyxml/0.8.4/PyXML-0.8.4.tar.gz
tar zxvf PyXML-0.8.4.tar.gz
cd PyXML-0.8.4
python setup.py install
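
To check whether a parser is now available before running a backup, something like the following should work. It is only a sketch: it assumes that python on the PATH is the interpreter Duplicity runs under, and has_sax_parser is a made-up helper name.

```shell
# Returns 0 if the given Python interpreter can create a SAX parser.
# The interpreter name is an assumption; pass the one Duplicity uses.
has_sax_parser() {
    "${1:-python}" -c 'import xml.sax; xml.sax.make_parser()' >/dev/null 2>&1
}

if has_sax_parser python; then
    echo "SAX parser found"
else
    echo "No SAX parser found; install PyXML first"
fi
```

This runs the same make_parser() call that fails in the traceback above, so a clean result here suggests the S3 listing should get past that point.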

Now we can install Duplicity.

cd /usr/src
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py --librsync-dir=/usr/local build
python setup.py install --prefix=/usr/local

If you run the Duplicity jobs as root in a cron job, there is something about OpenBSD (I'm sure a security issue) that causes it to fail. I would get the output below in my log only when it ran as a cron job:

Traceback (most recent call last):
  File "/usr/local/bin/duplicity", line 583, in <module>
    with_tempdir(main)
  File "/usr/local/bin/duplicity", line 577, in with_tempdir
    fn()
  File "/usr/local/bin/duplicity", line 558, in main
    full_backup(col_stats)
  File "/usr/local/bin/duplicity", line 234, in full_backup
    bytes_written = write_multivol("full", tarblock_iter, globals.backend)
  File "/usr/local/bin/duplicity", line 148, in write_multivol
    globals.gpg_profile, globals.volsize)
  File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 240, in GPGWriteFile
    bytes_to_go = data_size - get_current_size()
  File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 232, in get_current_size
    return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory:'/tmp/duplicity-gM4CN9-tempdir/mktemp-iZknw0-2'

Odd that it can't read the temporary folder that it created. Changing the folder location also did not work. The solution is to create a separate user for backups only. This can be an issue if you have files that need to be backed up but cannot be read by all users, but in my case it worked for the specific files that needed to be saved.

useradd -m -d /home/dpbackup -c 'Duplicity' dpbackup
usermod -G nogroup dpbackup
mkdir /home/dpbackup/log

Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.
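
A small sketch of adding that line from the shell without duplicating it on repeated runs. The helper name deny_ssh_user and its optional second argument are assumptions made for illustration; sshd has to be reloaded afterwards for the change to take effect.

```shell
# Append "DenyUsers <user>" to an sshd config file unless a DenyUsers
# line already names that user. The config path defaults to the
# location mentioned above.
deny_ssh_user() {
    user=$1
    conf=${2:-/etc/ssh/sshd_config}
    if grep -Eq "^DenyUsers([[:space:]].*)?[[:space:]]${user}([[:space:]]|\$)" "$conf"; then
        return 0
    fi
    printf 'DenyUsers %s\n' "$user" >> "$conf"
}

# Example (run as root):
#   deny_ssh_user dpbackup
```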

Now su as this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else that may have access to our S3 account (Amazon employees) can read the data.

su dpbackup
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created

$ gpg --gen-key

There will be a series of questions, most of the defaults are fine.

  • Choose option 1 for DSA and Elgamal (the default)
  • Choose the default key size of 2048
  • Leave the default that the key will not expire, option 0
  • Enter a User ID, Email address, and comment for the key.
  • Type O for OK to accept.
  • Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.
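
The backup script needs the ID of this key (the GPG_PUB_KEY variable in the sample script). A minimal sketch of pulling the short key IDs out of GnuPG's machine-readable listing, assuming the --with-colons format in which the fifth field of a pub record is the long key ID. The function name short_key_ids is made up for illustration.

```shell
# Read "gpg --list-keys --with-colons" output on stdin and print the
# last 8 hex digits (the short ID) of every public key found.
short_key_ids() {
    awk -F: '$1 == "pub" { print substr($5, length($5) - 7) }'
}

# Example:
#   gpg --list-keys --with-colons | short_key_ids
```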

Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.

$ tar cf gpg_keys.tar .gnupg/
$ chmod 600 gpg_keys.tar

See sample scripts below for backup jobs.


Installation on OpenBSD 4.6

First we need to add a few packages. You can use pkg_add with any mirror to install the following. Some of them pull in dependencies, so the list of installed files will include additional packages:

  • python-2.5.4p1
  • py-xml-0.8.4p8
  • py-boto-1.7a
  • gpgme-1.1.5p0
  • librsync-0.9.7p0
  • ncftp-3.2.2

When the main Python package is installed, it will ask you to create a few symbolic links, so create those.

ln -sf /usr/local/bin/python2.5 /usr/local/bin/python
ln -sf /usr/local/bin/python2.5-config /usr/local/bin/python-config
ln -sf /usr/local/bin/pydoc2.5 /usr/local/bin/pydoc

Now we can install Duplicity.

cd /usr/src
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py --librsync-dir=/usr/local build
python setup.py install --prefix=/usr/local

If you run the Duplicity jobs as root in a cron job, there is something about OpenBSD (I'm sure a security issue) that causes it to fail. I would get the output below in my log only when it ran as a cron job:

Traceback (most recent call last):
  File "/usr/local/bin/duplicity", line 583, in <module>
    with_tempdir(main)
  File "/usr/local/bin/duplicity", line 577, in with_tempdir
    fn()
  File "/usr/local/bin/duplicity", line 558, in main
    full_backup(col_stats)
  File "/usr/local/bin/duplicity", line 234, in full_backup
    bytes_written = write_multivol("full", tarblock_iter, globals.backend)
  File "/usr/local/bin/duplicity", line 148, in write_multivol
    globals.gpg_profile, globals.volsize)
  File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 240, in GPGWriteFile
    bytes_to_go = data_size - get_current_size()
  File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 232, in get_current_size
    return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory:'/tmp/duplicity-gM4CN9-tempdir/mktemp-iZknw0-2'

Odd that it can't read the temporary folder that it created. Changing the folder location also did not work. The solution is to create a separate user for backups only. This can be an issue if you have files that need to be backed up but cannot be read by all users, but in my case it worked for the specific files that needed to be saved.

useradd -m -d /home/dpbackup -c 'Duplicity' dpbackup
usermod -G nogroup dpbackup
mkdir /home/dpbackup/log

Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.

Now su as this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else that may have access to our S3 account (Amazon employees) can read the data.

su dpbackup
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created

$ gpg --gen-key

There will be a series of questions, most of the defaults are fine.

  • Choose option 1 for DSA and Elgamal (the default)
  • Choose the default key size of 2048
  • Leave the default that the key will not expire, option 0
  • Enter a User ID, Email address, and comment for the key.
  • Type O for OK to accept.
  • Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.

Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.

$ tar cf gpg_keys.tar .gnupg/
$ chmod 600 gpg_keys.tar

See sample scripts below for backup jobs.


Installation on Debian Lenny 5.0

The Debian install is a little bit simpler and can run the backup job as root inside cron. Get some install packages first:

apt-get install build-essential librsync1 librsync-dev python python-gnupginterface ncftp python-pexpect python-dev python-boto

Install Duplicity:

cd /usr/src
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py build
python setup.py install

Creating a dedicated user is optional, but it is good security practice not to run the backups as root.

useradd -m -d /home/dpbackup -c 'Duplicity' dpbackup
mkdir /home/dpbackup/log

Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.

Now su as this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else that may have access to our S3 account (Amazon employees) can read the data.

su dpbackup
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created

$ gpg --gen-key

There will be a series of questions, most of the defaults are fine.

  • Choose option 1 for DSA and Elgamal (the default)
  • Choose the default key size of 2048
  • Leave the default that the key will not expire, option 0
  • Enter a User ID, Email address, and comment for the key.
  • Type O for OK to accept.
  • Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.

Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.

$ tar cf gpg_keys.tar .gnupg/
$ chmod 600 gpg_keys.tar

See sample scripts below for backup jobs.


Sample Backup Scripts

The first portion of the script defines the variables we'll need to use. The AWS keys are defined for you when you sign up for S3. PASSPHRASE is the GPG passphrase set on the key generated with gpg --gen-key. S3 bucket names must be unique across all S3 users, so I use the host name of the server. The others are pretty obvious but will be explained later.

#!/bin/sh

# Variables
export AWS_ACCESS_KEY_ID=ABABAB3333338888WWWW
export AWS_SECRET_ACCESS_KEY=BBBBBBBBBBTTTTTTTTTT8888888888VVVVVVVVVV
export PASSPHRASE=somelongpassphrase
DBHOST='dbserver1'
TIMESTAMP=`date +%m%d%Y%H%M`
FILE_PREFIX_DB='mydb_'
FILE_PREFIX_SVN_REPO='repo_'
GPG_PUB_KEY='AAEE66BB'
BACKUP_LOG_FILE='/home/dpbackup/log/s3_backup.log'
FULL_IF_OLDER_THAN='7D'
KEEP_MAX_SETS='2'
S3_BUCKET='serverhostname'
CURRENT_HOST='server-hostname'
TO_EMAIL='sysadmin@example.com'
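
Since an empty key or passphrase only shows up as a failure partway through the run, it can help to check the variables up front. A sketch, with require_vars being a hypothetical helper that is not part of the original script:

```shell
# Return non-zero (and name the culprit) if any listed variable is
# unset or empty.
require_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing required variable: $name" >&2
            return 1
        fi
    done
}

# Example, placed right after the variable definitions:
#   require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE S3_BUCKET || exit 1
```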

Just some sample backup methods for MySQL or Subversion if needed.

/usr/local/bin/mysqldump -h $DBHOST -u mysql_admin -pmypass mydb > /home/dpbackup/mysql/$FILE_PREFIX_DB$TIMESTAMP.sql
/usr/local/bin/svnadmin dump /home/svn/repo > /home/dpbackup/svn/$FILE_PREFIX_SVN_REPO$TIMESTAMP.svnbk
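
Both dump commands assume their target directories already exist. A sketch of creating them with permissions that keep the dumps private to the backup user. The helper name prepare_dump_dir and the 700 mode are my own assumptions, not something the original setup specifies.

```shell
# Create a dump directory if needed and restrict it to its owner.
prepare_dump_dir() {
    dir=$1
    mkdir -p "$dir" && chmod 700 "$dir"
}

# Example, before the dump commands:
#   prepare_dump_dir /home/dpbackup/mysql
#   prepare_dump_dir /home/dpbackup/svn
```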

This step is only necessary on OpenBSD, which limits the number of open files per process as a security feature. We raise the limit from the default of 128 now and lower it again at the end of the script.

# Increase open file limit
ulimit -n 1024
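
If you would rather not hard-code the default of 128, one option is to remember the current limit and restore exactly that value afterwards. This is a sketch of that idea, not what the script here does. It uses -S so that only the soft limit changes, and OLD_NOFILE is a variable name chosen for illustration.

```shell
# Remember the current soft limit so it can be restored exactly.
OLD_NOFILE=$(ulimit -S -n)

# Set only the soft limit; -S leaves the hard limit untouched.
ulimit -S -n 1024 || echo "Could not change open file limit" >&2

# ... the duplicity commands would run here ...

# Restore whatever the limit was before.
ulimit -S -n "$OLD_NOFILE"
```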

Most of these options are documented in the Duplicity man page, and there are many more to choose from. Basically this backup is going to do a full backup every 7 days (from the $FULL_IF_OLDER_THAN variable), and use encryption with the highest bzip2 compression, before sending it to S3. It will write a fresh backup log to the defined file, which we'll email out later.

# Backup to S3
/usr/local/bin/duplicity --s3-use-new-style --tempdir /home/dpbackup --full-if-older-than $FULL_IF_OLDER_THAN --encrypt-key "$GPG_PUB_KEY" --sign-key "$GPG_PUB_KEY" --gpg-options='--compress-algo=bzip2 --bzip2-compress-level=9' --include /etc/apache2 --include /home/dpbackup/svn --include /home/dpbackup/mysql --exclude '**' / s3+http://$S3_BUCKET > $BACKUP_LOG_FILE
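
As written, the script carries on even if this command fails, so a broken backup only shows up when someone reads the email. A sketch of a wrapper that records each step's exit status in the log follows. run_step is a hypothetical helper; it appends to BACKUP_LOG_FILE, which is expected to be set as in the variables section.

```shell
# Run a command, append its output to the log, and note whether it
# succeeded. Returns the command's own exit status on failure.
run_step() {
    label=$1
    shift
    if "$@" >> "$BACKUP_LOG_FILE" 2>&1; then
        echo "$label: OK" >> "$BACKUP_LOG_FILE"
    else
        status=$?
        echo "$label: FAILED (exit status $status)" >> "$BACKUP_LOG_FILE"
        return "$status"
    fi
}

# Example:
#   : > "$BACKUP_LOG_FILE"    # start a fresh log
#   run_step "Backup to S3" /usr/local/bin/duplicity ... || exit 1
```

Because the wrapper appends, the log has to be emptied once at the start of the run to keep the "fresh log" behaviour of the original redirect.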

This line just gives us some space in the log file; really it's just for email formatting.

# Separate the log file a bit
echo -e '\n\n==== REMOVE OLD BACKUP SETS ====\n\n' >> $BACKUP_LOG_FILE

This command checks how many full backup sets are already on S3 and removes any beyond the number defined in KEEP_MAX_SETS. Without the --force option Duplicity only lists the old sets instead of deleting them.

# Clean out backup sets older than variable sets
/usr/local/bin/duplicity remove-all-but-n-full $KEEP_MAX_SETS --force s3+http://$S3_BUCKET >> $BACKUP_LOG_FILE

Again, for formatting purposes.

# Separate the log file a bit
echo -e '\n\n==== CURRENT FILES IN BACKUP SET  ====\n\n' >> $BACKUP_LOG_FILE

This command lists out the current files in our backup set so they can be reviewed in the email, making sure everything is working as it should.

# List all files in backup set for verification
/usr/local/bin/duplicity list-current-files s3+http://$S3_BUCKET >> $BACKUP_LOG_FILE

Now we can mail out the log file. The -s flag is for the subject line, and the TO_EMAIL is defined in our variables. We're just writing the log file as the body of the email.

# Mail out log to sysadmins for verification
mail -s "$CURRENT_HOST Backup Log for $TIMESTAMP" $TO_EMAIL < $BACKUP_LOG_FILE

Since we exported the keys and passphrases, we want to make sure we don't leave those around any longer than we have to; set them null.

# Clear secret variables
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=
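
If an earlier step aborts the script, these lines are never reached. One way around that, sketched here as an assumption rather than as part of the original script, is to put the cleanup in a function and register it with trap near the top of the script:

```shell
# Clear the exported secrets; safe to call more than once.
cleanup() {
    unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE
}

# Run cleanup whenever the script exits, including after a failed step.
trap cleanup EXIT
```

The removal of the temporary dump files could be moved into the same function.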

Just a little clean up so we don't waste space.

# Remove old and temporary files
rm /home/dpbackup/mysql/*
rm /home/dpbackup/svn/*

This is for OpenBSD only. Since we opened the open file limit up at the beginning of the script, close it back down.

# Put open file limit back to default
ulimit -n 128

End it.

# Exit
exit 0

24Oct/090

Configuring Sendmail

Operating System: OpenBSD 4.4

Sendmail is configured and enabled by default in OpenBSD, but it only allows you to send mail out from the machine itself (on localhost, as it should). These steps will allow you to relay from the server and set relay restrictions.

As root, make a copy of the original localhost config file to one of your own.

cd /usr/share/sendmail/cf
cp openbsd-localhost.mc openbsd-myconfig.mc

Open the file you just created and comment out the line:

FEATURE(`accept_unresolvable_domains')dnl

by adding dnl and a space to the front so that it reads:

dnl FEATURE(`accept_unresolvable_domains')dnl

Then modify this line so that Sendmail will listen on all interfaces rather than just local:

DAEMON_OPTIONS(`Family=inet, address=127.0.0.1, Name=MTA')dnl

to read...

DAEMON_OPTIONS(`Family=inet, address=0.0.0.0, Name=MTA')dnl

Now compile the configuration that you created and make it the default Sendmail config:

m4 ../m4/cf.m4 openbsd-myconfig.mc > /etc/mail/sendmail.cf

Open /etc/mail/relay-domains and add IP addresses/ranges that are allowed to relay through the server. The format used is: 192.168.1 which is equivalent to 192.168.1.0/24. This will allow other hosts on your network to relay mail through this server.
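
To make the prefix notation concrete, here is a small illustration of how an entry such as 192.168.1 lines up with client addresses. It only mimics the octet-boundary matching described above and is not Sendmail's own code. matches_relay_prefix is a made-up name.

```shell
# Succeed if the IP equals the entry or starts with the entry followed
# by a dot, i.e. the match happens on a whole-octet boundary.
matches_relay_prefix() {
    ip=$1
    prefix=$2
    case "$ip" in
        "$prefix"|"$prefix".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   matches_relay_prefix 192.168.1.42 192.168.1 && echo "may relay"
```

Note that 192.168.10.5 does not match the entry 192.168.1, because the comparison stops at the dot.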

Modify /etc/rc.conf and replace:

sendmail_flags="-L sm-mta -C/etc/mail/localhost.cf -bd -q30m"

with...

sendmail_flags="-L sm-mta -C/etc/mail/sendmail.cf -bd -q2d"

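This tells Sendmail to use the new .cf file we compiled earlier. Note that the -q flag sets how often Sendmail processes the mail queue, not how long messages are kept: -q30m retries queued mail every 30 minutes, while -q2d retries it only once every 2 days. Keep -q30m if you want prompt retries. How long undeliverable mail stays in the queue is controlled separately by the Timeout.queuereturn option (confTO_QUEUERETURN in the .mc file).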

Do a clean reboot and make sure the correct configuration comes up. You can test access by using a server with the same subnet as in your "relay-domains" file and telnet-ing to port 25.

You can restart Sendmail quickly by killing the process first...

kill `head -n1 /var/run/sendmail.pid`

...and then restarting:

. /etc/rc.conf
/usr/sbin/sendmail $sendmail_flags

24Oct/090

Subversion – Installation, Configuration, and Use

Operating System: OpenBSD 4.4


Installation

First grab the necessary compiled packages from OpenBSD.

export PKG_PATH=ftp://carroll.cac.psu.edu/pub/OpenBSD/4.4/packages/amd64
pkg_add db-4.6.21.tgz neon-0.26.2.tgz

Then get the Apache source code for the HTTP server, configure and install. Use a 2.2.x version.

cd /usr/src
wget http://www.gtlib.gatech.edu/pub/apache/httpd/httpd-2.2.x.tar.gz
tar zxvf httpd-2.2.x.tar.gz
cd httpd-2.2.x
./configure --with-included-apr --with-berkeley-db=/usr/local --enable-shared=yes --enable-dav --enable-so --enable-rewrite --enable-ssl
make
make install

Next get the newest Subversion source code, configure and install.

cd /usr/src
wget subversion-1.5.x.tar.gz
tar zxvf subversion-1.5.x.tar.gz
cd subversion-1.5.x
./configure --with-apr=/usr/local/apache2/bin/apr-1-config --with-apxs=/usr/local/apache2/bin/apxs --with-neon=/usr/local
make
make install

Add the proper user to run the httpd daemon

useradd -u3690 -g=uid -c"Apache2" -d/var/empty -s/sbin/nologin _apache2

Configuration

Set up the initial repository with the svnadmin create command and make the user running the web service the owner, since that is the user that will actually modify the repository files.

mkdir /home/svn
svnadmin create /home/svn/myproject
chown -R _apache2:_apache2 /home/svn/

Now edit your main httpd.conf file in /usr/local/apache2/conf/ to include the changes below. They're not all in the same place, just scattered throughout the file. The first two changes should already be there after installing Subversion from source and only require slight modification. The last "Location" block you'll need to add manually. You'll see the dav_svn* files referenced in there; we'll get to those next.

LoadModule dav_svn_module     modules/mod_dav_svn.so
LoadModule authz_svn_module   modules/mod_authz_svn.so
...
User _apache2
Group _apache2
...
<Location /svn>
  DAV svn
  SVNListParentPath on
  SVNParentPath /home/svn
  AuthType Basic
  AuthName "Subversion Repository"
  AuthUserFile /etc/svn/dav_svn.passwd
  AuthzSVNAccessFile /etc/svn/dav_svn.control
  Require valid-user
</Location>

Now we can create the username/password files along with the access files.

mkdir /etc/svn
touch /etc/svn/dav_svn.passwd
htpasswd -mb /etc/svn/dav_svn.passwd myuser mypassword

Create the access file to your repositories.

touch /etc/svn/dav_svn.control

And now edit the file. You can grant users r (read) or rw (read/write) access rights. First you list the repository, and then the folder location after that for more fine-grained permissions.

[myproject:/]
myuser = r

[myproject:/trunk/base/code]
myuser = rw

Naturally you'll want to lock this service down with SSL and possibly make it available outside the network. To simply create a self-signed certificate and add it to Apache, do the following.

openssl genrsa -out /etc/ssl/private/svnserver.key 1024
openssl req -new -key /etc/ssl/private/svnserver.key -out /etc/ssl/private/svnserver.csr
openssl x509 -req -days 365 -in /etc/ssl/private/svnserver.csr -signkey /etc/ssl/private/svnserver.key -out /etc/ssl/svnserver.crt

Now add these lines to the httpd.conf file in /usr/local/apache2/conf/, just above the Location setting.

Listen 443
SSLEngine on
SSLCertificateFile    /etc/ssl/svnserver.crt
SSLCertificateKeyFile /etc/ssl/private/svnserver.key

Edit the rc.conf.local file in /etc/ to turn on Apache.

apache2=YES

And then edit the rc.local file to auto start Apache.

# Apache2 Startup
if [ X"${apache2}" == X"YES" -a -x /usr/local/apache2/bin/httpd ]; then
   /usr/local/apache2/bin/apachectl start &
   echo -n " apache2";
fi

As well as the shutdown file rc.shutdown to kill the process.

# Apache2 Shutdown
if [ X"${apache2}" == X"YES" -a -x /usr/local/apache2/bin/httpd ]; then
   /usr/local/apache2/bin/apachectl stop &
   echo -n " apache2";
fi

Now reboot the server and test access; it should start up automatically.


Maintenance and Use

The easiest way to use SVN over HTTPS is with TortoiseSVN on Windows, or another client such as RapidSVN on Linux.

Adding Additional Users

To add more users, just run the htpasswd command linked to your dav_svn.passwd file, same as the initial configuration for users.

htpasswd -mb /etc/svn/dav_svn.passwd newuser newpassword

And now edit the access file that lists the other users and is referenced in the Apache configuration. You can grant users r (read) or rw (read/write) access rights. First you list the repository, and then the folder location after that for more fine-grained permissions.

[myproject:/]
myuser = r
newuser = r

[myproject:/trunk/base/code]
myuser = rw
newuser = rw

Backing Up the Repositories

To back up a repository, use the svnadmin dump command, which exports the entire database and all revisions. You can then tar up and gzip the dump file for compression, and back it up to tape or disk somewhere else. Incremental backups can also be done if disk/tape space is an issue.

svnadmin dump /home/svn/myproject > /home/backups/myproject_dumpfile
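
The compressed and incremental variants mentioned above might look like the sketch below. The function names and the revision numbers in the example are placeholders; the -r and --incremental options are the ones svnadmin dump provides.

```shell
# Full dump, gzipped once the dump has completed (produces <out>.gz).
dump_full() {
    repo=$1
    out=$2
    svnadmin dump "$repo" > "$out" && gzip -f "$out"
}

# Incremental dump covering revisions <from> through <to> only.
dump_incremental() {
    repo=$1
    from=$2
    to=$3
    out=$4
    svnadmin dump "$repo" -r "$from:$to" --incremental > "$out" && gzip -f "$out"
}

# Example:
#   dump_full /home/svn/myproject /home/backups/myproject_dumpfile
#   dump_incremental /home/svn/myproject 101 150 /home/backups/myproject_101-150
```

When restoring, incremental dumps have to be loaded in revision order on top of the full dump they follow, for example with gzip -dc piped into svnadmin load.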

Restoring the Repositories

Restoring the SVN database is simply rewriting all the revisions from the dump back into a database. The restore process also works well for moving an older repository over to a new one since restoring the dump into a new SVN database will update it to that version.

svnadmin create /home/svn/restoredproject
svnadmin load /home/svn/restoredproject < /home/backups/myproject_dumpfile