Duplicity Install and Backup Samples
Duplicity is a backup tool that uses the rsync algorithm, through librsync, to copy only the changed parts of files to a backup location. It can compress and encrypt the data and can also save to Amazon's S3 service. More details can be found here.
- Installation on OpenBSD 4.4
- Installation on OpenBSD 4.6
- Installation on Debian Lenny 5.0
- Sample Backup Scripts
Installation on OpenBSD 4.4
The 4.4 version was the most difficult to get working; most of the issues came from the libraries that ship with OpenBSD. Even the Duplicity port installed from packages did not work properly.
First we need to add a few packages. You can use the pkg_add command with whatever mirror you like to obtain the following; some depend on others, so there will be more packages in the final install list:
- python-2.5.2p4
- py-boto-1.3
- gpgme-1.1.5
- librsync-0.9.7
- ncftp-3.2.1
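The list above can be installed in one pass. This is only a sketch: install_packages and the PKG_ADD override are names made up for illustration, and it assumes PKG_PATH already points at a package mirror for your release and architecture.

```shell
# install_packages: hypothetical helper that installs the packages listed
# above, stopping at the first failure.
# Assumption: PKG_PATH is already set to a mirror for this release.
# Set PKG_ADD=echo to print the package names instead of installing them.
install_packages() {
    for pkg in python-2.5.2p4 py-boto-1.3 gpgme-1.1.5 librsync-0.9.7 ncftp-3.2.1; do
        ${PKG_ADD:-pkg_add} "$pkg" || return 1
    done
}

# Usage (as root):
#   install_packages
```

pkg_add pulls in dependencies on its own, so the extra packages mentioned above will show up without being named here.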
When the main Python package is installed, it will ask you to create a few symbolic links, so create those.
ln -sf /usr/local/bin/pydoc2.5 /usr/local/bin/pydoc
Version 4.4 needs a separate Python XML package to work properly. Without it, you'll get a series of errors when trying to send data to S3; I believe the XML error occurs when it tries to read the response. The traceback looks something like this:
File "/usr/local/bin/duplicity", line 482, in <module>
with_tempdir(main)
File "/usr/local/bin/duplicity", line 477, in with_tempdir
fn()
File "/usr/local/bin/duplicity", line 468, in main
full_backup(col_stats)
File "/usr/local/bin/duplicity", line 174, in full_backup
col_stats.set_values(sig_chain_warning = None).cleanup_signatures()
File "/usr/obj/ports/duplicity-0.4.12/fake-amd64/usr/local/lib/python2.5/site-packages/duplicity/collections.py", line 476, in set_values
File "/usr/obj/ports/duplicity-0.4.12/fake-amd64/usr/local/lib/python2.5/site-packages/duplicity/backends.py", line 802, in list
File "/usr/local/lib/python2.5/site-packages/boto/s3/bucketlistresultset.py", line 31, in bucket_lister
delimiter=delimiter)
File "/usr/local/lib/python2.5/site-packages/boto/s3/bucket.py", line 205, in get_all_keys
xml.sax.parseString(body, h)
File "/usr/local/lib/python2.5/xml/sax/__init__.py", line 43, in parseString
parser = make_parser()
File "/usr/local/lib/python2.5/xml/sax/__init__.py", line 93, in make_parser
raise SAXReaderNotAvailable("No parsers found", None)
xml.sax._exceptions.SAXReaderNotAvailable: No parsers found
To avoid that, a separate Python XML package needs to be downloaded and installed:
wget http://downloads.sourceforge.net/project/pyxml/pyxml/0.8.4/PyXML-0.8.4.tar.gz
tar zxvf PyXML-0.8.4.tar.gz
cd PyXML-0.8.4
python setup.py install
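Before moving on, it may be worth confirming that Python can now find a SAX parser, since that is exactly what the traceback above complains about. A minimal sketch, assuming the interpreter is on the PATH as python (has_sax_parser is a name made up for this example):

```shell
# has_sax_parser [INTERPRETER]: hypothetical helper; succeeds if the given
# Python interpreter can construct a SAX parser, which is the call that
# raised SAXReaderNotAvailable in the traceback above.
has_sax_parser() {
    "${1:-python}" -c 'import xml.sax; xml.sax.make_parser()' >/dev/null 2>&1
}

# Report the result for the default interpreter.
if has_sax_parser python; then
    echo "SAX parser available"
else
    echo "no SAX parser found; S3 listings will likely fail as shown above"
fi
```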
Now we can install Duplicity.
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py --librsync-dir=/usr/local build
python setup.py install --prefix=/usr/local
If you run the Duplicity jobs as root from cron, something about OpenBSD (presumably a security restriction) causes them to fail. I got the output below in my log only when it ran as a cron job:
File "/usr/local/bin/duplicity", line 583, in <module>
with_tempdir(main)
File "/usr/local/bin/duplicity", line 577, in with_tempdir
fn()
File "/usr/local/bin/duplicity", line 558, in main
full_backup(col_stats)
File "/usr/local/bin/duplicity", line 234, in full_backup
bytes_written = write_multivol("full", tarblock_iter, globals.backend)
File "/usr/local/bin/duplicity", line 148, in write_multivol
globals.gpg_profile, globals.volsize)
File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 240, in GPGWriteFile
bytes_to_go = data_size - get_current_size()
File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 232, in get_current_size
return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory:'/tmp/duplicity-gM4CN9-tempdir/mktemp-iZknw0-2'
Odd that it can't read the temporary folder that it created. Changing the folder location did not help either. The solution is to create a separate user just for backups. This can be an issue if you need to back up files that are not readable by all users, but in my case it worked for the specific files that needed to be saved.
usermod -G nogroup dpbackup
mkdir /home/dpbackup/log
Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.
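The sshd_config change can be scripted so that it is safe to run more than once. This is a sketch only: deny_ssh_user is a hypothetical helper, and it appends a new DenyUsers line rather than editing an existing one, so you may prefer to add the name by hand if you already keep such a line.

```shell
# deny_ssh_user FILE USER: hypothetical helper that appends
# "DenyUsers USER" to FILE unless USER already appears on a DenyUsers line.
deny_ssh_user() {
    conf=$1
    user=$2
    # Match USER as a whole word anywhere on an existing DenyUsers line.
    if grep -Eq "^[[:space:]]*DenyUsers[[:space:]](.*[[:space:]])?$user([[:space:]]|\$)" "$conf"; then
        return 0
    fi
    printf 'DenyUsers %s\n' "$user" >> "$conf"
}

# Usage (as root; sshd has to re-read its configuration afterwards):
#   deny_ssh_user /etc/ssh/sshd_config dpbackup
```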
Now su to this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else who may have access to our S3 account (such as Amazon employees) can read the data.
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created
$ gpg --gen-key
There will be a series of questions; most of the defaults are fine.
- Choose option 1 for DSA and Elgamal (the default)
- Choose the default key size of 2048
- Leave the default that the key will not expire, option 0
- Enter a User ID, Email address, and comment for the key.
- Type O for OK to accept.
- Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.
Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.
$ chmod 600 gpg_keys.tar
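One way to produce a gpg_keys.tar like the one above is to export both halves of the key and bundle them. This is a sketch under assumptions: export_backup_keys and the public.asc and secret.asc file names are made up for this example, and the key ID is the sample value used in the backup script further down.

```shell
# export_backup_keys KEYID DIR: hypothetical helper. Exports the public and
# secret key for KEYID into DIR and bundles them as DIR/gpg_keys.tar.
# The body runs in a subshell so the umask and cd do not leak out.
export_backup_keys() (
    keyid=$1
    dir=$2
    umask 077                      # new files readable by the owner only
    cd "$dir" || exit 1
    gpg --armor --export "$keyid" > public.asc || exit 1
    gpg --armor --export-secret-keys "$keyid" > secret.asc || exit 1
    tar cf gpg_keys.tar public.asc secret.asc || exit 1
    chmod 600 gpg_keys.tar
    rm -f public.asc secret.asc    # keep only the bundle
)

# Usage:
#   export_backup_keys AAEE66BB "$HOME"
```

Copy the resulting tar file off the machine; anyone holding it plus the passphrase can decrypt the backups.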
See sample scripts below for backup jobs.
Installation on OpenBSD 4.6
First we need to add a few packages. You can use the pkg_add command with whatever mirror you like to obtain the following; some depend on others, so there will be more packages in the final install list:
- python-2.5.4p1
- py-xml-0.8.4p8
- py-boto-1.7a
- gpgme-1.1.5p0
- librsync-0.9.7p0
- ncftp-3.2.2
When the main Python package is installed, it will ask you to create a few symbolic links, so create those.
ln -sf /usr/local/bin/python2.5-config /usr/local/bin/python-config
ln -sf /usr/local/bin/pydoc2.5 /usr/local/bin/pydoc
Now we can install Duplicity.
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py --librsync-dir=/usr/local build
python setup.py install --prefix=/usr/local
If you run the Duplicity jobs as root from cron, something about OpenBSD (presumably a security restriction) causes them to fail. I got the output below in my log only when it ran as a cron job:
File "/usr/local/bin/duplicity", line 583, in <module>
with_tempdir(main)
File "/usr/local/bin/duplicity", line 577, in with_tempdir
fn()
File "/usr/local/bin/duplicity", line 558, in main
full_backup(col_stats)
File "/usr/local/bin/duplicity", line 234, in full_backup
bytes_written = write_multivol("full", tarblock_iter, globals.backend)
File "/usr/local/bin/duplicity", line 148, in write_multivol
globals.gpg_profile, globals.volsize)
File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 240, in GPGWriteFile
bytes_to_go = data_size - get_current_size()
File "/usr/local/lib/python2.5/site-packages/duplicity/gpg.py", line 232, in get_current_size
return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory:'/tmp/duplicity-gM4CN9-tempdir/mktemp-iZknw0-2'
Odd that it can't read the temporary folder that it created. Changing the folder location did not help either. The solution is to create a separate user just for backups. This can be an issue if you need to back up files that are not readable by all users, but in my case it worked for the specific files that needed to be saved.
usermod -G nogroup dpbackup
mkdir /home/dpbackup/log
Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.
Now su to this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else who may have access to our S3 account (such as Amazon employees) can read the data.
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created
$ gpg --gen-key
There will be a series of questions; most of the defaults are fine.
- Choose option 1 for DSA and Elgamal (the default)
- Choose the default key size of 2048
- Leave the default that the key will not expire, option 0
- Enter a User ID, Email address, and comment for the key.
- Type O for OK to accept.
- Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.
Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.
$ chmod 600 gpg_keys.tar
See sample scripts below for backup jobs.
Installation on Debian Lenny 5.0
The Debian install is a little simpler, and the backup job can run as root from cron. Install the prerequisite packages first (a reader lists the ones needed on Lenny in the comments at the end of this post), then install Duplicity:
wget http://code.launchpad.net/duplicity/0.6-series/0.6.06/+download/duplicity-0.6.06.tar.gz
tar zxvf duplicity-0.6.06.tar.gz
cd duplicity-0.6.06
python setup.py build
python setup.py install
Creating a separate user is optional, but it is good security practice not to run the backups as root.
mkdir /home/dpbackup/log
Make sure to add the new user to the deny list in SSH with DenyUsers dpbackup in the file /etc/ssh/sshd_config; there isn't any reason for it to log in.
Now su to this new user. A GPG key needs to be created so that the compressed backups can be encrypted and signed. This way no one else who may have access to our S3 account (such as Amazon employees) can read the data.
$ cd
$ gpg --list-keys
gpg: directory `/root/.gnupg' created
gpg: new configuration file `/root/.gnupg/gpg.conf' created
gpg: WARNING: options in `/root/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/root/.gnupg/pubring.gpg' created
gpg: /root/.gnupg/trustdb.gpg: trustdb created
$ gpg --gen-key
There will be a series of questions; most of the defaults are fine.
- Choose option 1 for DSA and Elgamal (the default)
- Choose the default key size of 2048
- Leave the default that the key will not expire, option 0
- Enter a User ID, Email address, and comment for the key.
- Type O for OK to accept.
- Enter a long passphrase for the key and allow it to be generated. I usually do at least 20 characters since the password will just sit in a script anyway.
Move the keys to some other safe place so that they can't be lost. No key means the backups are worthless. Typically a second backup source is a good idea.
$ chmod 600 gpg_keys.tar
See sample scripts below for backup jobs.
Sample Backup Scripts
The first portion of the script defines the variables we'll need. The AWS keys are issued to you when you sign up for S3. PASSPHRASE is the GPG passphrase set on the key generated with gpg --gen-key. S3 bucket names have to be unique across all of S3, so I use the host name of the server. The others are fairly self-explanatory and are covered as they come up below.
# Variables
export AWS_ACCESS_KEY_ID=ABABAB3333338888WWWW
export AWS_SECRET_ACCESS_KEY=BBBBBBBBBBTTTTTTTTTT8888888888VVVVVVVVVV
export PASSPHRASE=somelongpassphrase
DBHOST='dbserver1'
TIMESTAMP=`date +%m%d%Y%H%M`
FILE_PREFIX_DB='mydb_'
FILE_PREFIX_SVN_REPO='repo_'
GPG_PUB_KEY='AAEE66BB'
BACKUP_LOG_FILE='/home/dpbackup/log/s3_backup.log'
FULL_IF_OLDER_THAN='7D'
KEEP_MAX_SETS='2'
S3_BUCKET='serverhostname'
CURRENT_HOST='server-hostname'
TO_EMAIL='sysadmin@example.com'
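Since everything after this point depends on these settings, a small guard can stop the script early when one of them is empty. A minimal sketch; require_vars is a hypothetical helper, not something Duplicity provides.

```shell
# require_vars NAME...: hypothetical helper; fails and names the first
# variable in the list that is unset or empty.
require_vars() {
    for name in "$@"; do
        eval "value=\${$name-}"    # look up the variable called $name
        if [ -z "$value" ]; then
            echo "missing required setting: $name" >&2
            return 1
        fi
    done
}

# Usage, right after the variable block:
#   require_vars AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE \
#       GPG_PUB_KEY S3_BUCKET BACKUP_LOG_FILE || exit 1
```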
Sample dump commands for MySQL or Subversion go here if needed; the Subversion one looks like this:
/usr/local/bin/svnadmin dump /home/svn/repo > /home/dpbackup/svn/$FILE_PREFIX_SVN_REPO$TIMESTAMP.svnbk
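For the MySQL side, a dump in the same naming style might look like the sketch below. The assumptions are mine: the database name mydb is a placeholder, credentials are expected to come from ~/.my.cnf rather than the command line, and dump_file and dump_mysql are hypothetical helper names.

```shell
# dump_file DIR PREFIX TIMESTAMP EXT: builds a dump file name in the same
# PREFIX + TIMESTAMP style as the Subversion line above.
dump_file() {
    printf '%s/%s%s.%s\n' "$1" "$2" "$3" "$4"
}

# dump_mysql: hypothetical wrapper around mysqldump.
# Assumptions: "mydb" is a placeholder database name, and the login comes
# from ~/.my.cnf so no password appears in the script or process list.
dump_mysql() {
    mysqldump -h "$DBHOST" mydb \
        > "$(dump_file /home/dpbackup/mysql "$FILE_PREFIX_DB" "$TIMESTAMP" sql)"
}

# Usage, next to the svnadmin line:
#   dump_mysql
```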
This is only necessary on OpenBSD, which limits the number of open files per process as a security feature. We raise the limit from 128 now and lower it again later.
ulimit -n 1024
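If you would rather not hardcode 128, the script can remember whatever the limit was and put it back afterwards. This is a sketch of that variation, not what the script in this post does; it only touches the soft limit (-S) so that restoring the old value is always permitted, and old_nofile is a name chosen for this example.

```shell
# Remember the current soft limit on open files, raise it for the backup,
# and restore the saved value afterwards. 1024 is the value used above.
old_nofile=$(ulimit -S -n)
ulimit -S -n 1024 2>/dev/null || echo "could not raise open-file limit" >&2

# ... the duplicity commands would run here ...

# Put the soft limit back to what it was before the backup.
ulimit -S -n "$old_nofile"
```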
Most of these options are described in the Duplicity man page, and there are many more to choose from. Basically this job does a full backup every 7 days (from the $FULL_IF_OLDER_THAN variable) and incremental backups in between, encrypting the data and using the highest bzip2 compression level before sending it to S3. It writes a fresh backup log to the defined file, which we'll email out later.
/usr/local/bin/duplicity --s3-use-new-style --tempdir /home/dpbackup \
  --full-if-older-than "$FULL_IF_OLDER_THAN" \
  --encrypt-key "$GPG_PUB_KEY" --sign-key "$GPG_PUB_KEY" \
  --gpg-options='--compress-algo=bzip2 --bzip2-compress-level=9' \
  --include /etc/apache2 --include /home/dpbackup/svn --include /home/dpbackup/mysql \
  --exclude '**' / "s3+http://$S3_BUCKET" > "$BACKUP_LOG_FILE"
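The include and exclude part of that command is order-sensitive: as I read the man page, the first matching selection option decides whether a file is backed up, so the --include options have to come before the catch-all --exclude '**'. The sketch below shows only that selection logic; backup_paths and the DUPLICITY override are hypothetical, and the encryption and S3 options from the command above would still need to be added alongside.

```shell
# backup_paths URL PATH...: hypothetical wrapper that turns a list of paths
# into --include options followed by the catch-all exclude, backing up
# from / to URL. Set DUPLICITY=echo to print the command instead of
# running it.
backup_paths() {
    url=$1
    shift
    n=$#
    for path in "$@"; do
        set -- "$@" --include "$path"   # append the options in order
    done
    shift "$n"                          # drop the original bare paths
    ${DUPLICITY:-/usr/local/bin/duplicity} "$@" --exclude '**' / "$url"
}

# Dry run with two of the directories used above:
DUPLICITY=echo backup_paths "s3+http://serverhostname" /etc/apache2 /home/dpbackup/svn
```

Adding another directory to the backup then means adding one argument instead of editing the middle of a long command line.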
This line just gives us some space in the log file; really it's just for email formatting.
echo -e '\n\n==== REMOVE OLD BACKUP SETS ====\n\n' >> $BACKUP_LOG_FILE
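This command checks how many full backup sets are already on S3 and removes any beyond the number defined in KEEP_MAX_SETS. According to the Duplicity man page, the command only lists the sets it would delete unless --force is given, so it is included here.
/usr/local/bin/duplicity remove-all-but-n-full $KEEP_MAX_SETS --force s3+http://$S3_BUCKET >> $BACKUP_LOG_FILE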
Again, for formatting purposes.
echo -e '\n\n==== CURRENT FILES IN BACKUP SET ====\n\n' >> $BACKUP_LOG_FILE
This command lists the current files in our backup set so they can be reviewed in the email, making sure everything is working as it should.
/usr/local/bin/duplicity list-current-files s3+http://$S3_BUCKET >> $BACKUP_LOG_FILE
Now we can mail out the log file. The -s flag is for the subject line, and the TO_EMAIL is defined in our variables. We're just writing the log file as the body of the email.
mail -s "$CURRENT_HOST Backup Log for $TIMESTAMP" $TO_EMAIL < $BACKUP_LOG_FILE
Since we exported the keys and passphrase, we don't want to leave them around any longer than we have to, so set them to empty values.
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=
Just a little clean up so we don't waste space.
rm /home/dpbackup/mysql/*
rm /home/dpbackup/svn/*
This is for OpenBSD only. Since we raised the open file limit at the beginning of the script, lower it again.
ulimit -n 128
End it.
exit 0






November 16th, 2010 - 17:25
Thanks for the post. It helped a lot!
FYI, I had to install some build packages in order to install Duplicity for Debian Lenny.
apt-get install build-essential
apt-get install librsync1 librsync-dev python-gnupginterface ncftp python-pexpect python-dev
November 24th, 2010 - 10:21
Much appreciated, added them to the list!