Backup Tools

Posted: Jul 2, 2024 Updated: Jul 4, 2024

Tags: linux backup restic borg

Table of Contents

  1. Overview
  2. Features
  3. Installation
  4. Special Considerations for the Maintenance Host
  5. License

Overview

Backup Tools is a pair of scripts for creating and managing append-only backups made with Borg and Restic. The first script automatically backs up a specified set of files, with per-host exclusions and error reporting. The second periodically verifies and compacts the backups.

Why Append-Only Backups?

In short, so that a malicious person or group, like a ransomware crew, can’t easily delete or alter your backups. Append-only backups work by having the backup server disallow any operation that would delete or overwrite an existing backup snapshot. If these scripts and the systems running them are configured correctly, each backed-up system holds only keys that allow it to write new snapshots to the backup servers.

That’s the theory, anyway. Getting this right requires careful configuration of the backup servers and your client devices, as well as a healthy dose of good opsec, most of which is covered in the Special Considerations for the Maintenance Host section.

Why Two Backup Systems?

I use both Borg and Restic to guard against the possibility of a bug in my backup software rendering my backups useless. I’ve encountered a bunch of backup failures over the years. My first “real” job was helping an ISP recover a system that had been hacked, where the hacker tried to delete all of the files on the system. (They did the classic rm -rf / thing, which blew up halfway through when libc got deleted and the kernel needed to page something from it into memory.) The company had backups on a QIC tape, and we were able to restore some of the files that way, though there were many failures. Other files we found recent-ish copies of on the filesystem that hadn’t yet been deleted, and others we recreated from scratch.

In those days, the main risk of backup failure was the backup media itself: the software used (tar or cpio) was simple, and so was the data format. These days, the media (both tape and disk) is very reliable, but the data formats are very complex. So I consider backup software bugs to be at least as substantial a risk as media failure. Thus, two backup programs and two formats.

Features

Installation

To use the backup script as-is, you need two remote backup servers: one with Borg 1.x installed, and the other with Rclone and Restic. Both also need an SSH server. You can (and I do) use a third-party backup service for one or both of these, as long as it supports Borg over SSH or Restic/Rclone over SSH. Using Rclone with Restic is necessary to support append-only backups – Restic doesn’t support append-only backups natively. Borg does. I don’t describe detailed server setup here, mainly because there isn’t much to it. You just need to install the packages for Restic and Rclone on one server, and Borg on another. The package manager for most popular distros should have all three of these.

Download the backup scripts

Backuptools 0.11: backuptools-0.11.tar.gz

SHA-256: 9299e29f2573ac0ab4948d5c48865957f3819fae4512449c01aabffdc40ee238

SHA-512: 19a0602bd1402c9404b767febb7d48cf754ef0f08eae92b48e5a70b4e50beb6da362ca5f3aa54b789c152de29eda2ee399582f8a0d4974c855aca82f4a2a3257

GPG Signature: backuptools-0.11.tar.gz.sig (Key)
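
Before unpacking, it is worth checking the download against the hash published above. The following is a minimal sketch, assuming GNU coreutils' sha256sum is available; verify_sha256 is a hypothetical helper name, not something shipped in the tarball.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Succeeds only if FILE's SHA-256 digest equals EXPECTED_HEX.
# (Hypothetical helper; not part of backuptools.)
verify_sha256() (
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
)

# Only runs if the tarball is in the current directory.
if [ -f backuptools-0.11.tar.gz ]; then
    if verify_sha256 backuptools-0.11.tar.gz \
        9299e29f2573ac0ab4948d5c48865957f3819fae4512449c01aabffdc40ee238
    then
        echo "SHA-256 OK"
    else
        echo "SHA-256 MISMATCH, do not install" >&2
    fi
fi
```

Once you have imported the signing key, the detached signature can be checked with gpg --verify backuptools-0.11.tar.gz.sig backuptools-0.11.tar.gz.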

Install the backup scripts

Copy the files backup.sh and backup-settings to your servers. I’d recommend copying backup.sh to /usr/local/bin; backup-settings needs to be in /etc. Next, set permissions:

# chmod +x /usr/local/bin/backup.sh
# chown root:root /etc/backup-settings
# chmod 0600 /etc/backup-settings

SSH Keys

For each of your backed-up systems, you’ll also need an SSH key to authenticate your host to the backup servers. A unique key must be generated for and stored on each backed-up system. The private key should not be stored anywhere else.

Since the backup script needs to run with root privileges, this should be done as root:

# ssh-keygen -t ed25519 -f host1_ed25519 -N ""
Generating public/private ed25519 key pair.
Your identification has been saved in host1_ed25519
Your public key has been saved in host1_ed25519.pub
The key fingerprint is:
SHA256:DmFJHUZthmlOIo8Z7M/+/S4IKEiJ4qJv47fR07Wm0bY marcusb@dom
The key's randomart image is:
+--[ED25519 256]--+
|   .  .o+=       |
|    +..o* +      |
|. .. *+= o       |
|oo  +....        |
|+.   +. S .      |
|o.. ..++ o .     |
|o  ...o.+.=      |
|. o .....*..     |
| +oo.. .o E+o    |
+----[SHA256]-----+

Next, you’ll need to add your public key to the authorized_keys file on your backup servers. For the Borg server, the entry will look like this:

# host1 append-only key
restrict,command="borg serve --append-only --restrict-to-repository backup/path/host1" ssh-ed25519 AAAAXXXXXXXXXX.... root@host1

Special note if you are using rsync.net: they have Borg 1.x installed as ‘borg1’, so the command above needs to be ‘borg1 serve --append-only …’

The entry for the Restic server should look like this:

# host1 append-only key
restrict,command="rclone serve restic --stdio --append-only backup/path/host1" ssh-ed25519 AAAAXXXXXXXXX.... root@host1

For both entries, make sure the path is appropriate for your server.
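
If you have several hosts to enroll, generating these lines avoids copy-and-paste mistakes in the restriction command. The following is a rough sketch; borg_authkey_line and restic_authkey_line are hypothetical helper names, and the sketch assumes the repository path contains no spaces or double quotes.

```shell
# Hypothetical helpers: print the authorized_keys line for one host.
# $1 = repository path on the backup server
# $2 = contents of the host's .pub file
borg_authkey_line() {
    printf 'restrict,command="borg serve --append-only --restrict-to-repository %s" %s\n' "$1" "$2"
}

restic_authkey_line() {
    printf 'restrict,command="rclone serve restic --stdio --append-only %s" %s\n' "$1" "$2"
}

# Example with a placeholder key; in practice pass "$(cat host1_ed25519.pub)".
borg_authkey_line backup/path/host1 'ssh-ed25519 AAAAXXXXXXXXXX root@host1'
restic_authkey_line backup/path/host1 'ssh-ed25519 AAAAXXXXXXXXXX root@host1'
```

Review the generated lines before appending them to authorized_keys on each server.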

These entries are the crux of the append-only scheme; without the restriction commands, your backups will not be append-only. This also means that if there is any way to authenticate to the SSH server (or write to its filesystem) using other keys or methods, the backup scheme is potentially compromised. For my systems, the only way to access the backup servers remotely is via SSH (no other services listen on the network) and the only keys listed in the authorized_keys file are:

  1. A restricted key for each backed-up host (as listed above)
  2. An administrative key that only exists on my maintenance host
  3. My Yubikeys, which I have to physically touch in order to trigger an SSH authentication.

So, theoretically, the only way to overwrite the backups is with my administrative key or the Yubikeys. For my personal data, I consider that an acceptable risk. Please consider your risk profile carefully. It may well be appropriate to, for instance, only have the administrative key listed and do all maintenance from the maintenance host, including adding entries to the authorized_keys file and restoring from backup. Or, it might make sense to get an extra set of Yubikeys that you only connect when you are doing a restore.

Repo Initialization

Next, you’ll need to generate a random passphrase for each system and initialize the backup repositories. Like the SSH key, the passphrase should be unique for each system being backed up. It must be stored in only two places: 1) in the backup-settings file on the system being backed up and 2) in your password/secrets manager. It is important to not lose this passphrase. It is used to encrypt your backup data at rest; without it, you cannot restore from backup (or do anything else to your backup repositories.)
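
Your password manager can probably generate the passphrase for you. If you would rather do it on the command line, here is one possible sketch, assuming /dev/urandom and the coreutils base64 tool are available; gen_passphrase is a hypothetical name.

```shell
# gen_passphrase: print 32 random bytes from the kernel's random source,
# base64-encoded (44 characters). Hypothetical helper; any good generator works.
gen_passphrase() {
    head -c 32 /dev/urandom | base64
}

gen_passphrase
```

Run it once per backed-up system and store the result in your secrets manager right away.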

On your admin workstation, initialize the repositories:

Borg:

$ borg init --encryption=repokey user@borghost:/path/to/repo
Enter new passphrase:
Enter same passphrase again:
Do you want your passphrase to be displayed for verification? [yN]: n

By default repositories initialized with this version will produce security
errors if written to with an older version (up to and including Borg 1.0.8).

If you want to use these older versions, you can disable the check by running:
borg upgrade --disable-tam ssh://user@borghost/path/to/repo

See https://borgbackup.readthedocs.io/en/stable/changes.html#pre-1-0-9-manifest-spoofing-vulnerability for details about the security implications.

IMPORTANT: you will need both KEY AND PASSPHRASE to access this repo!
If you used a repokey mode, the key is stored in the repo, but you should back it up separately.
Use "borg key export" to export the key, optionally in printable format.
Write down the passphrase. Store both at safe place(s).

Go ahead and export your repo key and store it in a safe place (your password manager):

$ borg key export user@borghost:/path/to/repo
BORG_KEY blahblahblah.....................

Restic:

$ restic -r 'sftp:user@restic-host:/path/to/repo' init
enter password for new repository:
enter password again:
created restic repository e5ff21aecd at sftp:user@restic-host:/path/to/repo

Please note that knowledge of your password is required to access
the repository. Losing your password means that your data is
irrecoverably lost.

Configure your Borg Repo location in the /etc/backup-settings file

With Borg, you’ll need to configure the path to your backup repo in the /etc/backup-settings file. The host should be ‘borg-backup’, but after the hostname, you’ll need to enter the path. For instance, if you initialized your repo in /backups/host1, set the BORG_REPO variable to ‘ssh://borg-backup/backups/host1’.

This is not necessary for your Restic repo.

Configure your SSH Host aliases

The backup scripts expect the repositories to live on ‘borg-backup’ and ‘restic-backup’. Edit your /root/.ssh/config file and add aliases to your real hosts, setting the hostnames, usernames, and the path to the SSH private key you generated earlier.

Host borg-backup
     Hostname BorgBackup.example.com
     User borg_backup_user
     IdentityFile ~/.ssh/host1_ed25519
Host restic-backup
     Hostname ResticBackup.example.com
     User restic_backup_user
     IdentityFile ~/.ssh/host1_ed25519

Configure your backup exclusions

By default, the script backs up everything in /etc, /home, /var, and /root. You can edit the borg and restic invocations in the script if you want to back up additional locations on disk. If there are things you don’t want to back up, list them in /etc/backup-exclusions; otherwise, just create an empty file at /etc/backup-exclusions. There is an example file included that shows the syntax.

Run your first backup

Edit /etc/backup-settings and set SHOW_OUTPUT to ‘true’. Next, run /usr/local/bin/backup.sh. This should start your first backup. You can watch the output to make sure your backup is successful, although depending on the amount of data you are backing up and the available bandwidth to your backup hosts, this could take some time. The output will look something like this:

# /usr/local/bin/backup.sh
Tue Jul 2 08:58:57 CDT 2024 Starting Borg backup
Tue Jul 2 08:58:57 CDT 2024
Creating archive at "ssh://borg-backup/./borg/terra::terra-2024-07-02T08:58:57"
Synchronizing chunks cache...
M /etc/backup-settings~
M /etc/backup-settings
A /var/log/wtmp
M /var/log/lastlog
------------------------------------------------------------------------------
Repository: ssh://borg-backup/./borg/terra
Archive name: terra-2024-07-02T08:58:57
Archive fingerprint: 9ebe52e9619d12cd7871b656053234e933b392d0d5f973d6f96f25e780cd43e3
Time (start): Tue, 2024-07-02 08:58:59
Time (end): Tue, 2024-07-02 08:59:01
Duration: 2.41 seconds
Number of files: 844
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               18.09 MB             11.44 MB             10.19 kB
All archives:              629.25 MB            401.03 MB             24.15 MB

                       Unique chunks         Total chunks
Chunk index:                    1590                29079
------------------------------------------------------------------------------
terminating with success status, rc 0
Tue Jul 2 08:59:03 CDT 2024 Starting Restic backup
Tue Jul 2 08:59:03 CDT 2024
Tue Jul 2 08:59:05 CDT 2024
open repository
lock repository
using parent snapshot acbc9ce8
load index files
start scan on [/etc /home /root /var]
start backup on [/etc /home /root /var]
scan finished in 0.272s: 844 files, 17.254 MiB

Files: 0 new, 4 changed, 840 unmodified
Dirs: 0 new, 6 changed, 243 unmodified
Data Blobs: 3 new
Tree Blobs: 6 new
Added to the repository: 481.852 KiB (17.685 KiB stored)

processed 844 files, 17.254 MiB in 0:00
snapshot cb278fa1 saved

After you are sure the backup script is working, set SHOW_OUTPUT to false. With that setting, the script stays silent unless an error is detected, in which case the output for the entire backup session is written to stdout.
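
The "silent unless something failed" behavior is a common shell pattern. The sketch below illustrates the general idea only and is not taken from backup.sh; run_quiet is a hypothetical name.

```shell
# run_quiet CMD [ARGS...]
# Runs CMD with stdout and stderr captured in a temporary file. Prints the
# captured output only if CMD fails, and returns CMD's exit status.
run_quiet() (
    log=$(mktemp) || exit 1
    "$@" >"$log" 2>&1
    status=$?
    if [ "$status" -ne 0 ]; then
        cat "$log"
    fi
    rm -f "$log"
    exit "$status"
)

# A successful command produces no output at all.
run_quiet true

# A failing command has its captured output replayed.
run_quiet sh -c 'echo "backup failed"; exit 2' || echo "exit status: $?"
```

Because nothing is printed on success, a scheduler that reports only when a job produces output stays quiet until there is a problem.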

Schedule your backups automatically

The only thing left to do is set your cron job or systemd timer to run the backups automatically. Cron has the advantage of sending emails when a job generates output (which our script will only do when an error occurs.)

Cron:

# cd /etc/cron.hourly
# ln -sf /usr/local/bin/backup.sh 1backup

It is a good idea to let the first scheduled backup run with SHOW_OUTPUT set to true, so you can be sure your cron email notification is working correctly.

If you want to go the systemd route, the timer would look something like this:

[Unit]
Description=Timer for backup script

[Timer]
OnCalendar=hourly
Persistent=true
Unit=backup.service

[Install]
WantedBy=timers.target

and your service:

[Unit]
Description=Service for backup script

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
User=root

Both files should be in /etc/systemd/system/

Enable with:

# systemctl enable --now backup.timer

While this works well, you’ll have to add email reporting into the backup script if you want to be notified of any failures.
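
One way to approximate cron's behavior under systemd is a small wrapper that mails whatever the script printed. This is only a sketch, assuming msmtp is installed and configured; failure_mail is a hypothetical helper and admin@example.com is a placeholder address.

```shell
# failure_mail TO SUBJECT
# Reads a report body on stdin and prints a minimal mail message
# (headers, blank line, body) on stdout.
failure_mail() {
    printf 'To: %s\nSubject: %s\n\n' "$1" "$2"
    cat
}

# Hypothetical wrapper usage, mailing only when the script printed something:
#   out=$(/usr/local/bin/backup.sh 2>&1)
#   [ -n "$out" ] && printf '%s\n' "$out" |
#       failure_mail admin@example.com "backup output from $(hostname)" |
#       msmtp admin@example.com

# Demonstration of the message format only; nothing is sent.
printf 'example report\n' | failure_mail admin@example.com 'backup failed'
```

You would then point ExecStart at the wrapper instead of calling backup.sh directly.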

Special Considerations for the Maintenance Host

In addition to the backup script we configured above, there is a second script: backup-maintenance.sh, which performs several important maintenance and validation tasks:

  1. Deleting old snapshots so the backups don’t grow uncontrollably
  2. Verifying the integrity of the backups
  3. Reporting if one or more configured systems haven’t uploaded a snapshot recently (by default, 1 day, although this can be configured to a higher value for systems that are not always online, like laptops.)
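
The third check comes down to comparing the age of the newest snapshot against a per-host limit. Here is a minimal sketch of that comparison, assuming the snapshot time is already available in epoch seconds (extracting it from Borg or Restic is not shown); is_stale is a hypothetical name, not the function used in backup-maintenance.sh.

```shell
# is_stale LAST_EPOCH NOW_EPOCH MAX_DAYS
# Succeeds (exit 0) if the newest snapshot is more than MAX_DAYS days old.
is_stale() {
    [ $(( $2 - $1 )) -gt $(( $3 * 86400 )) ]
}

now=$(date +%s)
two_days_ago=$(( now - 2 * 86400 ))

# A two-day-old snapshot is stale with the default 1-day limit...
if is_stale "$two_days_ago" "$now" 1; then
    echo "host1: no new snapshot within 1 day"
fi

# ...but acceptable for a laptop that is allowed 7 days.
if is_stale "$two_days_ago" "$now" 7; then
    echo "mylaptop: no new snapshot within 7 days"
fi
```

Only the first message is printed, since two days exceeds a 1-day limit but not a 7-day one.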

Unfortunately, doing these things requires:

  1. A copy of the repo encryption passwords, which we do not want to store on the remote backup servers
  2. Authentication keys that allow us to delete snapshots – which we do not want on any of the systems being backed up, since that would eliminate the append-only nature of our backups.

Ultimately, that means you’ll need to either 1) install this script on one of your computers and invoke it manually, authenticating to your backup servers with a removable key to guard against your backups being deleted by an adversary, or 2) set up a very restricted dedicated host to run the script automatically. For many reasons, I recommend option 2. I’m using a Raspberry Pi 3 [1] for this, and for my relatively small backup sets (a half dozen machines and around 10 TB total data backed up) its performance is just barely acceptable: the script takes around three hours to run. Whatever hardware you choose, it is very important that an attacker not be able to compromise this machine, since that would defeat the append-only nature of your backups. At a bare minimum, I’d recommend:

  1. Disable all daemons that listen on the network. For a default install, this is probably sshd and systemd-resolved, but run netstat -pan | grep LISTEN to see if there are any others on your system.
  2. Delete all default interactive users and create your own account with a good password.
  3. Install msmtp (for email reports), restic, borg, and, if it isn’t already installed, crond.
  4. Create an ssh key for this host and place it in your authorized_keys file on your backup servers. The private key shouldn’t be copied anywhere else. Storing it on a TPM wouldn’t be a bad idea.

To install the script, copy backup-maintenance.sh and backup-maintenance-settings from the backuptools distribution to your maintenance host. This script does not need to run as root. The backup-maintenance-settings file should be installed in the home directory of whichever user is going to run the script.

You’ll also need to install msmtp, borg, and restic for the script to operate.

After that, add your backed up systems to ~/backup-maintenance-settings:

# The format is add 'hostname' 'borg repo location' 'borg pass' 'restic repo' 'restic pass' [days]
add 'host1' 'user@borghost:repo/path/host1' 'borg-password' \
    'user@restic-host:repo/path/host1' 'restic-pass'

# If you have a system that is only sporadically online, you can set a longer
# time before the script will complain about not finding new backups, with the
# sixth parameter being in days:
add 'mylaptop' 'user@borghost:repo/path/mylaptop' 'borg-password' \
    'user@restic-host:repo/path/mylaptop' 'restic-pass' 7
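
To make the optional sixth parameter concrete, here is a hypothetical sketch of how an add function could record each entry and default the day limit to 1. It is an illustration only; the function in the shipped backup-maintenance.sh may work differently.

```shell
# Hypothetical sketch of an add function; the shipped script may differ.
# Each call appends one tab-separated record to $HOSTS. The optional sixth
# argument (days allowed without a new snapshot) defaults to 1.
HOSTS=''
TAB=$(printf '\t')
NL='
'
add() {
    if [ "$#" -lt 5 ]; then
        echo "add: expected at least 5 arguments, got $#" >&2
        return 1
    fi
    HOSTS="${HOSTS}${1}${TAB}${2}${TAB}${3}${TAB}${4}${TAB}${5}${TAB}${6:-1}${NL}"
}

add 'host1' 'user@borghost:repo/path/host1' 'borg-password' \
    'user@restic-host:repo/path/host1' 'restic-pass'
add 'mylaptop' 'user@borghost:repo/path/mylaptop' 'borg-password' \
    'user@restic-host:repo/path/mylaptop' 'restic-pass' 7

# Print each hostname with its day limit (fields 1 and 6).
printf '%s' "$HOSTS" | cut -f1,6
```

Here host1 ends up with the default limit of 1 day and mylaptop with 7.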

From there, set a cron job to run the script once per day. Like the backup script itself, if SHOW_OUTPUT is set to false, it will only display output if an error is detected. Unlike the backup script, if an error is detected, it will show the output related only to that host. If you run this from cron and set the MAILTO variable correctly in your crontab, you should get an email if:

  1. There is an error deleting old backup snapshots.
  2. There is an error detected in the repository (at least, one that Borg or Restic can detect).
  3. A backed-up system doesn’t upload at least one new snapshot every day (unless you override this timer).

License

Backup Tools is released under the MIT License


  1. Be sure to disable your swapfile if you use an older Raspberry Pi. Borg, in particular, will gladly go into swap when validating a larger repo, and your Pi will more or less lock up when running the maintenance script.