How to create incremental backups using rsync on Linux - LinuxConfig.org

In previous articles, we already talked about how we can perform local and remote backups using rsync and how to set up the rsync daemon.


This is a companion discussion topic for the original entry at https://linuxconfig.org/how-to-create-incremental-backups-using-rsync-on-linux

This is such a great article. It shows how simply rsync can be used for backups. I have a couple of questions.

  1. I want to run it so it backs up my complete file system starting with ‘/’, including all users and root (and maybe excluding tmp, cache, etc.). What is the right strategy? Should I run it with sudo, or is there some other approach?
  2. Will all files carry over the same perms, u,g,o, acl and timestamp?
  3. Does --delete remove the files from the older ‘snapshots’ as well? Basically, I would want deleted files to be kept if I go back to an older snapshot.

Thanks

Hi Silocoder,

Welcome to our forums.

1.) Your backup plan depends on your needs and your recovery plan. Let’s say the original machine gets an HDD error and the data on the disk cannot be recovered, so you take your latest backup for recovery, and… would you like to restore the whole system, or reinstall a clean OS and restore user data only?
I can give you a personal example: I have a few PostgreSQL databases, small ones, but the data is valuable. So I make regular backups and rsync them to remote locations; the environment itself isn’t that hard to re-create, so my backups are small and I can restore them anywhere. I don’t need a full backup of the entire filesystem, only this small portion. This is just one use case; it entirely depends on your needs.
2.) Yes, file permissions and timestamps are carried over.
3.) If you would like to keep old “snapshots”, you can always back up to another destination - for example, a separate directory created for every backup, perhaps named after the time of the backup.

Hi Silcoder,
I’m really glad you found the article useful. About your questions:

  1. If you want to back up the whole system using rsync, you need to run the program with root privileges. Using sudo is usually the recommended way to do it. Creating a backup of a running system, however, is generally not recommended; it depends on what you are using the system for. If there are not many processes writing to the disk frequently, for example, it should be ok. Alternatively, you can create a snapshot and back up from it.
  2. The rsync -a option is a shortcut for running the program with the -rlptgoD options. The -p option (short for --perms) preserves the majority of permissions, but not all: ACLs and extended attributes are not included. To preserve those you should use the -A (--acls) and -X (--xattrs) options.
  3. Using --delete causes files which don’t exist in the source to be deleted in the destination, to create an exact copy. In this context, those files will not be deleted from the directory used as the argument of the --link-dest option; they will simply not be hard linked from it into the new backup.

Hi, won’t the script ALWAYS create full backups?
The BACKUP_PATH is always different (down to the second), so rsync will always be syncing against an empty directory and thus perform a full backup every time.

It could be an idea to look at rdiff-backup, which is based on rsync.
It handles efficient rolling back to backup #1, 2, 3 relative to “now”.
Better than hot water :slight_smile:
Google rdiff-backup.

Jens

SOURCE_DIR -> The directory to backup - rsync source
LATEST_LINK -> The directory passed as argument to the --link-dest option
BACKUP_PATH -> The path of the new backup directory - rsync dest

Files in SOURCE_DIR which are unchanged when compared to files in the LATEST_LINK directory are hard linked to BACKUP_PATH.

Files that changed and new files are copied from SOURCE_DIR to BACKUP_PATH. In BACKUP_PATH you will always have all the files, but you will save space since unchanged files will be hard linked from the previous backup.

After each backup is made, the old LATEST_LINK is removed and a new one is created, pointing to the backup just made.


@EgDoc
All understood thanks to you!

This is very useful, thanks!
But I wasted some time before realizing that FAT32 and exFAT file systems support neither the hard links used by --link-dest nor the symbolic “latest” link, so this scheme cannot work on them.
If anyone else has the same problem, use NTFS (also supported by Windows), HFS+ (also supported by macOS), or ext4.
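If you are unsure about a given drive, a quick probe like this will tell you whether both link types work (the file names are arbitrary; point mktemp at a directory on the filesystem you want to test):

```shell
#!/bin/bash
# Probe whether a filesystem supports hard links and symlinks.
# Use `mktemp -d -p /path/to/drive` to test a specific mount;
# plain mktemp -d tests the default temp filesystem.
dir=$(mktemp -d)
echo test > "$dir/a"

if ln "$dir/a" "$dir/b" 2>/dev/null; then hard=yes; else hard=no; fi
if ln -s a "$dir/c" 2>/dev/null; then sym=yes; else sym=no; fi

echo "hard links: $hard, symlinks: $sym"
rm -r "$dir"
```

On ext4 or NTFS both answers should be yes; on FAT32/exFAT both will be no.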

I have been using this strategy for quite some time, but without fully understanding it. Thanks for your detailed explanation. I do have one question though. How do you copy such a ‘backup set’ to another machine? I currently have many of these ‘backup sets’ on a CentOS7 box and want to move/copy them to a TrueNAS (FreeBSD) box, retaining the exact same structure as the source.

Hi Scottthepotter,

Welcome to our forums.

If you have a dedicated NAS machine, you could configure your CentOS box to mount the remote filesystem, using NFS for example (most NAS devices support it). That way the remote storage appears in your local directory hierarchy, so all you have to do is point the rsync target directory at the remote filesystem. One caveat: to retain the structure of your existing backup sets, pass -H (--hard-links) to rsync when copying them; otherwise every snapshot is transferred as a full, independent copy and the space savings are lost.
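As a sketch, assuming the NAS exports /mnt/pool/backups over NFS at 192.168.1.50 (both values hypothetical, not from this thread), the transfer could look like:

```shell
# Mount the NAS export locally (hypothetical host and export path).
sudo mkdir -p /mnt/nas
sudo mount -t nfs 192.168.1.50:/mnt/pool/backups /mnt/nas

# Copy the backup sets; -a preserves permissions and timestamps,
# -H preserves the hard links between snapshots.
rsync -aH /path/to/backup-sets/ /mnt/nas/
```

This fragment obviously depends on your actual NAS addresses and export paths, so adapt before running.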

To mount the remote exported filesystem, you can check our NFS configuration guide, the client configuration part. To configure the NAS to serve NFS, check the device’s manual; if unsure, provide us with details about it, such as the version number or the options its management software allows, and I’m sure we can help with that as well.

I copied this script because it seemed pretty simple and easy to work with, but I’m using it to create an incremental backup to a remote location. The thing is: this is NOT making an incremental backup; it’s making a full backup each time. I’m not sure whether the script is wrong or it’s me, and why.

@ffuentes I went through this; it’s because the rm and ln commands are local, not remote. I changed it to this:

#!/bin/bash

# A script to perform incremental backups using rsync

set -o errexit
set -o nounset
set -o pipefail

readonly BACKUP_SVR="172.16.0.5"
readonly SOURCE_DIR="/raid/content/images"
readonly BACKUP_DIR="/backup/images"
readonly DATETIME="$(date '+%Y-%m-%d_%H:%M:%S')"
readonly BACKUP_PATH="${BACKUP_DIR}/${DATETIME}"
readonly LATEST_LINK="${BACKUP_DIR}/latest"

ssh "${BACKUP_SVR}" "mkdir -p ${BACKUP_DIR}"

rsync -avW --no-compress --delete \
  "${SOURCE_DIR}/" \
  --link-dest "${LATEST_LINK}" \
  --exclude=".cache" \
  "${BACKUP_SVR}:${BACKUP_PATH}"

ssh "${BACKUP_SVR}" "rm -rf ${LATEST_LINK}"
ssh "${BACKUP_SVR}" "ln -s ${BACKUP_PATH} ${LATEST_LINK}"

Thank you for the guide and script! What would be the proper way to restore, then? And if I wanted to back up my rootfs under “/”, would I use the same procedure for backing up and restoring?

Hi Danran,

Welcome to our forums.

I would suggest you don’t try to back up the filesystem as a whole; there are special parts of it that you couldn’t back up anyway. For example, the /proc subtree holds process information that is quite dynamic, and there is no point in backing it up. The same goes for /dev, where device nodes live, and for /tmp, where temporary files are located.

Thanks for this article. However, whenever I use a symbolic link as the argument for --link-dest, I get a “does not exist” error. Do you have an idea why this could be the case?

Hi Yagus,

Welcome to our forums.

The issue you describe could be a simple path error. How do you specify your --link-dest argument? Is it a relative or an absolute path? Keep in mind that a relative --link-dest path is interpreted relative to the destination directory, not to the directory you run rsync from.