rsync, which stands for "remote sync", is a powerful and efficient command-line utility for transferring and synchronizing files and directories between locations, whether local or remote.

Key functions and features :

  • Efficient transfers: rsync minimizes data transfer by only sending the differences between the source and destination files.
  • Local and remote operations: It can be used to sync files on the same machine or transfer them between a local computer and a remote server.
  • Preserves file attributes: The -a (archive) flag enables a recursive copy that preserves symbolic links, permissions, modification times, and more.
  • Secure transfers: It can use SSH for secure, encrypted connections when transferring files to a remote machine.
  • Backup and mirroring: rsync is ideal for creating backups and keeping directories in sync, including deleting files from the destination if they are removed from the source (with the --delete option).
  • Customizable: It offers numerous options to control the transfer process, such as excluding certain files, showing progress, or compressing data during transfer.

1. Delta-transfer algorithm and key differences with cp and scp

rsync's most notable strength is its delta-transfer algorithm, which detects and transfers only the parts of a file that have changed since the last synchronization.

Commands such as cp or scp always copy whole files. rsync instead divides a file into segments (blocks) and calculates a checksum for each one, which can greatly reduce network bandwidth consumption and transfer time.

During synchronization, the block checksums of the destination file are compared against the contents of the source file. Only the segments that have changed or are missing on the destination are sent, which makes rsync highly efficient for updating large files with minor changes.
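The idea of comparing per-block checksums can be illustrated with standard tools. The sketch below is a deliberately simplified stand-in, not rsync's actual implementation: it uses fixed block boundaries and cksum, whereas rsync uses a rolling checksum that also finds blocks that have moved. The file names and the 1024-byte block size are assumptions chosen for the demonstration.

```shell
set -eu

work=$(mktemp -d)
block_size=1024

# Write one block: the given character repeated block_size times.
block() { printf "$1%.0s" $(seq 1 "$block_size"); }

# "old" plays the destination copy; "new" is the source with its third block changed.
{ block A; block B; block C; block D; } > "$work/old"
{ block A; block B; block X; block D; } > "$work/new"

# Cut both files into fixed-size pieces (suffixes aa, ab, ac, ...).
split -b "$block_size" "$work/old" "$work/old."
split -b "$block_size" "$work/new" "$work/new."

changed=0
total=0
for piece in "$work"/new.*; do
  suffix=${piece##*.}
  total=$((total + 1))
  # Compare per-block checksums; only mismatching blocks would need to be sent.
  if [ "$(cksum < "$piece")" != "$(cksum < "$work/old.$suffix")" ]; then
    echo "block $total differs and would be transferred"
    changed=$((changed + 1))
  fi
done
echo "$changed of $total blocks need to be sent"
rm -rf "$work"
```

Only the modified block is flagged, while the three unchanged blocks would be reused from the destination copy.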

2. Basic syntax and options

The rsync syntax is straightforward:

rsync [options] source destination


Option                Description
-a                    Archive mode (recursive, preserves permissions, timestamps, ownership, symlinks).
-v                    Verbose output; shows detailed information during transfer.
-z                    Compress file data during transfer.
-r                    Recurse into directories.
-h                    Output numbers in a human-readable format (e.g. 1K, 4.3M).
--delete              Delete files at destination that do not exist at source.
--dry-run or -n       Show what would be transferred, without making any changes.
-P                    Show progress and keep partial files (equivalent to --partial + --progress).
--partial             Keep partially transferred files instead of deleting them.
--progress            Show progress for each file during transfer.
--compress-level=N    Specify compression level (0–9). Works with -z.
--whole-file          Do not use delta-transfer algorithm; copy whole files.
--ignore-errors       Continue deleting files even when there are I/O errors.
--exclude             Exclude files or directories matching a pattern.
--include             Include files or directories matching a pattern.
--bwlimit=N           Limit bandwidth usage (in KB/s).


First, create two directories in your home directory:

mkdir folder1 folder2

Then create some files in folder1 for testing :

for i in {1..5}; do echo "test" > folder1/"file$i.txt"; done

To sync the contents of folder1 to folder2, rsync can be used much like the cp command:

rsync -r folder1/ folder2
  • -r : the recursive option is required to sync directories. Prefer -a (archive mode), which includes -r and also preserves file attributes.
  • / : the trailing slash on folder1/ means "the contents of folder1".

2.1 Importance of the trailing slash

The same command without the trailing slash copies the directory itself into the destination, so folder1 and its contents end up in folder2/folder1:

rsync -r folder1 folder2

This seemingly minor detail entirely changes the result of the synchronization. Always check the syntax before running a command in production, especially when it involves remote destinations, destructive options like --delete, or filter rules such as --exclude and --include. The --dry-run option (-n) simulates the transfer without making any changes.

3. Local sync of folders or files

3.1 Copy a file

rsync -azv folder1/file1.txt folder2

file1.txt is copied to folder2.

3.2 Copy all contents of source to destination

rsync -azv folder1/ folder2

All the files in folder1 are copied to folder2.

3.3 Copy a folder

rsync -azv folder1 folder2

folder1 and all its contents are copied into folder2.

3.4 Test a synchronization without making any change

rsync -azvn folder1/ folder2

The output lists the files that would be transferred in a real run.

3.5 Filtering files and folders with --exclude

Sometimes certain types of files or some directories must be excluded from synchronization. To achieve this, use the --exclude option with patterns. Let's create two files in folder1, test.tmp and serv.log, and a folder named soft:

mkdir folder1/soft
echo "temporary data" > folder1/test.tmp
echo "log entry" > folder1/serv.log

Now sync the contents of folder1 to folder2, excluding the newly created files and folder:

rsync -azv --exclude 'serv.log' --exclude 'test.tmp' --exclude 'soft/' folder1/ folder2

The two files test.tmp and serv.log and the folder soft are excluded from the transfer. In --exclude 'soft/', the trailing slash ensures that the pattern only matches a directory, not a file named soft.


3.6 Filtering with --include option

The following command ignores all files and folders except files with the .log extension:

rsync -avz --include '*.log' --exclude '*' folder1/ folder2

The --include option is more intricate and works together with --exclude. It is frequently used to override an exclusion for a particular pattern. rsync applies the first rule that matches each file, so --include must be placed before --exclude to achieve this. If the order were reversed, --exclude '*' would match every file in folder1 first, and the --include rule would never be evaluated. With the order shown above:

  1. rsync checks each file in folder1 against --include '*.log'. A file that matches is marked for transfer, and no further rules are processed for it.
  2. Every other file falls through to the next rule, --exclude '*', which matches everything, so all remaining files and folders are excluded.

3.7 Use an exclusion file

In some cases, exclusion rules become numerous and tedious to type on the command line, especially when backing up a file system. Listing the patterns in a file can help:

cat excl.txt
*.txt
soft/
rsync -azvn --exclude-from='excl.txt' folder1/ folder2

In this dry run, all files are listed for transfer except those with the .txt extension and the folder soft.
Mastering --exclude and --include helps optimize backups by limiting the sync to useful data only.

4. Remote sync of files or folders with or without ssh

4.1 Without ssh

The general syntax remains the same: the local directory is the source and the remote system is the destination. This example does not use ssh, so it is not recommended for untrusted networks. It can be useful for large files on a fast local network, where skipping encryption and decryption may improve performance. It requires an rsync daemon on the target machine, which lets rsync communicate directly using its own protocol and authentication mechanism. The connection itself is not encrypted.

On the server side (target machine):

  1. Create or edit /etc/rsyncd.conf. This file configures the rsync daemon: define modules (export points) and their settings. For each module, require authentication and define which users are allowed to authenticate.

    # Example /etc/rsyncd.conf
    uid = nobody
    gid = nobody
    use chroot = yes
    max connections = 4
    syslog facility = local5
    pid file = /var/run/rsyncd.pid
    
    [backup]
    path = /media/data
    comment = Backup data
    # Allow writing to this module (comments must be on their own line)
    read only = no
    list = yes
    # Optional: Authentication
    auth users = sc, walid
    secrets file = /etc/rsyncd.secrets
  2. Create /etc/rsyncd.secrets. This file contains usernames and their (unencrypted) passwords, one user:password pair per line.
    sc:mypass
    walid:hispass
  3. Start the rsync daemon (the service is named rsync on Debian and Ubuntu; other distributions may name it rsyncd).
    sudo systemctl enable rsync # Enable on boot
    sudo systemctl start rsync # Start immediately

On the local machine (source):

Use the rsync command with the rsync:// protocol:

rsync -azv ~/folder1 rsync://[sc@]target_ip_address/backup/

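In this example, the directory folder1 is transferred with its contents to the backup module, that is, to /media/data/folder1 on the target. The optional sc@ prefix selects the user to authenticate as. This method is only recommended for trusted local networks where security concerns are minimal.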
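Because the secrets file stores passwords in clear text, its permissions matter: with the daemon's default "strict modes" setting, a secrets file readable by other users is rejected. The sketch below creates stand-in files in a temporary directory instead of /etc so that it can be run without root. The rsync.pass file name is an assumption; --password-file is the client option that reads the password from a file instead of prompting.

```shell
set -eu

work=$(mktemp -d)
umask 077   # new files are created readable by the owner only

# Stand-in for /etc/rsyncd.secrets on the server side.
secrets="$work/rsyncd.secrets"
printf 'sc:mypass\nwalid:hispass\n' > "$secrets"
chmod 600 "$secrets"

# Client side: a file holding only the password of user sc.
pwfile="$work/rsync.pass"
printf 'mypass\n' > "$pwfile"
chmod 600 "$pwfile"

ls -l "$secrets" "$pwfile"

# The client could then authenticate without a prompt (printed, not run here):
echo "rsync -azv --password-file=$pwfile ~/folder1 rsync://sc@target_ip_address/backup/"
```

On a real server, the same chmod 600 would be applied to /etc/rsyncd.secrets itself.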

4.2 With ssh

To back up local data or send modified files to a remote server, rsync can run over ssh, which secures the connection, while the delta-transfer algorithm reduces bandwidth consumption.

4.2.1 Transfer a single file

rsync -azv -e ssh ~/folder1/file5.txt test@192.168.0.10:/home/test

4.2.2 Transfer all contents of source to destination

rsync -azv -e ssh ~/folder1/ test@192.168.0.10:/home/test

4.2.3 Transfer the entire folder to destination

rsync -azv -e ssh ~/folder1 test@192.168.0.10:/home/test

4.2.4 Transfer all files in the source folder but not subfolders

rsync -zv -e ssh ~/folder1/* test@192.168.0.10:/home/test

4.2.5 Using the -e flag

When the ssh command needs options beyond the user and host, such as a non-standard port instead of 22, pass them with the -e flag:

rsync -azve 'ssh -p 4800' ~/folder1/ test@192.168.0.10:/home/test
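A common pitfall with -e is quoting: the whole ssh command, options included, must reach rsync as a single argument. The sketch below only builds and prints the argument list, so it can be run without a remote host. The key path is hypothetical and the -i option is added purely for illustration.

```shell
set -eu

port=4800
key="/home/sc/.ssh/id_ed25519"   # hypothetical key path

# Build the remote-shell command as ONE string; it is passed as the
# single argument that follows -e.
remote_shell="ssh -p $port -i $key"

set -- rsync -azv -e "$remote_shell" folder1/ test@192.168.0.10:/home/test

# Print each argument on its own line to show how rsync would receive them.
for arg in "$@"; do printf '[%s]\n' "$arg"; done

# To actually run the transfer, execute: "$@"
```

The fourth argument is printed as [ssh -p 4800 -i /home/sc/.ssh/id_ed25519], confirming that the ssh options travel together.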

4.2.6 Pull: Syncing from a Remote server to a Local Machine

A "pull" operation collects files from a remote server and copies them onto the local machine. This is advantageous for downloading server logs or acquiring backups. To initiate a pull, we simply need to swap the source and destination in the command. The remote system is now treated as the source (the first argument), while the local system is regarded as the destination (the second argument).

rsync -azv -e "ssh -p 4800" test@192.168.1.10:/var/www/html/ /home/sc/web/html/

5. Advanced options for rsync

The following options also apply to remote destinations; the examples below use local directories for brevity. rsync uses ssh by default when transferring files remotely, so there is no need to specify it explicitly.

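5.1 Display transfer state and resume interrupted transfers

The -P flag shows progress for each file and keeps partially transferred files, so an interrupted transfer can reuse the partial file when the command is run again.

rsync -azP folder1/ folder2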


5.2 Delete obsolete files from destination : mirroring

By default, rsync does not delete anything in the destination directory. To keep two directories perfectly synchronized, files removed from the source must also be removed from the destination. The --delete option does this; run it first with --dry-run (-n) to preview the deletions and avoid unwanted ones.

rsync -azPn --delete folder1/ folder2


5.3 Create backups for keeping versions of Files (Versioning)

rsync does not natively keep a history of changes in the style of a version control system like Git. However, built-in options can save older versions of files before they are overwritten or deleted. With --backup and --backup-dir, rsync moves the old version of a file into a specified backup directory when a new version is transferred or the file is deleted from the source.

The syntax is:

rsync -azP --delete --backup --backup-dir=/path/to/backup_versions /path/to/source/ /path/to/destination/


As an example, let's create a backup folder, modify the timestamp of 2 files and test the command :

mkdir backup-versions
touch folder1/file{1..2}.txt
rsync -azP --delete --backup --backup-dir=../backup-versions folder1/ folder2

Important note: the backup directory must be on the same machine as the destination directory. A relative --backup-dir path is interpreted relative to the destination directory (here, ../backup-versions is resolved from folder2). Use an absolute path such as /home/sc/backup-versions to avoid ambiguity.


5.4 Optimization tweaks

1. --compress-level=N

This option sets the compression level to be used when the -z option is enabled. The N value typically ranges from 0 to 9 for zlib compression, with 0 meaning no compression and 9 representing the highest compression level. The default value with -z option is 6.

The optimal compression level depends on factors such as network bandwidth, CPU availability, and the type of data being transferred. For example, on high-speed networks, the overhead of compression might outweigh the benefits, while on slow links, it can significantly reduce transfer times.

rsync also skips compression for file types that are already compressed, such as JPEG images or ZIP archives.

rsync -azP --compress-level=2 folder1/ folder2

2. --whole-file or -W

This option disables rsync's delta-transfer algorithm, so each changed file is transferred in full, even if only a small portion has changed. It helps when the CPU overhead of delta calculations is the bottleneck and network bandwidth is ample. It is already the default when both source and destination are local paths.

3. --bwlimit=RATE

This option limits the I/O bandwidth used during a transfer. It is particularly useful for network transfers, where it prevents rsync from consuming all available bandwidth and affecting other network services.

rsync -azP --bwlimit=1000 folder1/ folder2
  • RATE specifies the maximum transfer rate in kilobytes per second (KB/s), not kilobits per second. --bwlimit=1000 limits the transfer to 1000 kilobytes per second, or approximately 1 megabyte per second.
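Network links are usually rated in megabits per second, so a conversion is needed before choosing RATE. The helper below is a hypothetical convenience function, not part of rsync. It assumes 1 Mbit = 1,000,000 bits and relies on rsync treating an unsuffixed value as units of 1024 bytes per second.

```shell
set -eu

# Hypothetical helper: convert megabits per second to the value expected
# by --bwlimit (units of 1024 bytes per second; integer division rounds down).
mbit_to_bwlimit() {
  echo $(( $1 * 1000000 / 8 / 1024 ))
}

# Example: cap rsync at 20 Mbit/s, e.g. 20% of a 100 Mbit/s link.
limit=$(mbit_to_bwlimit 20)
echo "rsync -azP --bwlimit=$limit folder1/ folder2"
```

For 20 Mbit/s this prints a command with --bwlimit=2441.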

6. Automating tasks with cron jobs