rsync: Master Directory Synchronization on Linux

Maximilian B.

The Swiss Army Knife of File Synchronization

rsync (remote sync) is one of the most versatile and widely used file transfer utilities in the Linux ecosystem. Created by Andrew Tridgell and Paul Mackerras in 1996, rsync uses a clever delta-transfer algorithm that sends only the differences between source and destination files, dramatically reducing the amount of data transmitted over the network.

Whether you are backing up a home directory, mirroring a web server, deploying code, or synchronising terabytes across data centres, rsync is usually a strong first choice.

How rsync Works

rsync’s delta algorithm operates in three phases:

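  1. File list generation: the sender builds a file list with metadata (size, mtime, permissions); the receiver compares it with its own files and, by default, skips any file whose size and mtime already match.
  2. Delta detection: for files that differ, the receiver splits its existing copy into blocks and sends a weak rolling checksum plus a strong checksum for each block. The sender slides the rolling checksum across the source file to find blocks the destination already has.
  3. Transfer: only the unmatched data is sent, along with instructions on how to reconstruct the file from the matched blocks.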

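This means syncing a 10 GB file in which only 50 MB changed transfers roughly 50 MB plus a small checksum overhead, not the entire 10 GB. The saving applies to network transfers: for purely local copies rsync defaults to --whole-file and skips the delta algorithm.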

Installation

# rsync is pre-installed on most Linux distributions
rsync --version

# If missing:
# Debian/Ubuntu
sudo apt install rsync

# RHEL/Fedora/Oracle Linux
sudo dnf install rsync

# Arch
sudo pacman -S rsync

Basic Syntax

rsync [OPTIONS] SOURCE DESTINATION

# Key: the trailing slash on SOURCE matters!
# /data/project/  → contents of project directory
# /data/project   → the project directory itself

Essential Examples

Local Sync

# Mirror one directory to another
rsync -av /home/user/documents/ /backup/documents/

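# -a = archive mode (preserves permissions, timestamps, symlinks, etc.)
#      It does not preserve hard links, ACLs or extended attributes;
#      add -H, -A or -X if you need those.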
# -v = verbose

Remote Sync via SSH

# Push local files to remote server
rsync -avz /var/www/html/ admin@webserver:/var/www/html/

# Pull remote files to local
rsync -avz admin@webserver:/var/log/nginx/ /backup/nginx-logs/

# -z = compress during transfer

Delete Extraneous Files

# Make destination an exact mirror of source
rsync -av --delete /data/source/ /data/mirror/

# WARNING: --delete removes files from destination that
# no longer exist in source. Use --dry-run first!

Dry Run (Preview Changes)

# See what WOULD happen without making changes
rsync -avn --delete /data/source/ /data/mirror/
# Review output, then run without -n to apply

Commonly Used Options

-a, --archive       Archive mode (equals -rlptgoD)
-v, --verbose       Increase verbosity
-z, --compress      Compress data during transfer
-P                  Combines --progress and --partial
--progress          Show progress per file
--partial           Keep partially transferred files
--delete            Delete extraneous files from destination
--exclude=PATTERN   Exclude matching files
--include=PATTERN   Include matching files
--dry-run, -n       Show what would be done without doing it
-e "ssh -p 2222"    Specify remote shell / custom SSH port
--bwlimit=KBPS      Limit bandwidth (units of 1024 bytes/s)
--backup            Make backups of changed files
--backup-dir=DIR    Directory for backup copies
--log-file=FILE     Log transfer details to a file

Advanced Usage

Exclude Patterns

# Exclude node_modules and .git directories
rsync -av --exclude="node_modules" --exclude=".git" \
  /projects/myapp/ /backup/myapp/

# Use an exclude file
cat > /etc/rsync-exclude.txt <<EOF
node_modules/
.git/
*.tmp
*.log
.DS_Store
EOF

rsync -av --exclude-from=/etc/rsync-exclude.txt \
  /projects/ /backup/projects/

Bandwidth Limiting

# Limit to roughly 5 MB/s (useful for production servers)
# The value is counted in units of 1024 bytes per second
rsync -avz --bwlimit=5000 /data/big-dataset/ remote:/backup/
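
Network budgets are usually quoted in megabits per second, while a plain --bwlimit number is in units of 1024 bytes per second. A small conversion helper avoids mental arithmetic; the name mbit_to_bwlimit is hypothetical and the result is rounded down.

```shell
# mbit_to_bwlimit MBIT: print the --bwlimit value for a budget given in
# megabits per second (1 Mbit = 1,000,000 bits, 8 bits per byte).
mbit_to_bwlimit() {
  echo $(( $1 * 1000000 / 8 / 1024 ))
}

limit=$(mbit_to_bwlimit 40)    # allow rsync 40 Mbit/s of the link
echo "rsync -avz --bwlimit=$limit /data/big-dataset/ remote:/backup/"
```

Recent rsync releases should also accept a suffixed value such as --bwlimit=5m; check the man page for your version before relying on it.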

Incremental Backups with Hard Links

# Create space-efficient incremental backups
# Each backup looks like a full backup but uses hard links
# for unchanged files

DEST="/backup/$(date +%Y-%m-%d)"
LATEST="/backup/latest"

rsync -av --delete \
  --link-dest="$LATEST" \
  /data/important/ "$DEST/"

# Update the "latest" symlink
ln -sfn "$DEST" "$LATEST"

Rsync Daemon Mode

# For high-performance local network transfers
# without SSH overhead
# Note: daemon traffic is not encrypted, so use it only on trusted networks

# On the server — /etc/rsyncd.conf:
[backups]
  path = /srv/backups
  comment = Backup repository
  read only = no
  auth users = backupuser
  secrets file = /etc/rsyncd.secrets

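# The secrets file must not be world-readable
sudo chmod 600 /etc/rsyncd.secrets

# Start the daemon (the unit is named rsync on Debian/Ubuntu)
sudo systemctl enable --now rsyncd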

# From client:
rsync -av /data/ backupuser@server::backups/data/
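
The daemon setup needs a secrets file on the server and, for unattended jobs, a password file on the client. The sketch below writes both into a temporary directory standing in for /etc; the password "change-me" and the file name client.pass are placeholders, not values from the configuration above.

```shell
conf_dir=$(mktemp -d)   # stand-in for /etc in this sketch

# Server side: one "user:password" pair per line.
printf '%s\n' 'backupuser:change-me' > "$conf_dir/rsyncd.secrets"
chmod 600 "$conf_dir/rsyncd.secrets"

# Client side: a file holding only the password, also mode 600,
# so scheduled jobs do not need an interactive prompt.
printf '%s\n' 'change-me' > "$conf_dir/client.pass"
chmod 600 "$conf_dir/client.pass"

# The transfer would then look like this (printed, not run, because it
# needs a live daemon):
echo rsync -av --password-file="$conf_dir/client.pass" \
  /data/ backupuser@server::backups/data/
```

rsync rejects password and secrets files that other users can read, so the chmod steps are required, not cosmetic.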

Automated Backup Script

#!/usr/bin/env bash
# Daily rsync backup with rotation
set -euo pipefail

SOURCE="/var/www/"
BACKUP_BASE="/mnt/backups/www"
DATE=$(date +%Y-%m-%d_%H%M)
DEST="${BACKUP_BASE}/${DATE}"
LATEST="${BACKUP_BASE}/latest"
LOG="/var/log/rsync-backup.log"

echo "[$(date)] Starting backup..." >> "$LOG"

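# -z is omitted: both paths are local, so compression only costs CPU
rsync -av --delete \
  --exclude="*.tmp" \
  --exclude="cache/" \
  --link-dest="$LATEST" \
  "$SOURCE" "$DEST/" \
  >> "$LOG" 2>&1

ln -sfn "$DEST" "$LATEST"

# -a copies the source directory's mtime onto $DEST, so reset it to the
# backup time; otherwise the age check below could delete a fresh backup
touch "$DEST"

# Remove backups older than 30 days
# (-mindepth 1 keeps $BACKUP_BASE itself out of the match)
find "$BACKUP_BASE" -mindepth 1 -maxdepth 1 -type d -mtime +30 -exec rm -rf -- {} +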

echo "[$(date)] Backup complete: $DEST" >> "$LOG"
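
Rotation based on directory mtime depends on timestamps that other tools can change. An alternative is to derive each snapshot's age from its name. The sketch below assumes the YYYY-MM-DD_HHMM naming used by the script, GNU date for the -d option, and a hypothetical helper called prune_snapshots.

```shell
set -eu

# prune_snapshots BASE DAYS
# Removes snapshot directories under BASE whose NAME encodes a time more
# than DAYS days ago. The body runs in a subshell so its variables stay
# local to the function.
prune_snapshots() (
  base=$1
  days=$2
  cutoff=$(date -d "$days days ago" +%Y%m%d%H%M)
  for dir in "$base"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9][0-9][0-9]; do
    [ -d "$dir" ] || continue          # glob matched nothing
    # Reduce a name like 2024-05-01_0300 to the integer 202405010300.
    stamp=$(basename "$dir" | tr -cd '0-9')
    if [ "$stamp" -lt "$cutoff" ]; then
      echo "Removing $dir"
      rm -rf -- "$dir"
    fi
  done
)

# Demo: one snapshot from 45 days ago, one from now, plus the symlink.
base=$(mktemp -d)
old=$(date -d '45 days ago' +%Y-%m-%d_%H%M)
new=$(date +%Y-%m-%d_%H%M)
mkdir "$base/$old" "$base/$new"
ln -s "$base/$new" "$base/latest"

prune_snapshots "$base" 30
ls "$base"
```

Because the glob only matches date-shaped names, the latest symlink and any unrelated directories are never touched.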

rsync vs Other Tools

Feature              rsync            scp            bbcp
Delta transfer       Yes              No             No
Parallel streams     No               No             Yes
Incremental backups  Yes              No             No
Exclude patterns     Yes              No             No
Best for             Backups & sync   Quick copies   Bulk speed

Summary

rsync is the gold standard for file synchronisation on Linux. Its delta-transfer algorithm, extensive options, and SSH integration make it indispensable for backups, deployments, and data migration. Master rsync -avz --delete --dry-run and you have a reliable, efficient, and safe synchronisation pipeline for virtually any scenario.
