The problem
When doing an incremental backup, any file moved on the source
filesystem is usually recopied to the destination filesystem. For a
large file this is slow, and it wastes space if the destination keeps
deleted files around (e.g. ZFS holding on to old snapshots). If both
sides are ZFS, zfs send/recv handles all of the details efficiently.
But if only the source filesystem is ZFS, or the ZFS datasets are not
at the same granularity on both sides, that doesn't apply. zfs diff
gives the information about file moves since a snapshot, but its
output format is a little awkward for scripting.
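To make the output format concrete, here is a minimal sketch of parsing a single rename line in the tab-separated form that zfs diff -H prints. The sample line and the helper name parse_rename are made up for illustration, and the sketch assumes paths contain no newlines.

```shell
#!/bin/bash
# Hypothetical sample of one rename line from `zfs diff -H`:
# change type, old path, new path, separated by tabs. The space in the
# old name appears as the octal escape \0040.
sample=$'R\t/tank/dataset/old\\0040name.bin\t/tank/dataset/sub/new.bin'

# parse_rename is an illustrative helper, not part of zfs.
# It prints the decoded old and new paths on separate lines.
parse_rename() {
    local kind from to
    IFS=$'\t' read -r kind from to <<< "$1"
    # Only rename records (type R) carry two paths.
    [ "$kind" = R ] || return 1
    # printf %b decodes backslash escapes such as \0040.
    printf '%b\n' "$from" "$to"
}

parse_rename "$sample"
```

The escaped form is why the fields cannot be used as paths directly: they have to be decoded first.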
The solution
Download the script I wrote, zfs-diff-move.sh, and run it like
zfs-diff-move.sh /path/ /tank/dataset/ tank/dataset@base @new
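Here /path/ is the root of the backup destination, /tank/dataset/ is
where the source dataset is mounted, and the last two arguments are the
snapshots to compare. The following is an abbreviated version of the
script: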
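#!/bin/bash
# Usage: zfs-diff-move.sh DEST_ROOT SOURCE_ROOT BASE_SNAPSHOT NEW_SNAPSHOT

# Print field $3 of the current zfs diff line, remapped to the destination.
get_path() {
    local path
    # zfs diff prints some characters as octal escapes; echo -e decodes them.
    path="$(echo -e "$(echo "$line" | cut -d$'\t' "-f$3")")"
    # Replace the leading source root ($2) with the destination root ($1).
    echo "${path/#"$2"/$1}"
}

# -H gives tab-separated output; keep only renames (lines starting with R).
zfs diff -H "$3" "$4" | grep '^R' | while IFS= read -r line
do
    from="$(get_path "$1" "$2" 2)"
    to="$(get_path "$1" "$2" 3)"
    mkdir -vp -- "$(dirname "$to")"
    # -n never overwrites a file that already exists at the destination.
    mv -vn -- "$from" "$to" || echo "Unable to move $from"
done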
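
The path rewriting in get_path relies on bash's anchored pattern substitution. The sketch below isolates that step so it can be tried without ZFS; the helper name remap_path and the example paths are hypothetical.

```shell
#!/bin/bash
# remap_path is a hypothetical helper name used only for this illustration.
# Usage: remap_path DEST_ROOT SOURCE_ROOT PATH
remap_path() {
    local dest_root="$1" src_root="$2" path="$3"
    # ${path/#pattern/replacement} substitutes only a match at the start of
    # the string; quoting the pattern makes characters like * and [ literal.
    printf '%s\n' "${path/#"$src_root"/$dest_root}"
}

# A path under the source root is moved under the destination root.
remap_path /path/ /tank/dataset/ /tank/dataset/videos/big.mkv
# A path outside the source root is printed unchanged.
remap_path /path/ /tank/dataset/ /elsewhere/file
```

Because the match is anchored at the start, a source root that happens to reappear deeper in a path is left alone.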