The problem
For my recent posts on ZFS, I wanted to quickly try out a bunch of variants of my proposed operations without worrying about accidentally modifying my real ZFS filesystems. Specifically, I wanted to know which ways of copying files would result in more efficiently reusing blocks from existing snapshots where possible.
The solution
WARNING: The instructions below will modify the ZFS pool tank, which is the default name used in many ZFS examples, and therefore may be a real ZFS pool on your computer.
I strongly recommend doing all of this inside a VM to be sure you are not affecting any real filesystems. I used a VirtualBox VM running Debian, with the guest additions set up to share a directory between the VM and my actual machine.
First, create a 1 GiB virtual ZFS pool (i.e. one backed by a file instead of a physical device) to run tests on:
fallocate -l 1G /root/tank
zpool create tank /root/tank
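As a guard against the situation described in the warning above, the two commands can be wrapped so that they refuse to run when a pool with that name already exists. This is only a minimal sketch: the function names `create_test_pool` and `destroy_test_pool` and the `POOL` and `BACKING` variables are hypothetical and are not part of the scripts used in this post.

```shell
#!/bin/sh
# Hypothetical helpers (not from the zfs-test repository).
POOL="${POOL:-tank}"            # pool name used throughout this post
BACKING="${BACKING:-/root/tank}" # file that backs the pool

create_test_pool() {
    # `zpool list <name>` exits non-zero when no such pool exists,
    # so a zero exit status means a pool with this name is already there.
    if zpool list -H -o name "$POOL" >/dev/null 2>&1; then
        echo "pool '$POOL' already exists; refusing to touch it" >&2
        return 1
    fi
    # Same two steps as above: allocate the backing file, then build the pool on it.
    fallocate -l 1G "$BACKING" &&
        zpool create "$POOL" "$BACKING"
}

destroy_test_pool() {
    # Only call this for a pool made by create_test_pool:
    # it destroys the pool and deletes its backing file.
    zpool destroy "$POOL" && rm -f "$BACKING"
}
```

Calling `create_test_pool` before the tests and `destroy_test_pool` afterwards leaves the machine as it was, assuming nothing else has created a pool with the same name in the meantime.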
Then perform various filesystem operations and inspect the result of zfs list -o space to determine whether they use more (or less) space than you expect. In order to make sure I was being consistent and to make it easier to test out multiple variations, I wrote some scripts:
git clone https://git.aweirdimagination.net/perelman/zfs-test.git
cd zfs-test/bin
# dump logs from create-/copy-all- and-measure into ../logs/
./measure-all
# read ../logs/ and print space used as Markdown table
./logs-to-table --links
Create script | orig | rsync-ahvx | rsync-ahvx-sparse | rsync-inplace | rsync-inplace-no-whole-file | rsync-no-whole-file | zfs-diff-move-then-rsync |
---|---|---|---|---|---|---|---|
empty | 24K | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ |
random-1M-file | 1.03M | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ |
zeros-1M-file | 24K | 1.03M❌ | 24K✅ | 1.03M❌ | 1.03M❌ | 1.03M❌ | 1.03M❌ |
move-file | 1.04M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.04M✅ |
edit-part-of-file | 1.16M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.17M✅ | 2.04M❌ | 1.17M✅ |
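To check a single case by hand rather than through the scripts, the change in used space around one command can be measured directly. The following is a rough sketch and not how the scripts above work: `measure_used` and `measure_delta` are hypothetical names, and it reads the `used` property in bytes rather than the full `zfs list -o space` output.

```shell
#!/bin/sh
# Hypothetical helpers (not from the zfs-test repository).

measure_used() {
    # -H: no header line, -p: exact byte counts, -o used: only the USED column
    zfs list -H -p -o used "$1"
}

measure_delta() {
    # Usage: measure_delta <dataset> <command> [args...]
    md_dataset=$1
    shift
    md_before=$(measure_used "$md_dataset")
    # Send the command's own output to stderr so stdout carries only the delta.
    "$@" >&2
    # Space accounting may lag behind writes, so flush before measuring again.
    sync
    md_after=$(measure_used "$md_dataset")
    echo $((md_after - md_before))
}
```

For example, `measure_delta tank cp /tank/a /tank/b` would print how many bytes the pool's root dataset grew by while copying a file, assuming the pool is mounted at its default location of `/tank`.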