The problem#
For my recent posts on ZFS, I wanted to quickly try out a bunch of variants of my proposed operations without worrying about accidentally modifying my real ZFS filesystems. Specifically, I wanted to know which ways of copying files would result in more efficiently reusing blocks from existing snapshots where possible.
The solution#
WARNING: The instructions below will modify the ZFS pool `tank`, which is the default name used in many ZFS examples, and therefore may be a real ZFS pool on your computer.
I strongly recommend doing all of this inside a VM to be sure you are not affecting any real filesystems. I used a VirtualBox VM that I installed Debian on and used the guest additions to share a directory between the VM and my actual machine.
First create a 1 GiB virtual (i.e. in a file instead of a physical device) ZFS pool to run tests on:
fallocate -l 1G /root/tank
zpool create tank /root/tank
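Since `tank` may already be a real pool, it can be worth checking before creating anything. The following is a minimal sketch rather than part of my scripts: the function name `create_test_pool` is made up for illustration, and it assumes `zpool list` exits non-zero when the named pool does not exist.

```shell
# Hypothetical helper: refuse to touch a pool that already exists,
# otherwise create a file-backed pool of the requested size.
create_test_pool() {
    pool=$1 image=$2 size=$3
    if zpool list -H -o name "$pool" >/dev/null 2>&1; then
        echo "pool '$pool' already exists; refusing to continue" >&2
        return 1
    fi
    fallocate -l "$size" "$image" &&
        zpool create "$pool" "$image"
}

# Usage (as root, inside the VM):
#   create_test_pool tank /root/tank 1G
# Clean up afterwards with:
#   zpool destroy tank && rm /root/tank
```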
Then perform various filesystem operations and inspect the output of `zfs list -o space` to determine whether they use more (or less) space than you expect. To make sure I was being consistent and to make it easier to test multiple variations, I wrote some scripts:
git clone https://git.aweirdimagination.net/perelman/zfs-test.git
cd zfs-test/bin
# dump logs from create-/copy-all- and-measure into ../logs/
./measure-all
# read ../logs/ and print space used as Markdown table
./logs-to-table --links
Create script | orig | rsync-ahvx | rsync-ahvx-sparse | rsync-inplace | rsync-inplace-no-whole-file | rsync-no-whole-file | zfs-diff-move-then-rsync |
---|---|---|---|---|---|---|---|
empty | 24K | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ |
random-1M-file | 1.03M | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ |
zeros-1M-file | 24K | 1.03M❌ | 24K✅ | 1.03M❌ | 1.03M❌ | 1.03M❌ | 1.03M❌ |
move-file | 1.04M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.04M✅ |
edit-part-of-file | 1.16M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.17M✅ | 2.04M❌ | 1.17M✅ |
The details#
Creating the VM#
To completely isolate my system, I did all of my ZFS operations inside a VM, partially because I have had the ZFS drivers crash on my main machine, forcing a reboot, and I didn't want to risk that when I knew I would be intentionally doing weird things. I used VirtualBox and made a minimal Linux install using Debian netinst. As everything I wanted to do was local, once I installed the packages I needed[^1], I disconnected the network interface and used the guest additions to share a directory between the VM and my actual machine.
Making a virtual disk#
`fallocate` with the `-l` option creates an empty file of the specified size, which we can use as a virtual hard drive. On modern filesystems, it will do so without actually writing that many zeros.
Notably, the shared folder provided by the version of the VirtualBox guest additions I used does not count as a "modern filesystem" for this purpose:
$ fallocate -l 1G tank
fallocate: fallocate failed: Operation not supported
As a workaround, I just ran the `fallocate` command on the host instead. Alternatively, you can use the `-x` (`--posix`) option, but, as promised by the documentation, it's slow: it took 13 seconds to create a 1 GiB file on my machine.
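The two approaches can be combined into a fallback. This is a rough sketch, not something from my scripts: `make_image` is a hypothetical name, and it assumes a util-linux `fallocate` recent enough to support `--posix`.

```shell
# Hypothetical helper: try the fast path first, then fall back to the
# slow POSIX mode that works even where fallocate(2) is unsupported.
make_image() {
    # $1 = path of the image file, $2 = size (e.g. 1G)
    fallocate -l "$2" "$1" 2>/dev/null && return 0
    echo "fast fallocate not supported here; retrying with --posix (slow)" >&2
    fallocate --posix -l "$2" "$1"
}

# Usage:
#   make_image /root/tank 1G
```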
Experiment framework#
I wanted to test both creating files and modifying or moving them around while taking snapshots in the middle as well as copying the result of doing so onto another filesystem.
- `create-test-setup/` contains scripts for setting up the `tank/test` dataset. They're run with the wrapper script `create-and-measure`, which deletes `tank/test` if it already exists and creates it, and also creates a snapshot `tank/test@final` and prints out the space used.
- `copy-snapshot/` contains scripts for copying files from a snapshot of `tank/test` into `tank/target`. `copy-all-and-measure` runs one of those for every snapshot of `tank/test` to copy it to a cleanly created `tank/target` dataset and prints out the space used.
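To make the first wrapper's steps concrete, here is a guess at their shape. It is a sketch under stated assumptions and not the actual contents of `create-and-measure`: the function name and the way the setup script is invoked are hypothetical, and only the `zfs` subcommands themselves are standard.

```shell
# Hypothetical reconstruction of the wrapper's steps for tank/test.
create_and_measure() {
    setup_script=$1
    dataset=tank/test
    # Remove any previous run, including its snapshots.
    if zfs list -H -o name "$dataset" >/dev/null 2>&1; then
        zfs destroy -r "$dataset" || return 1
    fi
    zfs create "$dataset" || return 1
    # Assumption: the setup script knows where the dataset is mounted.
    "$setup_script" || return 1
    zfs snapshot "$dataset@final" || return 1
    # Print the space breakdown, same columns as `zfs list -o space`.
    zfs list -o space -r "$dataset"
}

# Usage:
#   create_and_measure create-test-setup/move-file
```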
For example, to use the `move-file` creation script (which creates a file, makes a snapshot, and moves the file) with the `rsync-ahvx` copy script (which just does a simple `rsync` to copy), you would run the following:
$ cd zfs-test/bin
$ ./create-and-measure create-test-setup/move-file
[...]
$ ./copy-all-and-measure copy-snapshot/rsync-ahvx
[...]
NAME AVAIL USED USEDSNAP USEDDS USEDREFRESERV USEDCHILD
[...]
tank/target 827M 2.04M 1.02M 1.03M 0B 0B
tank/test 827M 1.04M 14K 1.03M 0B 0B
At the end of the output, you would observe that `tank/test` has only about 1M used while `tank/target` has about 2M used because it has two copies of the moved file.
Timing#
One surprise I ran into was `rsync` not copying files that I knew I had changed. It turns out that `rsync` ignores timestamp differences under a second by default. And, unsurprisingly, my scripts to construct synthetic filesystems using `touch`, `mv`, and `fallocate` ran in milliseconds. After some confusion looking at very similar modification times in the output of `stat` and trying to add some very short `sleep` calls, I found the `-@-1`/`--modify-window=-1` option, which makes `rsync` compare timestamps down to the nanosecond.
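The effect can be reproduced without ZFS. The sketch below is an illustration, not part of my test scripts: the function name and file contents are made up, and it assumes GNU `touch` (for fractional seconds in `-d`), a filesystem that stores sub-second timestamps, and `rsync` 3.1.3 or later.

```shell
# Two same-size files whose mtimes fall within the same second:
# the default quick check treats them as unchanged.
demo_modify_window() {
    dir=$(mktemp -d) || return 1
    mkdir -p "$dir/src"
    for dst in default exact; do
        mkdir -p "$dir/$dst"
        printf 'old\n' > "$dir/$dst/f"
        touch -d '2020-01-01 00:00:00.900' "$dir/$dst/f"
    done
    printf 'new\n' > "$dir/src/f"
    touch -d '2020-01-01 00:00:00.100' "$dir/src/f"

    # Default: whole seconds only, so the file is skipped.
    rsync -a "$dir/src/" "$dir/default/"
    # Nanosecond comparison: the file is seen as changed and copied.
    rsync -a --modify-window=-1 "$dir/src/" "$dir/exact/"

    echo "default: $(cat "$dir/default/f")"
    echo "exact: $(cat "$dir/exact/f")"
    rm -rf "$dir"
}

# Usage:
#   demo_modify_window
```

Under those assumptions, I would expect the `default` destination to keep its stale contents and the `exact` destination to be updated.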
Making the table#
The `measure-all` script runs all of the experiments and dumps the logs to files for later viewing/analysis. To avoid the unnecessary complication of parsing the logs, it also records in separate files the "`USED`" value for the datasets, both in exact and human-readable form:
zfs list -o used -H tank/target > used_human_readable
zfs list -o used -p -H tank/target > used
`zfs list` provides the `-H` option for "scripting mode" that omits headers and uses tabs instead of spaces if multiple columns are requested. The `-p` option is for "parsable (exact) values". And the `-o used` tells it to just give the single column we want.
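If you do request multiple columns, the tab-separated output is easy to split in plain shell. This is a small sketch with a hypothetical function name, assuming the two-column output of `zfs list -Hp -o name,used`:

```shell
# Read tab-separated "name<TAB>used" lines from stdin.
parse_used() {
    tab=$(printf '\t')
    while IFS="$tab" read -r name used; do
        printf '%s uses %s bytes\n' "$name" "$used"
    done
}

# Usage:
#   zfs list -Hp -o name,used -r tank | parse_used
```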
Then `logs-to-table` reads those files to build the Markdown table, using the human-readable files for the cell text and the exact files for deciding whether to label the cell as ✅ or ❌.
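As an illustration of that split between display text and exact values, here is a sketch of how a single cell could be labeled. It is not the logic from `logs-to-table`: the function name is hypothetical, and the 10% tolerance is an assumption chosen only so that small overheads still count as a pass.

```shell
# label_cell HUMAN EXACT ORIG_EXACT
# Prints the human-readable value followed by a pass/fail mark.
label_cell() {
    human=$1 used=$2 orig=$3
    # Assumed threshold: allow up to 10% more space than the original.
    limit=$(( orig + orig / 10 ))
    if [ "$used" -le "$limit" ]; then
        printf '%s✅' "$human"
    else
        printf '%s❌' "$human"
    fi
}

# Usage, given the two files written above and the original's exact size:
#   label_cell "$(cat used_human_readable)" "$(cat used)" "$orig_used"
```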
Conclusions#
After all this, what did I actually learn?
- `--sparse` is needed for `rsync` to properly handle sparse files. However, if you run the experiment with compression enabled for `tank/target`, you'll see that it doesn't actually matter in practice, since you should always enable compression for any real usage of ZFS.
- My `zfs-diff-move.sh` script does what I thought and actually does result in saving space.
- Getting `rsync` to not waste space when modifying a small part of a large file requires both `--inplace` and `--no-whole-file`.
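Putting the last point together with the timestamp fix from earlier, an invocation might look like the sketch below. The function name and directory arguments are placeholders, and this is not the exact command line used by the `rsync-inplace-no-whole-file` script.

```shell
# Hypothetical wrapper: update changed regions of existing files in
# place instead of writing a whole new copy of each changed file.
sync_in_place() {
    # $1 = source directory, $2 = destination directory
    rsync -a --inplace --no-whole-file --modify-window=-1 "$1"/ "$2"/
}

# Usage:
#   sync_in_place /tank/test/.zfs/snapshot/final /tank/target
```

On a plain filesystem this only shows that the destination ends up identical; the space savings only become visible on ZFS when the destination has snapshots.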
One area I didn't cover in these experiments is exactly how these interact with hardlinks and `rsync`'s `-H`/`--hard-links` option to preserve them.
[^1]: Okay, I may have reconnected to the internet to download more packages a few times.