The problem
I had recently done an apt upgrade that included upgrading ZFS and noticed zpool status showed a weird "(non-allocating)" message, which seemed concerning:
$ zpool status
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-*** ONLINE 0 0 0 (non-allocating)
ata-*** ONLINE 0 0 0 (non-allocating)
errors: No known data errors
The solution
This forum thread suggested the error may be due to a version mismatch between the ZFS tools and the kernel module. I confirmed there was a mismatch:
$ zpool --version
zfs-2.2.3-2
zfs-kmod-2.1.14-1
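If you want to detect this situation from a script, the two lines of zpool --version output can be compared directly. This is just a sketch based on the output format shown above; the helper function name is my own, not a standard tool:

```shell
#!/bin/sh
# Sketch: warn if the ZFS userland tools and kernel module versions differ.
# Assumes `zpool --version` prints exactly the two lines shown above.
zfs_versions_match() {
    # $1 is the output of `zpool --version`
    tools=$(printf '%s\n' "$1" | sed -n 's/^zfs-\([0-9].*\)/\1/p')
    kmod=$(printf '%s\n' "$1" | sed -n 's/^zfs-kmod-//p')
    [ -n "$tools" ] && [ "$tools" = "$kmod" ]
}

if ! zfs_versions_match "$(zpool --version)"; then
    echo "warning: ZFS tools and kernel module versions differ" >&2
fi
```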
The easy way to load the new version of a kernel module after an update is to reboot the computer. But if you don't want to do that, here's the general outline of the commands I ran to unload and reload ZFS (run as root):
# Stop using ZFS
$ zfs umount -a
$ zpool export tank
$ service zfs-zed stop
# Remove modules
$ rmmod zfs
$ rmmod spl
# will show error: rmmod: ERROR: Module spl is in use by: ...
# repeatedly rmmod dependencies until spl is removed.
# Reload ZFS
$ modprobe zfs
$ service zfs-zed start
$ zpool import tank
The details
Example commands as root
As most of the commands in this post require root, I've omitted explicitly putting sudo before every command. I've mostly tried to write # to indicate a root prompt (e.g. from running sudo -i) to distinguish from a $ user prompt for commands that do not require root, although I did not do so in the previous section because I wanted lines starting with # to get rendered as comments.
Unmounting filesystems
zfs umount -a will try to unmount all ZFS filesystems, and for any filesystem it cannot unmount, it will print a message like
cannot unmount '/mnt/tank/foo': pool or dataset is busy
lsof can determine what programs are keeping the dataset busy:
$ lsof /mnt/tank/foo
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 3529432 perelman cwd DIR 0,101 6 34 /mnt/tank/foo
That shows there's a bash shell open to that directory, so simply changing that shell to a directory not mounted by ZFS is sufficient. (Or there's the more aggressive solution of using the PID to kill the process.)
Once all of the ZFS filesystems are unmounted, you can stop the remaining software using ZFS by "exporting" the pool:
# zpool export tank
and stopping the ZFS Event Daemon:
# service zfs-zed stop
Unloading kernel module
If a kernel module is unused, it's straightforward to remove with modprobe -r or rmmod:
# modprobe -r zfs
modprobe: FATAL: Module zfs is in use.
Well, not so simple, since we have to remove modules that depend on it as well. We can find out what modules are using a module by running lsmod:1
$ lsmod | grep zfs
zfs 6565888 8
zzstd 15344 2 zfs
znvpair 14643 2 zfs
zcommon 15654 2 zfs
zavl 13354 2 zfs
spl 147456 6 zfs,icp,zzstd,znvpair,zcommon,zavl
If you're unsure what a module is, the modinfo command is one source of additional information:
# modinfo spl | grep description:
description: Solaris Porting Layer
Even though there's no "z" in the name, that seems likely to be only for ZFS, as ZFS was originally part of Solaris.
Unloading related kernel modules, too
Initially, unloading the zfs module did not require unloading the other modules, but then when I tried to load the new version, I got an error:
# modprobe zfs
modprobe: ERROR: could not insert 'zfs': Invalid argument
Not a super informative error message, but module loading/unloading often generates messages in dmesg, and I found some details there:
# dmesg | tail
zfs: disagrees about version of symbol spl_kmem_cache_create
zfs: Unknown symbol spl_kmem_cache_create (err -22)
zfs: disagrees about version of symbol __cv_wait_io_sig
zfs: Unknown symbol __cv_wait_io_sig (err -22)
zfs: disagrees about version of symbol taskq_create
zfs: Unknown symbol taskq_create (err -22)
Seeing the message about spl_kmem_cache_create first suggested that I needed to load the latest version of the spl module in order to load the new version of the zfs module.
Unloading spl
Of course, spl had multiple modules depending on it, so I couldn't just remove it:
# modprobe -r spl
modprobe: FATAL: Module spl is in use.
rmmod gave more informative error messages, so I used it to determine what to actually unload:
# rmmod spl
rmmod: ERROR: Module spl is in use by: icp zzstd znvpair zcommon zavl
# rmmod spl icp zzstd znvpair zcommon zavl
rmmod: ERROR: Module spl is in use by: icp zzstd znvpair zcommon zavl
rmmod: ERROR: Module znvpair is in use by: zcommon
# rmmod zcommon
rmmod: ERROR: Module zcommon is not currently loaded
# rmmod spl znvpair
rmmod: ERROR: Module spl is in use by: znvpair
# rmmod spl
As you can tell from this log, rmmod doesn't do anything smart with the list of modules to remove; it just tries them in order, so you need to give the dependencies first, then the module that depends on them. Or just rerun the command and get some messages about modules not being loaded because they've already been removed.
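That manual retry dance can also be automated. Here's a sketch (the function names are mine, not standard tools) that parses lsmod to find each module's users and removes the leaves first. It assumes all ZFS filesystems are already unmounted and the pool exported, and it must run as root:

```shell
#!/bin/sh
# deps_of prints the modules currently using the given module,
# parsed from the fourth column of `lsmod` output.
deps_of() {
    # $1: module name, $2: `lsmod` output to parse
    printf '%s\n' "$2" | awk -v m="$1" '$1 == m { print $4 }' | tr ',' ' '
}

# unload_tree removes every module that depends on $1, then $1 itself.
unload_tree() {
    for dep in $(deps_of "$1" "$(lsmod)"); do
        unload_tree "$dep"
    done
    rmmod "$1"
}

# unload_tree spl   # run as root, with filesystems unmounted and pool exported
```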
Removing the other stale modules
Now with the spl module removed, I got a new error message:
# modprobe zfs
modprobe: ERROR: could not insert 'zfs': Exec format error
I checked and saw spl was loaded (presumably the new version) and zfs was not:
# lsmod | grep spl
spl 163840 0
# lsmod | grep zfs
# lsmod | grep zcommon
Checking dmesg again, I found
# dmesg | tail -1
zfs: exports duplicate symbol luaL_argerror (owned by zlua)
and noticed zlua was still loaded:
# lsmod | grep zlua
zlua 233472 0
After rmmod zlua, I got a similar message about zunicode:
# dmesg | tail -1
zfs: exports duplicate symbol u8_strcmp (owned by zunicode)
But after removing zunicode as well, modprobe zfs succeeded and I restarted the zfs-zed service:
# service zfs-zed start
Checking zpool --version, I confirmed the correct version was now being used:
$ zpool --version
zfs-2.2.3-2
zfs-kmod-2.2.3-2
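As a cross-check, the version of the loaded kernel module can also be read directly from sysfs, without going through the ZFS tools at all. Not every module publishes a version file there, but the zfs module does; the helper name here is my own:

```shell
#!/bin/sh
# Print the version of a loaded kernel module from sysfs,
# e.g. /sys/module/zfs/version for ZFS.
module_version() {
    cat "/sys/module/$1/version" 2>/dev/null || echo "not loaded"
}

module_version zfs
```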
Upgrading the pool
I brought the pool back online:
# zpool import tank
# zpool status
pool: tank
state: ONLINE
status: Some supported and requested features are not enabled on the pool.
The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
the pool may no longer be accessible by software that does not support
the features. See zpool-features(7) for details.
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
ata-*** ONLINE 0 0 0
ata-*** ONLINE 0 0 0
errors: No known data errors
and the message about "(non-allocating)" was in fact gone. But there was a new message about zpool upgrade.
This StackExchange answer points out that zpool get all tank | grep feature@ will list the new features, and man zpool-features has information on all of the features (not linking because you may need to run it on your own machine to get the latest information on features):
# zpool get all tank | grep feature@
tank feature@async_destroy enabled local
tank feature@empty_bpobj active local
tank feature@lz4_compress active local
[...]
tank feature@zilsaxattr disabled local
tank feature@head_errlog disabled local
tank feature@blake3 disabled local
tank feature@block_cloning disabled local
tank feature@vdev_zaps_v2 disabled local
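To see only the features an upgrade would enable, you can filter that output down to the disabled ones. The awk program here is my own, written against the column layout shown above:

```shell
# Filter `zpool get` output down to disabled features,
# stripping the "feature@" prefix for readability.
list_disabled='$2 ~ /^feature@/ && $3 == "disabled" { sub("feature@", "", $2); print $2 }'
zpool get all tank | awk "$list_disabled"
```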
You can also get the list of features to be added by an upgrade by running zpool upgrade without the pool name:
# zpool upgrade
This system supports ZFS pool feature flags.
All pools are formatted using feature flags.
Some supported features are not enabled on the following pools. Once a
feature is enabled the pool may become incompatible with software
that does not support the feature. See zpool-features(7) for details.
Note that the pool 'compatibility' feature can be used to inhibit
feature upgrades.
POOL FEATURE
---------------
tank
zilsaxattr
head_errlog
blake3
block_cloning
vdev_zaps_v2
But be careful, as the very similar command zpool upgrade tank will actually apply the upgrades to the pool tank:
# zpool upgrade tank
This system supports ZFS pool feature flags.
Enabled the following features on 'tank':
zilsaxattr
head_errlog
blake3
block_cloning
vdev_zaps_v2
If you decide to be careful and not upgrade, you can tell zpool to stop nagging you by setting the compatibility property.
Alternatively, you can run zpool upgrade tank to upgrade, but mind that this is not reversible, so don't do it unless you actually want the new features and are really sure you'll never need to read your ZFS pool with an older version of ZFS.
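For example, if I recall correctly, the compatibility property can be set to one of the named feature sets shipped as files in /usr/share/zfs/compatibility.d/; the exact names available depend on your ZFS install, so check that directory first:

```shell
# Sketch: pin the pool to a named feature set so `zpool status`
# stops suggesting upgrades. List the available feature sets:
#   ls /usr/share/zfs/compatibility.d/
# Then, for example (as root):
#   zpool set compatibility=openzfs-2.1-linux tank
#   zpool get compatibility tank
```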
1. Sorry, that lsmod output is faked based on my recollection as it's no longer within my terminal history and I can't load the old version of the modules to recreate it. ↩