The problem
When running an incremental backup with rsync with the --progress flag, it often spends a lot of time outputting nothing as it scans through many unchanged files. If you think of it before starting the transfer, --info=progress2 or the name2/skip2 --info flags would give more detail, but once the transfer has been going for a while, you probably don't want to cancel and restart it just to add those flags.
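For the next run, an invocation with those flags might look like the sketch below. The backup wrapper, the INFO_FLAGS variable, and the paths in the usage comment are hypothetical names chosen for illustration; the --info option itself needs rsync 3.1.0 or later.

```shell
# Assumed flag set: overall progress, names of unchanged files too,
# and files skipped because of transfer overrides.
INFO_FLAGS="progress2,name2,skip2"

# backup SRC DEST: hypothetical wrapper around an archive-mode rsync run.
backup() {
    rsync --archive --info="$INFO_FLAGS" "$1" "$2"
}

# Usage (placeholder paths):
#   backup /path/to/source/ /path/to/backup/
```

With name2, rsync also lists unchanged files as it passes them, so the output keeps moving during a long scan.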
The solution
The documentation and this StackExchange answer say you can send a SIGVTALRM signal to rsync version 3.2.0+ and it will output its current progress, but that wasn't working for me. As a workaround, you can use strace to get a running log of which files rsync is looking at, which includes files it skips without actually opening:
strace --attach="$(pidof rsync)" --trace=openat
(If that's not showing anything, try removing the --trace=openat filter and seeing whether there are other syscalls with paths to filter on.)
Alternatively, this StackExchange answer suggests a way to see the currently open files along with their sizes (this includes directories, but not unchanged files that are merely being inspected):
watch lsof -p"$(pidof rsync | tr ' ' ',')"
(The same should work for a recursive cp/mv/rm.)
Similarly, for getting the status of a transfer of a single large file, this answer attempts to read the files cp is reading/writing to give a running percentage of how much it has copied; a similar approach might work for rsync.
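One way to approximate that idea on Linux is to compare each open file's current offset, as reported in /proc, with its size. This is a rough sketch rather than the linked answer's actual script: the percent and fd_progress function names are made up here, and it assumes a Linux /proc filesystem and GNU stat.

```shell
# percent POS SIZE: print POS as an integer percentage of SIZE.
# Fails (prints nothing) when SIZE is zero.
percent() {
    [ "$2" -gt 0 ] || return 1
    echo $(( $1 * 100 / $2 ))
}

# fd_progress PID: for every regular file PID has open, print how far
# its file offset is through the file.
fd_progress() {
    for fd in /proc/"$1"/fd/*; do
        [ -f "$fd" ] || continue    # -f follows the link; skips pipes, sockets, directories
        # The "pos:" line of fdinfo holds the descriptor's current offset.
        pos="$(awk '/^pos:/ {print $2}' "/proc/$1/fdinfo/${fd##*/}")"
        [ -n "$pos" ] || continue
        size="$(stat -L -c %s "$fd")"
        pct="$(percent "$pos" "$size")" || continue    # skip empty files
        printf '%3d%%  %s\n' "$pct" "$(readlink "$fd")"
    done
}

# Usage: fd_progress "$(pidof -s cp)"
```

The percentage is only meaningful for the file being read; a destination file being appended to will normally sit at 100% because its offset is always at the end.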
The details
SIGINFO on BSD and Mac OS X
On BSD (and inherited by Mac OS X), sending SIGINFO, which by default is sent to the current process by typing Ctrl+T, causes many utilities to output a status line, including rsync and cp. For whatever reason, Linux does not have this feature. Instead, on Linux, rsync has SIGVTALRM as an alternative, while dd uses SIGUSR1 and cp seems to just not have an equivalent feature.
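The per-tool signals on Linux can be collected into a small helper, sketched below. The progress_signal and request_progress names are hypothetical. One caution: the default action for SIGVTALRM is to terminate the process, so sending it to an rsync older than 3.2.0 will likely kill the transfer; check rsync --version first.

```shell
# progress_signal TOOL: print the signal that asks TOOL for a status
# line on Linux, per the discussion above. Fails for unknown tools.
progress_signal() {
    case "$1" in
        rsync) echo VTALRM ;;   # rsync 3.2.0+ only; older versions would be terminated
        dd)    echo USR1 ;;
        *)     return 1 ;;      # e.g. cp: no known equivalent
    esac
}

# request_progress TOOL: send that signal to every running TOOL process.
request_progress() {
    sig="$(progress_signal "$1")" || { echo "no progress signal known for $1" >&2; return 1; }
    pids="$(pidof "$1")" || { echo "$1 is not running" >&2; return 1; }
    # pidof prints space-separated PIDs; left unquoted on purpose so
    # each PID becomes a separate argument to kill.
    kill -s "$sig" $pids
}

# Usage: request_progress dd
```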
strace
strace logs all of the system calls made by a process (or set of processes). Since any filesystem operation involves a system call, that's sufficient granularity to get the information we want. Note that without the filter it will also dump the contents of any file writes, which is likely much more detail than you want, although not relevant if rsync is just inspecting files and deciding to not actually do any transfers. I selected the filter just by running it without the filter and seeing which lines seemed useful.
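If the raw syscall lines are still too noisy, the paths can be pulled out with a small filter. This is a minimal sketch: extract_paths is a made-up name, and the sed pattern assumes strace's usual openat(AT_FDCWD, "path", ...) line format and does not handle paths containing escaped double quotes.

```shell
# extract_paths: read strace output on stdin and print only the quoted
# path argument of each openat() call; all other lines are dropped.
extract_paths() {
    sed -n 's/^.*openat([^"]*"\([^"]*\)".*$/\1/p'
}

# Usage (strace writes its log to stderr, hence the 2>&1):
#   strace --attach="$(pidof rsync)" --trace=openat 2>&1 | extract_paths
```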
The documentation explicitly states the --attach option can accept a whitespace-delimited list of PIDs in order to directly accept the result of pidof or pgrep. Also note that when using --attach, killing strace doesn't kill the process it's watching, so it's safe to attach for a few seconds and then kill strace once you've verified there's really some work still being done.
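That attach-briefly-then-detach check can be scripted with timeout from GNU coreutils, as sketched below. The peek_syscalls name and the 5-second default are arbitrary choices for illustration. The sketch sends SIGINT, which mirrors pressing Ctrl+C, and on many distributions attaching to a process you did not start requires root.

```shell
# peek_syscalls NAME [SECONDS]: attach strace to every process called
# NAME, log openat calls for SECONDS (default 5), then detach by
# sending strace the same SIGINT that Ctrl+C would.
peek_syscalls() {
    timeout --signal=INT "${2:-5}" \
        strace --attach="$(pidof "$1")" --trace=openat
}

# Usage: peek_syscalls rsync 10
```

If nothing is printed before the timeout expires, the process may be idle, or it may be using a syscall other than openat.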
lsof
lsof lists the files currently open on a filesystem or by a process. It doesn't continuously log open files, so it has to be combined with watch to display a snapshot of the currently open files every couple of seconds. Notably, when using this to watch an rsync task scanning through a lot of files it wasn't transferring, about half the time it showed no open files, so it may be less useful than the strace command for that use case. On the other hand, it will show the size of the open files, which effectively gives a rough status of large file transfers.
Unlike strace, lsof requires multiple PIDs to be comma-separated, so my command above has to use tr to replace the spaces with commas.
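The space-to-comma conversion can be checked in isolation, as in the sketch below; join_pids is a hypothetical helper name. As an alternative that skips tr entirely, pgrep from procps can emit the comma-separated list directly through its -d (delimiter) option.

```shell
# join_pids: turn pidof's space-separated PID list (on stdin) into the
# comma-separated form that lsof -p expects.
join_pids() {
    tr -s ' ' ','
}

# Usage:
#   lsof -p "$(pidof rsync | join_pids)"
# Alternative without tr, assuming procps pgrep (-x matches the exact name):
#   lsof -p "$(pgrep -d, -x rsync)"
```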