A Weird Imagination

Status of long-running copy

The problem#

When running an incremental backup with rsync with the --progress flag, it often spends lot of time outputting nothing as it scans through many unchanged files. If you think of it before starting the transfer, --info=progress2 or the name2/skip2 --info flags would give more detail, but once the transfer has been going for a while, you probably don't want to cancel and restart it so you can add those flags.

The solution#

The documentation and this StackExchange answer say you can send a SIGVTALRM signal to rsync version 3.2.0+ and it will output its current progress, but that wasn't working for me.

As a workaround, you can use strace to get a running log of which files rsync is looking at, which includes files it skips without actually opening:

strace --attach="$(pidof rsync)" --trace=openat

(If that's not showing anything, try removing the --trace=openat filter and seeing if there's other syscalls with paths to filter on.)

Alternatively, this StackExchange answer suggests a way to see the currently open files including their sizes (including directories but not unchanged files being inspected):

watch lsof -p"$(pidof rsync | tr ' ' ',')"

(The same should work for a recursive cp/mv/rm.)

Similarly, for getting the status of a transfer of a single large file, this answer attempts to read the files cp is reading/writing to give a running percentage of how much it has copied; a similar approach might work for rsync.

The details#

Read more…

Out of inodes, what now?

Posted in

When you start getting disk full messages on Linux, there's a few different reasons why that might happen:

  1. The expected. Too many large files. You can track down large directories using WinDirStat or

    du -hx --max-depth=1 | sort -h
    where the -x option tells du to not cross filesystem boundaries and the -h option to both uses human-readable sizes like 11M or 1G.

  2. Deleted files aren't actually deleted if they are still open. You can use lsof to find open files. Give it the filesystem as an argument like lsof /home.

  3. By default 5% of each filesystem is reserved for writes by root. Depending on what the filesystem is being used for, this may be too much or simply unnecessary. See this Server Fault answer for how to deal with this.

  4. The files could be shadowed by a mount. If a filesystem is mounted over a non-empty directory, the files in that directory aren't visible.

  5. Last, the disk might not actually be out of space at all. It might actually be out of inodes. Some filesystems, notably the ext2/3/4 filesystems used by default on most Linux distributions have a fixed number of inodes allocated at filesystem creation time. The default is high enough that it is unlikely to be an issue unless there are a very large number of empty files. df -i will show the number of inodes free on each filesystem to verify if a filesystem is indeed out of inodes.

    But how do you find those empty files? As described above, du will help find large files, but now we want to find large numbers of files. The following command acts like du -hx --max-depth=$depth | sort -h for inodes instead of file sizes:

    find -xdev | sed "s@\(\([^/]*/\)\{$depth\}[^/]*\).*@\1@" | uniq -c | sort -n
    

    find -xdev lists all of the files under the current directory on the same filesystem. The sed command finds the first $depth directories (ending in /) and discards the rest of the filename (the .* at the end), so each directory appears once for every file or directory anywhere under it. Then the end of the command counts the repeated lines and sorts by those counts, highlighting the directories with the most files.