The problem#
Last week, I shared a script which continuously backed up game saves whenever the game saved. The result is a series of directories that contain snapshots of the game saves from every autosave. But to view this data, we really want an ordered list of the unique files, each marked with the time it was created.
The solution#
The following will create symbolic links to the unique files named after their modification date:
for i in /tank/factorio/.zfs/snapshot/*/*.zip
do
ln -sf "$i" "$(stat --printf=%y "$i").zip"
done
Or, if you want a custom date format, you can use date:
ln -sf "$i" "$(date -r "$i" +%Y-%m-%d_%H-%M-%S).zip"
Alternatively, the following will just list the unique files with their timestamps:
find /tank/factorio/.zfs/snapshot/ -printf "%T+ %p\n" \
| sort | uniq --check-chars=30
The details#
Unique dates#
Due to how the files were created, all of the files are essentially
different versions of the same file distinguished only by their
timestamp. Most of them will have names like _autosave1.zip that we
don't care about. Furthermore, they are uniquely identified by their
timestamp; that is, any two files with the same timestamp are the same
file, so we don't have to bother comparing their contents to check. In
fact, they should literally be hardlinked to each other, so we could
compare their inode numbers instead of timestamps:
find /tank/factorio/.zfs/snapshot/ -printf "%9i %p\n" \
| sort -n | uniq --check-chars=10
But it's useful to identify the saves by the time they were created, and the inode numbers do not convey any useful information.
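To illustrate the difference, here is a small sketch that hard-links a file and prints each name's inode number next to its modification time. The temporary directory, file names, and timestamp are made up for illustration, and it assumes GNU coreutils (touch -d and stat --printf).

```shell
# Throwaway directory standing in for two snapshots of the same save.
dir=$(mktemp -d)
echo "save data" > "$dir/a.zip"
touch -d '2021-01-01 10:00:00' "$dir/a.zip"   # made-up modification time

# A hard link is a second name for the same inode.
ln "$dir/a.zip" "$dir/b.zip"

# Print inode number, modification time, and name for both.
stat --printf='%i  %y  %n\n' "$dir/a.zip" "$dir/b.zip"
```

Both lines should show the same inode number and the same timestamp, but only the timestamp says anything about when the save was made.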
That leads to the simple algorithm above: just iterate over all of the
files and symlink each one to a filename containing just its
modification date. There will be collisions, but we don't care which
file wins when there's a collision because they're indistinguishable.
stat and date both provide a way to get a human-readable (and sortable)
string of the modification date of a file.
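To see the collision behavior concretely, here is a minimal sketch that runs the same loop against a throwaway directory tree rather than the real snapshot directory. The snapshot names (snap1, snap2, snap3) and the timestamps are hypothetical, and it assumes GNU touch and date.

```shell
# Hypothetical stand-in for /tank/factorio/.zfs/snapshot/
snapdir=$(mktemp -d)
linkdir=$(mktemp -d)

# Three "snapshots": the first two hold the same save (same mtime),
# the third holds a newer one.
mkdir "$snapdir/snap1" "$snapdir/snap2" "$snapdir/snap3"
touch -d '2021-01-01 10:00:00' "$snapdir/snap1/_autosave1.zip"
touch -d '2021-01-01 10:00:00' "$snapdir/snap2/_autosave1.zip"
touch -d '2021-01-01 10:05:00' "$snapdir/snap3/_autosave1.zip"

# Same loop as above; links are created in the current directory.
cd "$linkdir" || exit 1
for i in "$snapdir"/*/*.zip
do
    # -f lets a later file with the same timestamp replace the earlier link.
    ln -sf "$i" "$(date -r "$i" +%Y-%m-%d_%H-%M-%S).zip"
done

ls "$linkdir"
```

Three input files go in, but only two links should come out, because the two saves with the same modification time map to the same link name.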
Sorting the result of find#
This StackExchange answer gave the method for sorting the results of
find by modification date: it prints the timestamp in a sortable format
before each filename and passes that list to sort. Then, to get just a
single file for each modification date, I observed the dates were always
30 characters long, so I used the --check-chars option to uniq to only
compare the dates when deciding if lines were unique.
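As a rough illustration of the --check-chars step, the sketch below feeds a few made-up lines in the same "timestamp, space, path" layout through sort and uniq. The paths are hypothetical and the sample timestamps are written by hand to be 30 characters long; it assumes GNU uniq, which is where --check-chars comes from.

```shell
# Made-up sample of what the find output might look like.
sample='2021-01-01+10:05:00.0000000000 snap3/_autosave1.zip
2021-01-01+10:00:00.0000000000 snap2/_autosave1.zip
2021-01-01+10:00:00.0000000000 snap1/_autosave1.zip'

# Sort by timestamp, then keep the first line of each run whose
# first 30 characters (the timestamp) match.
unique=$(printf '%s\n' "$sample" | sort | uniq --check-chars=30)
printf '%s\n' "$unique"
```

Only one line per timestamp should remain. Which path survives for a duplicated timestamp depends on the sort order, which doesn't matter here since the files are indistinguishable.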