A Weird Imagination

Ordering saves by date

The problem

Last week, I shared a script which continuously backed up game saves whenever the game saved. The result is a series of directories that contain snapshots of the game saves from every autosave. But to view this data, we really want a list of the unique files, in order, each marked with the time it was created.

The solution

The following will create symbolic links to the unique files, with each link named after the modification date of the file it points to:

for i in /tank/factorio/.zfs/snapshot/*/*.zip
do
  ln -sf "$i" "$(stat --printf=%y "$i").zip"
done

or if you want a custom date format, you can use date:

  ln -sf "$i" "$(date -r "$i" +%Y-%m-%d_%H-%M-%S).zip"

Alternatively, the following will just list the unique files with their timestamps:

find /tank/factorio/.zfs/snapshot/ -printf "%T+ %p\n" \
    | sort | uniq --check-chars=30

The details

Unique dates

Due to how the files were created, all of the files are essentially different versions of the same file distinguished only by their timestamp. Most of them will have names like _autosave1.zip that we don't care about. Furthermore, they are uniquely identified by their timestamp; that is, any two files with the same timestamp are the same file, so we don't have to bother comparing their contents to check. In fact, they should literally be hardlinked to each other, so we could compare their inode numbers instead of timestamps:

find /tank/factorio/.zfs/snapshot/ -printf "%9i %p\n" \
    | sort -n | uniq --check-chars=10

But it's useful to identify the saves by the time they were created, and the inode numbers do not convey any useful information.
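To make the inode comparison concrete, here is a minimal sketch that runs in a scratch directory rather than on the real snapshots. The directory and the file names original.zip, hardlink.zip, and copy.zip are hypothetical, and GNU stat is assumed. It shows that a hard link shares its inode number with the original while an independent copy does not:

```shell
# Scratch directory; nothing here touches the real snapshot tree.
demo=$(mktemp -d)

echo "save data" > "$demo/original.zip"
ln "$demo/original.zip" "$demo/hardlink.zip"   # hard link: same inode
cp "$demo/original.zip" "$demo/copy.zip"       # separate file: new inode

# %i is the inode number (GNU stat).
inode_original=$(stat -c %i "$demo/original.zip")
inode_hardlink=$(stat -c %i "$demo/hardlink.zip")
inode_copy=$(stat -c %i "$demo/copy.zip")

echo "original: $inode_original"
echo "hardlink: $inode_hardlink"
echo "copy:     $inode_copy"

rm -rf "$demo"
```

The first two numbers should match and the third should differ, which is why grouping by inode finds the same duplicates as grouping by timestamp.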

That leads to the simple algorithm above: just iterate over all of the files and symlink each one to a filename containing just its modification date. There will be collisions, but we don't care which file wins when there's a collision because they're indistinguishable. stat and date both provide a way to get a human-readable (and sortable) string of the modification date of a file.
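As a rough illustration of how the collisions play out, the following sketch runs the same loop against a throwaway directory. The snap1, snap2, and links directories and the fixed timestamps are made up for the demo, and GNU touch and date are assumed. Two of the three files share a modification time, so only two links should survive:

```shell
# Hypothetical scratch layout standing in for the snapshot directories.
demo=$(mktemp -d)
mkdir -p "$demo/snap1" "$demo/snap2" "$demo/links"

# Both snapshots hold the same save (same mtime); snap2 also has a newer one.
touch -d '2020-01-01 12:00:00' "$demo/snap1/_autosave1.zip" "$demo/snap2/_autosave1.zip"
touch -d '2020-01-01 12:05:00' "$demo/snap2/_autosave2.zip"

for i in "$demo"/snap*/*.zip
do
  # -f lets a later file with the same date replace the earlier link.
  ln -sf "$i" "$demo/links/$(date -r "$i" +%Y-%m-%d_%H-%M-%S).zip"
done

link_names=$(ls "$demo/links")
link_count=$(ls "$demo/links" | wc -l)
echo "$link_names"

rm -rf "$demo"
```

Here the link for 12:00:00 ends up pointing at whichever of the two identical saves the loop visited last, which is harmless for the reason given above.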

Sorting result of find

This StackExchange answer gave the method for sorting the results of find by modification date: it prints the timestamp in a sortable format before each filename and passes that list to sort. Then to get just a single file for each modification date, I observed the dates were always 30 characters long, so I used the --check-chars option to uniq to only compare the dates when deciding if lines were unique.
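If hard-coding the width of the date feels fragile, one possible variation (not what I used above) is to deduplicate on the first whitespace-separated field with awk instead of counting characters. This sketch assumes GNU find and touch and uses a hypothetical scratch directory with made-up timestamps in place of the real snapshots:

```shell
# Hypothetical scratch layout standing in for the snapshot directories.
demo=$(mktemp -d)
mkdir -p "$demo/snap1" "$demo/snap2"
touch -d '2020-01-01 12:00:00' "$demo/snap1/_autosave1.zip" "$demo/snap2/_autosave1.zip"
touch -d '2020-01-01 12:05:00' "$demo/snap2/_autosave2.zip"

# Sort by the leading timestamp, then keep only the first line seen for
# each timestamp ($1), regardless of how many characters it occupies.
unique_saves=$(find "$demo" -type f -name '*.zip' -printf '%T+ %p\n' \
    | sort | awk '!seen[$1]++')
unique_count=$(printf '%s\n' "$unique_saves" | wc -l)
echo "$unique_saves"

rm -rf "$demo"
```

This should print one line per distinct modification time. The uniq --check-chars version is shorter, but this one keeps working if the timestamp format ever changes length.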
