A Weird Imagination

Future-dating static blog content

The problem

Static site generators are great. But so are blog posts that automatically appear on schedule. How do we reconcile the two? There are solutions that regenerate the site on a fixed schedule, such as every hour or every day, but that seems unsatisfying: if the posts have already been written, the blog should only need to be regenerated exactly when there is new content to publish.

The solution

(These instructions are specifically for Pelican, as that is what this blog uses, but a similar method should work for other static blogging engines.)

Set Pelican's WITH_FUTURE_DATES setting to False so that future-dated posts do not appear as part of the blog, but are generated only as drafts. Then add the following to the article template to include the future publication dates in an easy-to-parse format:

{% if article.status == "draft" %}
    <!-- Post at datetime {{ article.date|strftime("%H:%M %Y-%m-%d") }} -->
{% endif %}

Then the following script schedule_publish.sh uses those comments to schedule rerunning itself using at:

#!/bin/sh

# Pelican publish
make publish

# Clear old queue entries if they call this script.
for q in $(atq -q g | cut -f1)
do
    if [ "$(at -c "$q" | tail -n 2 | head -n 1)" = "$0" ]
    then
        atrm $q
    fi
done

# Check newly published drafts for when they should be published.
# Not using for because output lines have spaces.
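# -h drops the file name and -o keeps only the matched comment,
# so the cut fields do not depend on indentation.
grep -h -o -- '<!-- Post at datetime [^>]* -->' output/drafts/* | cut -d' ' -f5-6 | while read -r time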
do
    # Schedule running this script for that time.
    echo "$0" | at -q g $time
done
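
To try the parsing step on its own, without touching the at queue, something like the following sketch may help. It uses sed rather than the grep and cut pipeline from the script, so it does not depend on field positions; the function name extract_post_times and the sample date are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the "HH:MM YYYY-MM-DD" part of every
# scheduling comment found in the given files (or on standard input).
extract_post_times() {
    sed -n 's/.*<!-- Post at datetime \([0-9:]* [0-9-]*\) -->.*/\1/p' "$@"
}

# Feed it a made-up draft fragment, indented as a template might render it.
printf '%s\n' \
    '<article>' \
    '    <!-- Post at datetime 09:30 2030-01-15 -->' \
    '</article>' | extract_post_times
```

Run against real output, it would be called as extract_post_times output/drafts/* and should print one line per future-dated draft.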

Last, follow the instructions in this blog post and run that script as the deployment task.

The details

Why static?

All of this could be avoided merely by using a dynamic blogging engine like WordPress or one of the many others, but static sites are faster and simpler.[1]

Generating pages takes time. Caching helps, but it's better if it can be avoided entirely; static site generation can be thought of as an aggressive form of caching.

Limiting the complexity of the software running on the server means limiting the attack surface, so there are fewer security concerns to worry about.

at

While cron schedules recurring tasks, such as ones that run once per day or week, at schedules a task to run once at a specific date and time. In our case, we want to schedule the schedule_publish.sh script to run at the publication times of the future-dated articles.
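
As a rough illustration of a one-off at job, here is a small sketch. The schedule_once wrapper and the DRY_RUN switch are hypothetical names introduced only for this example; with DRY_RUN=1 the wrapper prints the pipeline instead of queueing anything.

```shell
#!/bin/sh
# Hypothetical wrapper: run COMMAND once at "HH:MM YYYY-MM-DD".
# With DRY_RUN=1 it only prints the pipeline it would run.
schedule_once() {
    when=$1
    command_line=$2
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'echo %s | at %s\n' "$command_line" "$when"
    else
        # $when is intentionally unquoted so the time and the date
        # reach at as two separate arguments.
        echo "$command_line" | at $when
    fi
}

# Made-up time; nothing is queued because DRY_RUN is set.
DRY_RUN=1
schedule_once "09:30 2030-01-15" "./schedule_publish.sh"
```

After a real (non-dry-run) call, atq lists the pending job and atrm followed by its job number cancels it.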

The times added in the comments are in a format that at understands, so the grep and cut commands only need to extract them. Note the use of the read shell built-in to read each line into the $time variable: the times contain a space between the time and the date, so a for loop would split each one into two separate items.
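
The difference between for and read is easy to see with a couple of made-up times. This is only a demonstration sketch; the function names split_with_for and split_with_read are hypothetical.

```shell
#!/bin/sh
# Two made-up publication times, one per line, as the pipeline would emit them.
times='09:30 2030-01-15
18:00 2030-02-01'

# A for loop over an unquoted variable splits on spaces as well as
# newlines, so each time and each date becomes its own item.
split_with_for() {
    for item in $1
    do
        echo "for: $item"
    done
}

# read consumes one whole line per iteration, keeping time and date together.
split_with_read() {
    printf '%s\n' "$1" | while read -r time
    do
        echo "read: $time"
    done
}

split_with_for "$times"
split_with_read "$times"
```

The for version prints four lines, one per word, while the read version prints two, one per publication time.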

The -q g argument makes at use a separate queue to keep these jobs apart from other uses of at. The choice of queue letter is arbitrary, and the argument could be dropped entirely.


  1. If you wish to use Pelican like this site, I recommend this blog post, but many others exist. Choose the one that's best for you. 
