A Weird Imagination

Relative links in feeds

The problem#

In an RSS/Atom feed, relative links are a bad idea because it's unclear what they're relative to. There are ways to specify a base for them to be relative to, but since feed readers do not consistently respect those mechanisms, it's safer to just always use absolute URLs in feeds. And Pelican recommends setting RELATIVE_URLS = False to always generate absolute URLs. But that setting does not apply to the anchor links generated by the Markdown toc extension to link to headers.

The solution#

I wrote a Pelican plugin, absolute_anchors which rewrites all link destinations starting with # in every article to add the absolute URL of the article at the beginning of the link.

The details#

Read more…

Pelican publish without downtime

Posted in

The problem#

My existing script for publishing my blog has Pelican run on the web server and generate the static site directly into the directory served by nginx. This has the effect that while the blog is being published, it is inaccessible or some of the pages or styles are missing. The publish takes well under a minute, so this isn't a big issue, but there's no reason for any downtime at all.

The solution#

Instead of serving the output/ directory, instead generate it and then copy it over by changing the make publish line in schedule_publish.sh to the following:

make publish || exit 1
if [ -L output_dir ]
then
    cp -r output output_dir/
    rm -rf output_dir/html.old
    mv output_dir/html output_dir/html.old
    mv output_dir/output output_dir/html
fi

where output_dir/ is a symbolic link to the parent of the directory actually being served and html/ is the directory actually being served (which output/ previously was a symbolic link to).

The details#

Read more…

Timezones and scheduling tasks with at

The problem#

My system for automatically posting future-dated blog posts mysteriously stopped working recently. The posts would appear if I manually published the blog, but not with the automatic scheduling mechanism.

The solution#

In schedule_publish.sh, I changed the line

echo "$0" | at -q g $time

to

if [ "$(date -d "$time PST" +'%s')" -ge "$now" ]
then
    echo "$0" | at -q g -t "$(date +'%Y%m%d%H%M' -d "$time PST")"
fi

(where "PST" is the timezone of this blog; adjust as appropriate for your blog). $now is initialized with

now="$(date +'%s')"

before the call to make publish to avoid a race condition.

The details#

Read more…

My first Pelican plugin

The problem#

My previous blog post has a footnote in the first sentence. Due to the way footnotes are handled, the footnote reference is a link to #fn:prg, which works fine if the footnote is actually on the page, but on the blog main page (or any other listing of multiple articles) the footnote is not present because it's after the Read more… link. The result is that on those pages, all footnote references are broken links. These broken links should either be repaired such that they point to the article page or removed.

First attempt#

Unable to find an existing solution, I decided to write my own plugin, summary_footnotes. I started by finding another plugin, clean_summary that modifies summary and based my code off of it. That plugin uses Beautiful Soup to parse the summary and rewrite it. A quick look at the docs and I was able to figure out how to select the footnote links and rewrite them, which got me this version of the plugin.

Read more…

Future-dating static blog content

Posted in

The problem#

Static site generators are great. But so are blog posts that automatically appear on schedule. How do we reconcile the two? There are solutions involving checking for updates on a schedule like every hour or every day, but that seems unsatisfying: if the posts have already been written, the blog should only need to be regenerated exactly when there is new content to publish.

The solution#

(These instructions are specifically for Pelican as that is what this blog uses, a similar method should work for other static blogging engines.)

Use Pelican's WITH_FUTURE_DATES setting to make future dated posts not appear as part of the blog, but only as drafts. Add the following to the article template in order to include the future publication dates in an easy to parse format:

{% if article.status == "draft" %}
    <!-- Post at datetime {{ article.date|strftime("%H:%M %Y-%m-%d") }} -->
{% endif %}

Then the following script schedule_publish.sh uses those comments to schedule rerunning itself using at:

#!/bin/sh

# Pelican publish
make publish

# Clear old queue entries if they call this script.
for q in `atq -q g | cut -f1`
do
    if [ `at -c $q | tail -2 | head -1` = "$0" ]
    then
        atrm $q
    fi
done

# Check newly published drafts for when they should be published.
# Not using for because output lines have spaces.
grep -F -- '<!-- Post at datetime ' output/drafts/* | cut -d' ' -f5-6 | while read time
do
    # Schedule running this script for that time.
    echo "$0" | at -q g $time
done

Last, follow the instructions in this blog post and run that script as the deployment task.

The details#

Read more…

Hello World!

Posted in

Welcome to my blog. I am Daniel Perelman; I am presently a computer science graduate student at the University of Washington.

Much of my time is spent writing programs and using various highly-configurable tools like Bash, Vim, and LaTeX. I, like many other users of these tools, find myself often performing web searches for help on how to use these tools. Thanks to StackExchange and myriad technical blogs, the answers I am looking for are often close at hand. But not always. Sometimes I end up piecing together a solution from many sources. This blog is a place for me to save others time by sharing that knowledge.