The problem
My system for automatically posting future-dated blog posts mysteriously stopped working recently. The posts would appear if I manually published the blog, but not with the automatic scheduling mechanism.
The solution
In `schedule_publish.sh`, I changed the line
echo "$0" | at -q g $time
to
if [ "$(date -d "$time PST" +'%s')" -ge "$now" ]
then
    echo "$0" | at -q g -t "$(date +'%Y%m%d%H%M' -d "$time PST")"
fi
(where "PST" is the timezone of this blog; adjust as appropriate for your blog). `$now` is initialized with `now="$(date +'%s')"` before the call to `make publish` to avoid a race condition.
The details
Getting a log
`at` by default sends a log of the output to `mail`:
$ mail
Mail version 8.1.2 01/15/2001. Type ? for help.
"/var/mail/anyoneeb": 235 messages 235 new
>N 1 anyoneeb@host Sun Dec 30 02:00 20/844 Output from your job
& 1
Message 1:
From anyoneeb@host Sun Dec 30 02:00:15 2018
Envelope-to: anyoneeb@host
Delivery-date: Sun, 30 Dec 2018 02:00:15 -0500
Subject: Output from your job 21
To: anyoneeb@host
From: anyoneeb@host
Date: Sun, 30 Dec 2018 02:00:15 -0500
pelican /home/anyoneeb/sites/aweirdimagination/deploy/content -o /home/anyoneeb/sites/aweirdimagination/deploy/output -s /home/anyoneeb/sites/aweirdimagination/deploy/publishconf.py
Processed 1 comment(s)
Done: Processed 57 articles, 1 drafts, 2 pages and 0 hidden pages in 14.23 seconds.
at: refusing to create job destined in the past
warning: commands will be executed using /bin/sh
job 9 at Sun Dec 30 02:00:00 2018
This log is saying that at 2:00, as expected, the publish script ran successfully… but then found the post scheduled for 2:00 in the drafts folder and tried to schedule it for 2:00, which failed with `at` reporting
at: refusing to create job destined in the past
So, 2:00 is both in the future (because the post wasn't published) and in the past (because `at` can't schedule publishing it). The key is in the date line that says
Date: Sun, 30 Dec 2018 02:00:15 -0500
even though the timezone this blog is in is `-0800`, not `-0500`.
What was happening is that the publish was getting scheduled for `02:00-0500`, which is `23:00-0800` the previous night in the timezone of the blog: 3 hours before the post is due to be published, so of course the post isn't getting published.
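The three-hour gap can be checked with `date` itself. This is a small sketch, assuming GNU `date` (for `-d`); the `in_zone` helper is a hypothetical name, and `PST8` is a POSIX-style `TZ` string for a fixed `-0800` offset, used here so the result does not depend on the server's own timezone.

```shell
# in_zone ZONE TIMESTAMP: print TIMESTAMP as wall-clock time in ZONE.
# (hypothetical helper; requires GNU date for -d)
in_zone() {
    TZ="$1" date -d "$2" +'%Y-%m-%d %H:%M %z'
}

# The moment the at job ran (from the mail log), seen from the blog's timezone.
in_zone PST8 '2018-12-30 02:00:15 -0500'
# prints: 2018-12-29 23:00 -0800
```

The job ran at 23:00 the previous evening in blog time, which is why the 02:00 post was still treated as a draft.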
Now, one fix would have been to just change the timezone of the server, but it seems broken that the server timezone matters to the blog.
Timezone for `at`
First I tried to provide the timezone to `at` directly, but it does not appear to support that (or, at least, the `man` page doesn't mention any such support). Then I found a suggestion for using `date` to convert between timezones, since `date` will accept a timezone in its `-d` argument (example from the linked page):
$ date -d '2014-06-26 23:00 CEST'
Fri Jun 27 07:00:00 EST 2014
Then I just had to figure out how to get `date` to output in a format `at` would accept. `at`'s `-t` argument accepts a time in a specific format (`[[CC]YY]MMDDhhmm[.ss]`), so I used that along with `date`'s `+` argument to control its output format:
$ date -d '02:00 2018-12-30 PST'
Sun Dec 30 05:00:00 EST 2018
$ date +"%Y%m%d%H%M" -d '02:00 2018-12-30 PST'
201812300500
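The same conversion can be wrapped in a small function. This is only a sketch, assuming GNU `date`; `at_timestamp` is a hypothetical name, and the subshell with `TZ=EST5` merely simulates a server at `-0500` so the output is reproducible on any machine.

```shell
# at_timestamp WHEN: print WHEN in the [[CC]YY]MMDDhhmm form used with at -t,
# expressed in the timezone the current process is running in.
# (hypothetical helper; requires GNU date for -d)
at_timestamp() {
    date -d "$1" +'%Y%m%d%H%M'
}

# Simulate a server at -0500 inside a subshell so TZ does not leak out.
( export TZ=EST5; at_timestamp '02:00 2018-12-30 PST' )
# prints: 201812300500
```

On the real server the subshell is unnecessary, since `date` already runs in the server's timezone.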
Better timezone names
As a minor modification, the documentation for `date` points out that the full timezone name can be given using `TZ=` syntax, so instead of putting `PST` in the date, it will accept any tz database timezone:
$ date +"%Y%m%d%H%M" -d 'TZ="America/Los_Angeles" 02:00 2018-12-30'
201812300500
This is the same format as the timezone in `pelicanconf.py`, so we can load that value:
timezone="$(python -c 'from pelicanconf import TIMEZONE; print(TIMEZONE)')"
Scheduling only future jobs
According to `at`'s `man` page,
If you specify a job to absolutely run at a specific time and date in the past, the job will run as soon as possible.
This wasn't happening before, but perhaps the `-t` argument works differently.
The mechanism for scheduling posts is to mark a post as a draft if the blog is published before the date on the post, so the script also schedules a publish at the publication date of any actual drafts, if they exist. Such dates may be in the past, which `at` now takes to mean "as soon as possible", so the blog would be republished continuously, since those drafts never leave the drafts directory.
The fix is to only schedule publishing drafts whose publication date is after the last time the blog has been published. First, we need to know what that date is, so the top of the script is changed to
now="$(date +'%s')"
make publish
to record the time before publishing in `$now` in Unix epoch time, so `[` can compare dates as numbers:
if [ "$(date -d "$time PST" +'%s')" -ge "$now" ]
then
…
fi
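To make the numeric comparison concrete, here is a sketch using the timestamps from the mail log above. It assumes GNU `date`; `is_future` is a hypothetical helper, not part of the actual script.

```shell
# is_future WHEN NOW: succeed if WHEN (any string GNU date -d accepts)
# is at or after NOW, given in seconds since the Unix epoch.
is_future() {
    [ "$(date -d "$1" +'%s')" -ge "$2" ]
}

# "now" taken from the mail log: the job ran at 02:00:15 -0500.
now="$(date -d '2018-12-30 02:00:15 -0500' +'%s')"

# 02:00 PST is about three hours later, so it still needs scheduling.
is_future '02:00 2018-12-30 PST' "$now" && echo "02:00 PST: schedule it"
# 02:00 EST was 15 seconds earlier, so it is skipped.
is_future '02:00 2018-12-30 EST' "$now" || echo "02:00 EST: skip it"
```

Because both sides are plain integers, the timezone of the server no longer affects the outcome of the test.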
Note that we record the time before `make publish` to avoid a race condition: the blog is published, the clock then passes a post's scheduled time, and only afterwards does the script reach the scheduling check. At that point it would see that all of the times are in the past and schedule nothing, so the post would not get published (until the next time the blog is published manually).
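The race can be illustrated with a toy timeline. The numbers below are made-up epoch values chosen for illustration, and `should_schedule` is a hypothetical stand-in for the `-ge` test in the script.

```shell
# Toy timeline in epoch seconds (hypothetical values, for illustration only).
publish_started=995     # recorded before "make publish"
post_due=1000           # the post's scheduled time
publish_finished=1010   # when the scheduling loop finally runs

# should_schedule DUE NOW: the same -ge comparison the script uses.
should_schedule() {
    [ "$1" -ge "$2" ]
}

# Comparing against the time after publishing misses the post:
should_schedule "$post_due" "$publish_finished" || echo "after: post skipped"
# Comparing against the time before publishing still schedules it:
should_schedule "$post_due" "$publish_started" && echo "before: post scheduled"
```

In the second case the requested time is already slightly in the past when `at` is called, which per the `man` page quoted above should mean the job runs as soon as possible.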
Putting it all together
The updated `schedule_publish.sh` now reads:
#!/bin/sh
# Record the start time (epoch seconds) before publishing to avoid the race.
now="$(date +'%s')"
timezone="$(python -c 'from pelicanconf import TIMEZONE; print(TIMEZONE)')"

# Pelican publish
make publish

# Clear old queue entries if they call this script.
for q in $(atq -q g | cut -f1)
do
    if [ "$(at -c "$q" | tail -2 | head -1)" = "$0" ]
    then
        atrm "$q"
    fi
done

# Check newly published drafts for when they should be published.
# Not using for because output lines have spaces.
grep -F -- '<!-- Post at datetime ' output/drafts/* | cut -d' ' -f5-6 | while read -r time
do
    # Schedule running this script for that time.
    if [ "$(date -d "TZ=\"$timezone\" $time" +'%s')" -ge "$now" ]
    then
        echo "$0" | at -q g -t "$(date +'%Y%m%d%H%M' -d "TZ=\"$timezone\" $time")"
    fi
done