A Weird Imagination

Linting Markdown reference-style links

The problem#

When writing blog posts, I like to use Markdown's reference-style links which let you avoid writing URLs inline and instead provide a short name and define it elsewhere in the document. I always put them at the end, which results in the bottom of the Markdown file looking like a bibliography for the post. But then there's the extra task of making sure the references at the bottom of the post are consistent with their usage in the blog post; this isn't a huge problem as usually I add a link and immediately use it, looking at the preview to make sure it got used properly. But sometimes I'll start a post by entering a list of links I expect to use, and sometimes I'll miss something checking the preview.

The solution#

lint_refs.py takes any number of Markdown files as arguments and prints the references that are invalid or not used:

#!/usr/bin/env python

import sys
from pathlib import Path
from markdown import markdown, extensions, postprocessors


class ReferenceProxy(dict):
    def __init__(self, *arg, **kw):
        super(ReferenceProxy, self).__init__(*arg, **kw)
        self.read = set()

    def __contains__(self, key):
        self.read.add(key)
        return super().__contains__(key)


class ReferenceLintExtension(extensions.Extension,
                             postprocessors.Postprocessor):
    def __init__(self, filename):
        self.filename = filename

    def extendMarkdown(self, md):
        self.refs = md.references = \
                ReferenceProxy(**md.references)
        md.postprocessors.register(self, 'ref_lint', 1)

    def run(self, text):
        undefined = self.refs.read - set(self.refs.keys())
        unused = set(self.refs.keys()) - self.refs.read
        if unused or undefined:
            print(f"\n\n# {self.filename}")
            if undefined:
                print("\n## UNDEFINED REFERENCES")
                print('\n'.join(sorted(undefined)))
            if unused:
                print("\n## UNUSED REFERENCES")
                print('\n'.join(sorted(unused)))
        return text


for filename in sys.argv[1:]:
    markdown(Path(filename).read_text(), extensions=[
        ReferenceLintExtension(filename),
        'markdown.extensions.extra'])

Given example.md:

A [broken link][broken]. A [working link][working].

[working]: https://example.com/
[not-used]: https://example.org/
$ ./lint-refs.py example.md 


# example.md

## UNDEFINED REFERENCES
broken
broken link

## UNUSED REFERENCES
not-used

The details#

Read more…