A Weird Imagination

Linting Markdown reference-style links

The problem

When writing blog posts, I like to use Markdown's reference-style links, which let you avoid writing URLs inline: you give each link a short name and define its URL elsewhere in the document. I always put the definitions at the end, so the bottom of the Markdown file looks like a bibliography for the post. That leaves the extra task of making sure the references at the bottom are consistent with their usage in the post. This isn't a huge problem, since I usually add a link and immediately use it, checking the preview to make sure it was picked up properly. But sometimes I start a post by entering a list of links I expect to use, and sometimes I miss something when checking the preview.

The solution

lint_refs.py takes any number of Markdown files as arguments and prints the references that are invalid or not used:

#!/usr/bin/env python

import sys
from pathlib import Path
from markdown import markdown, extensions, postprocessors


class ReferenceProxy(dict):
    """Dict that records every key tested with `in`."""

    def __init__(self, *arg, **kw):
        super().__init__(*arg, **kw)
        # Keys the parser looked up, whether or not they were defined.
        self.read = set()

    def __contains__(self, key):
        self.read.add(key)
        return super().__contains__(key)


class ReferenceLintExtension(extensions.Extension,
                             postprocessors.Postprocessor):
    def __init__(self, filename):
        self.filename = filename

    def extendMarkdown(self, md):
        # Swap in the proxy so every reference lookup is recorded.
        self.refs = md.references = \
                ReferenceProxy(**md.references)
        md.postprocessors.register(self, 'ref_lint', 1)

    def run(self, text):
        # Looked up but never defined.
        undefined = self.refs.read - set(self.refs.keys())
        # Defined but never looked up.
        unused = set(self.refs.keys()) - self.refs.read
        if unused or undefined:
            print(f"\n\n# {self.filename}")
            if undefined:
                print("\n## UNDEFINED REFERENCES")
                print('\n'.join(sorted(undefined)))
            if unused:
                print("\n## UNUSED REFERENCES")
                print('\n'.join(sorted(unused)))
        return text


for filename in sys.argv[1:]:
    markdown(Path(filename).read_text(), extensions=[
        ReferenceLintExtension(filename),
        'markdown.extensions.extra'])

Given example.md:

A [broken link][broken]. A [working link][working].

[working]: https://example.com/
[not-used]: https://example.org/

$ ./lint_refs.py example.md


# example.md

## UNDEFINED REFERENCES
broken
broken link

## UNUSED REFERENCES
not-used
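
The core trick, a dict that remembers which keys were tested for membership, can be tried without Python-Markdown at all. The following is only a sketch: `TrackingDict` and `lint` are hypothetical names, and the `lookups` list is an assumption that stands in for the membership tests the parser appears to make for `example.md`, inferred from the output above.

```python
class TrackingDict(dict):
    """A dict that remembers every key tested with `in`."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.read = set()

    def __contains__(self, key):
        self.read.add(key)
        return super().__contains__(key)


def lint(definitions, lookups):
    """Return (undefined, unused) reference names as sorted lists.

    `lookups` is an assumed stand-in for the membership tests a
    Markdown parser performs while resolving links.
    """
    refs = TrackingDict(definitions)
    for name in lookups:
        _ = name in refs  # only the side effect of the test matters
    undefined = refs.read - set(refs)
    unused = set(refs) - refs.read
    return sorted(undefined), sorted(unused)


undefined, unused = lint(
    {"working": "https://example.com/", "not-used": "https://example.org/"},
    ["broken", "broken link", "working"],
)
print(undefined)  # ['broken', 'broken link']
print(unused)     # ['not-used']
```

Because the tracking lives in `__contains__`, it only sees lookups made with `in`; a parser that used `dict.get` instead would go unnoticed, so this approach depends on how the parser happens to query its references.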


Experimenting with ZFS

The problem

For my recent posts on ZFS, I wanted to quickly try out a bunch of variants of my proposed operations without worrying about accidentally modifying my real ZFS filesystems. Specifically, I wanted to know which ways of copying files would result in more efficiently reusing blocks from existing snapshots where possible.

The solution

WARNING: The instructions below will modify the ZFS pool tank, which is the default name used in many ZFS examples, and therefore may be a real ZFS pool on your computer.

I strongly recommend doing all of this inside a VM to be sure you are not affecting any real filesystems. I used a VirtualBox VM that I installed Debian on and used the guest additions to share a directory between the VM and my actual machine.

First create a 1 GiB virtual (i.e. in a file instead of a physical device) ZFS pool to run tests on:

fallocate -l 1G /root/tank
zpool create tank /root/tank

Then perform various filesystem operations and inspect the output of zfs list -o space to see whether they use more (or less) space than you expect. To stay consistent and make it easier to test multiple variations, I wrote some scripts:

git clone https://git.aweirdimagination.net/perelman/zfs-test.git
cd zfs-test/bin
# dump logs from create-/copy-all- and-measure into ../logs/
./measure-all
# read ../logs/ and print space used as Markdown table
./logs-to-table --links

| Create script | orig | rsync-ahvx | rsync-ahvx-sparse | rsync-inplace | rsync-inplace-no-whole-file | rsync-no-whole-file | zfs-diff-move-then-rsync |
|---|---|---|---|---|---|---|---|
| empty | 24K | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ | 24K✅ |
| random-1M-file | 1.03M | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ | 1.03M✅ |
| zeros-1M-file | 24K | 1.03M❌ | 24K✅ | 1.03M❌ | 1.03M❌ | 1.03M❌ | 1.03M❌ |
| move-file | 1.04M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.04M✅ |
| edit-part-of-file | 1.16M | 2.04M❌ | 2.04M❌ | 2.04M❌ | 1.17M✅ | 2.04M❌ | 1.17M✅ |
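
To compare cells like these by hand, it helps to turn the human-readable sizes back into byte counts. The sketch below is not how `logs-to-table` decides between ✅ and ❌; `parse_size`, `matches_original`, and the 5% tolerance are all assumptions chosen so that 1.16M and 1.17M count as a match, as they do in the last row.

```python
# ZFS reports human-readable sizes in powers of 1024.
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a size such as '24K' or '1.03M' to a byte count."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return round(float(text[:-1]) * UNITS[suffix])
    return round(float(text))  # no suffix: already a byte count


def matches_original(orig, copy, tolerance=0.05):
    """True if `copy` uses about as much space as `orig`.

    The 5% tolerance is an arbitrary, hypothetical threshold.
    """
    o, c = parse_size(orig), parse_size(copy)
    return abs(c - o) <= tolerance * o


# Values taken from the edit-part-of-file row.
print(matches_original("1.16M", "1.17M"))  # True
print(matches_original("1.16M", "2.04M"))  # False
```

A relative tolerance absorbs small metadata differences between runs, while a doubled file still registers as a mismatch.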


LaTeX table environment in Madoko

What is Madoko?

Madoko is an extension of Markdown for scholarly papers. In essence, it is a competitor to LaTeX, which, along with Microsoft Word, is the way the vast majority of such papers are presently authored.

Madoko targets both HTML and LaTeX output in order to be compatible with existing workflows while encouraging the creation of HTML versions of papers. Such versions are presently rare because PDF is the default for publishing, even though it is sub-optimal for reading on a screen.