A Weird Imagination

Finding broken {filename} links

Posted Sun 21 July 2024 in Blogging

bash find grep if one-liner pelican sed sh test

The problem

I've recently been writing more series of blog posts and otherwise linking between posts using {filename} links. I've also been adjusting the schedule of my planned future posts, which means renaming files, since my naming scheme includes the publication date in the filename. That creates opportunities to forget to update the links and end up with broken links between posts.

Pelican does generate warnings like

WARNING  Unable to find './invalid.md', skipping url        log.py:89
         replacement.

but currently building my entire blog takes about a minute, so I generally only do it when publishing. So I wanted a more lightweight way to just check the intra-blog {filename} links.

The solution

I wrote the script check_filename_links.sh:

#!/bin/bash

content="${1:-.}"

find "$content" -iname '*.md' -type f -print0 | 
  while IFS= read -r -d '' filename
  do
    grep '^\[.*]: {filename}' "$filename" |
      sed 's/^[^ ]* {filename}\([^\#]*\)\#\?.*$/\1/' |
      while read -r link
      do
        if [ "${link:0:1}" != "/" ]
        then
          linkedfile="$(dirname "$filename")/$link"
        else
          linkedfile="$content$link"
        fi
        if [ ! -f "$linkedfile" ]
        then
          echo "filename=$filename, link=$link,"\
               "file does not exist: $linkedfile"
        fi
      done
  done

Run it from your content/ directory or provide the path to the content/ directory as an argument and it will print out the broken links:

filename=./foo/bar.md, link=./invalid.md, file does not exist: ./foo/./invalid.md
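
To see what the grep and sed steps pull out of a post, here is a minimal sketch that runs the same two filters on a sample file. The function name extract_filename_links and the sample post are made up for illustration; the patterns are copied from the script above and assume GNU grep and GNU sed (for \?).

```shell
#!/bin/bash
# Hypothetical helper: print the path part of every reference-style
# {filename} link in the Markdown file given as $1.
extract_filename_links() {
  grep '^\[.*]: {filename}' "$1" |
    sed 's/^[^ ]* {filename}\([^\#]*\)\#\?.*$/\1/'
}

# Made-up sample post with a relative link, an absolute link with a
# #fragment, and an ordinary external link that should be ignored.
sample="$(mktemp)"
cat >"$sample" <<'EOF'
See [the earlier post][prev] and [the intro][intro].

[prev]: {filename}./2024-07-14-earlier-post.md
[intro]: {filename}/series/intro.md#setup
[ext]: https://example.com/
EOF

extract_filename_links "$sample"
rm "$sample"
```

This prints ./2024-07-14-earlier-post.md and /series/intro.md, one per line: the fragment is dropped and the external link is skipped. Links starting with / are the ones the script resolves against the content directory instead of the post's own directory.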

The details

Recreate moves from zfs diff

Posted Sun 03 March 2024 in Linux

backups bash cut echo grep mkdir mv octal printf read sh zfs zfs diff

The problem

When doing an incremental backup, any moved file on the source filesystem usually results in recopying the file to the destination filesystem. For a large file, this can be slow and can also waste space if the destination keeps deleted files around (e.g. ZFS holding on to old snapshots). If both sides are ZFS, then you can get zfs send/recv to handle all of the details efficiently. But if only the source filesystem is ZFS or the ZFS datasets are not at the same granularity on both sides, that doesn't apply.

zfs diff gives the information about file moves from a snapshot, but its output format is a little awkward for scripting.

The solution

Download the script I wrote, zfs-diff-move.sh, and run it like

zfs-diff-move.sh /path/ /tank/dataset/ tank/dataset@base @new

The following is an abbreviated version of it:

#!/bin/bash
zfs diff -H "$3" "$4" | grep '^R' | while read -r line
do
  get_path() {
    path="$(echo -e "$(echo "$line" | cut -d$'\t' "-f$3")")"
    echo "${path/#$2/$1}"
  }

  from="$(get_path "$1" "$2" 2)"
  to="$(get_path "$1" "$2" 3)"
  mkdir -vp -- "$(dirname "$to")"
  mv -vn -- "$from" "$to" || echo "Unable to move $from"
done
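
The awkward part of the output is that zfs diff -H prints tab-separated fields and, as far as I know, writes unusual characters in paths as octal escapes (a space becomes \0040). Here is a minimal sketch of the decode-and-remap step, run on a made-up sample line so it works without ZFS. The function name remap_path and the sample paths are hypothetical, and printf '%b' stands in for the echo -e used above.

```shell
#!/bin/bash
# Hypothetical helper: decode field $2 of a `zfs diff -H` line ($1) and
# replace the leading source prefix ($3) with the destination prefix ($4).
remap_path() {
  local line="$1" field="$2" src_prefix="$3" dst_prefix="$4" raw path
  raw="$(printf '%s\n' "$line" | cut -d$'\t' -f"$field")"
  # %b expands backslash escapes such as \0040 (a space), like echo -e.
  path="$(printf '%b' "$raw")"
  # Quoting the pattern makes the prefix match literally, not as a glob.
  printf '%s\n' "${path/#"$src_prefix"/$dst_prefix}"
}

# Made-up sample line: change type, old path, new path, separated by tabs.
line=$'R\t/tank/dataset/old\\0040dir/big.iso\t/tank/dataset/new\\0040dir/big.iso'

remap_path "$line" 2 /tank/dataset/ /path/   # where the file is now on the backup
remap_path "$line" 3 /tank/dataset/ /path/   # where it should be moved to
```

The two calls print /path/old dir/big.iso and /path/new dir/big.iso, which are the from and to arguments the mv in the script needs.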

The details

Generating specialized word lists

Posted Sun 06 September 2020 in Linux

awk games grep join paste sed sh sort tr uniq word games

The problem

I've been playing Codenames online a lot lately (using my fork of codenames.plus), and a friend suggested it might be fun to have themed word lists. Specifically, they suggested Star Trek as a theme as it's a fandom that's fairly widely known. They left it up to me to figure out what should be in a Star Trek themed word list.

The solution

If you just want to play Codenames with the list, go to my Codenames web app and select one or both of the Star Trek card packs. If you just want the word lists, you can download the Star Trek: The Next Generation words and the Star Trek: Deep Space 9 words.

To generate a word list yourself (I used this source for the Star Trek scripts), you will need a common-words list like en_50k.txt, which I mentioned in my previous post on anagram games. Then pipe the corpus through the following script (which you will likely have to modify for the idiosyncrasies of your data):

#!/bin/bash
set -euo pipefail

NUM_COMMON=2000 # Filter out the most common 2000 words
COMMON_WORDS="$(mktemp)"
<en_50k.txt head "-$NUM_COMMON" | cut -d' ' -f1 |\
    sort | tr '[:lower:]' '[:upper:]' >"$COMMON_WORDS"

# Select only dialogue lines (in Star Trek scripts)
grep -aP '^\t\t\t[^\t]' |\
    # Split words
    tr ' .,:()\[\]!?;"/\t[:cntrl:]' '[\n*]' |\
    sed 's/--/\n/' |\
    # Strip whitespace
    sed 's/^\s\+//' | sed 's/\s\+$//' |\
    grep -av '^\s*$' |\
    # Strip quotes
    sed "s/^'//" | sed "s/'$//" |\
    # Filter out numbers
    grep -av '^[[:digit:]]*$' |\
    tr '[:lower:]' '[:upper:]' |\
    # Fix for contractions not being in wordlist
    sed "s/'\(S\|RE\|VE\|LL\|M\|D\)$//" |\
    grep -av "'T$" |\
    # Remove some more non-words
    grep -avF '-' |\
    grep -avF '&' |\
    # Count
    sort | uniq -c |\
    # Only keep words with >25 occurrences
    awk '{ if ($1 > 25) { print } }' |\
    # Remove common words
    join -v2 -22 -o 2.1,2.2 "$COMMON_WORDS" - |\
    # Sort most common words first
    sort -rn

rm "$COMMON_WORDS"

The output of the script will require some manual effort to decide which words really belong in the final list, but it's a good start.
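
The join step is probably the least obvious part of the pipeline, so here is a minimal sketch of just the count-and-filter stages on a tiny made-up word stream. The function name uncommon_counts and the sample words are hypothetical, and setting LC_ALL=C is my addition so that sort and join agree on ordering.

```shell
#!/bin/bash
set -euo pipefail
export LC_ALL=C

# Hypothetical helper: read one upper-case word per line on stdin and
# print "COUNT WORD" for every word not in the sorted list in file $1.
uncommon_counts() {
  sort | uniq -c |
    # -22 joins on field 2 (the word) of stdin; -v2 keeps only stdin
    # lines with no match in the common list; -o prints count then word.
    join -v2 -22 -o 2.1,2.2 "$1" - |
    sort -rn
}

common="$(mktemp)"
printf '%s\n' AND THE YOU >"$common"   # already in sorted order

printf '%s\n' WARP THE PHASER THE AND WARP WARP YOU |
  uncommon_counts "$common"

rm "$common"
```

This prints 3 WARP followed by 1 PHASER: the common words are dropped even though THE occurs twice, and the remaining words come out most frequent first.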

The details

Nvidia GLX not working

Posted Sun 09 December 2018 in Linux

debian debian alternatives glxgears grep linux locate nvidia readlink troubleshooting wajig x11 xorg.conf

The problem

I recently replaced my old Nvidia graphics card with a newer one. Upon booting up, I ran glxgears to test that 3D graphics were working properly and got an error like

X Error of failed request:  BadWindow (invalid Window parameter)
 Major opcode of failed request:  155 (NV-GLX)
 Minor opcode of failed request:  4 ()
 Resource id in failed request:  0x1200003
 Serial number of failed request:  34
 Current serial number in output stream:  34

The solution

Either delete /etc/X11/xorg.conf or edit it and remove (or comment out) the "Files" section; that is, the lines

Section "Files"
    ...
EndSection
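
If you'd rather comment the section out than edit it by hand, a sed range along these lines should do it. This is only a sketch: the function name comment_out_files_section and the sample config are made up, and it prints the edited file to stdout instead of touching /etc/X11/xorg.conf, so you can review the result and keep a backup before replacing anything.

```shell
#!/bin/bash
# Hypothetical helper: print the xorg.conf-style file $1 with every line
# from `Section "Files"` through the next `EndSection` commented out.
comment_out_files_section() {
  sed '/^[[:space:]]*Section[[:space:]]*"Files"/,/^[[:space:]]*EndSection/ s/^/#/' "$1"
}

# Made-up sample config for demonstration only.
sample="$(mktemp)"
cat >"$sample" <<'EOF'
Section "Files"
    ModulePath "/usr/lib/xorg/modules"
EndSection

Section "Device"
    Identifier "Card0"
    Driver "nvidia"
EndSection
EOF

comment_out_files_section "$sample"
rm "$sample"
```

Only the three lines of the "Files" section gain a leading #; the "Device" section passes through unchanged.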

Volume via shell

Posted Sun 03 May 2015 in Linux

alsa alsamixer amixer grep pactl pulseaudio

The problem

Sometimes a GUI is not the best way to control a computer's volume. If you care about your computer's volume, you're probably nearby, but you might rather use a remote or some other shortcut to change it. The specific use case that prompted this blog post was binding the volume up and volume down keys on my keyboard to the global volume control (as opposed to separately binding them in each application).

Shadowrun's text compression

Posted Sun 05 April 2015 in ROM hacking

65xx endianness grep reverse engineering rom hacking sed shadowrun snes sort uniq

The problem

Several years ago, I was in a ROM hacking IRC room where another regular, Alchemic, was reverse engineering the text system of the SNES game Shadowrun. He figured it out and wrote a Python script to decompress the text but had some questions about why it was designed the way it was. So we're going to walk through figuring out how the code works, with some help from his notes, and try to understand the design.

If you don't want spoilers and would rather try to reverse engineer it yourself, just read up to the end of the Trace format section and see how much you can figure out on your own.

Read-only filesystem errors

Posted Fri 03 April 2015 in Linux

error messages grep linux mount readonly mount rtorrent touch

Linux has a tendency to give very unhelpful error messages when it is unable to create a file. I previously blogged about a few different reasons Linux might report a disk is full, but all of those reasons involved the disk actually not having space for more files. Yet another way to get similar errors is for the partition to be mounted read-only (ro):

$ mount | grep -F /usr
/dev/sdc2 on /usr type ext4 (ro,nodev,noatime,data=ordered)

mount without any options lists all of the mounted partitions along with their mount options.

Many programs will show a helpful error message:

$ touch test
touch: cannot touch ‘test’: Read-only file system

But some others won't:

rtorrent: Could not lock session directory: "./session/", held by "<error>".

That error is normally caused by ./session/rtorrent.lock not being writable because another process holds it, but in this case it's not writable because the filesystem is read-only. rtorrent doesn't distinguish between the two.

For that reason, when running into weird behavior from a program on Linux, it's a good idea to check that the directories the program might try to write to are actually writable.
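
A quick way to do that check from the shell is test -w, which should fail for a directory on a read-only mount just as it does for one you lack permission to write to. A minimal sketch follows; the function name check_writable is hypothetical. Note that root bypasses ordinary permission checks, so when run as root it would only catch the read-only-mount case.

```shell
#!/bin/bash
# Hypothetical helper: report whether each directory given as an
# argument exists and can be written to by the current user.
check_writable() {
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "$dir: not a directory"
    elif [ -w "$dir" ]; then
      echo "$dir: writable"
    else
      echo "$dir: NOT writable"
    fi
  done
}

# Example: check the current directory and rtorrent's session directory.
check_writable . ./session/
```

If a directory is reported as not writable and the permissions look fine, the mount | grep check above tells you whether ro is the cause.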

Listing files into a file

Posted Sun 15 March 2015 in Linux

echo grep linux ls pgrep pidof ps redirection sh tee

The problem

$ ls > file

doesn't do what you expect:

$ touch foo
$ touch bar
$ ls > filelist
$ cat filelist
bar
filelist
foo

You probably didn't expect, or want, filelist to be listed in filelist.

The solution

$ filelist=$(ls); echo "$filelist" >filelist
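
The reason is that the shell creates filelist for the redirection before ls starts, so ls sees it. Capturing the output in a variable first means the file is not created until ls has finished. A minimal sketch that shows both behaviours in a scratch directory (the function names are made up for illustration):

```shell
#!/bin/bash
# Both hypothetical helpers run in a subshell so the cd does not leak.

# Naive version: the redirection creates filelist before ls runs.
list_naive() (
  cd "$1" && ls > filelist && cat filelist
)

# Version from the post: ls finishes before filelist is created.
list_via_variable() (
  cd "$1" && filelist=$(ls) && echo "$filelist" > filelist && cat filelist
)

dir="$(mktemp -d)"
touch "$dir/foo" "$dir/bar"

list_naive "$dir"          # output includes filelist itself
rm "$dir/filelist"
list_via_variable "$dir"   # output lists only bar and foo

rm -r "$dir"
```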

Pi in shell

Posted Sat 14 March 2015 in Linux

banner bc c curl grep ioccc pi printerbanner python split tr

Calculating π the hard way

In honor of Pi Day, I was going to try to write a script that computed π in shell, but given the lack of floating point support, I decided it would be too messy. If you want to see hard-to-follow code to generate π, I highly recommend the IOCCC entry westley.c from 1998, the majority of which is an ASCII art circle which calculates its own area and radius in order to estimate π. The hint file suggests looking at the output of

$ cc -E westley.c

The 2012 entry, endoh2, is also a pretty amazing π calculator.

Getting π

Instead, I will just generate π the shell way: using another program.

$ python -c 'import math; print(math.pi)'
3.14159265359
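
If Python isn't handy, awk can play the same role, since 4·atan(1) = π. This is my own sketch and is not taken from the original post; the function name pi_awk is hypothetical, and the result is only as precise as awk's double-precision floats (roughly 15 significant digits).

```shell
#!/bin/bash
# Hypothetical helper: print pi to $1 decimal places (default 10).
pi_awk() {
  awk -v digits="${1:-10}" \
    'BEGIN { printf "%." digits "f\n", 4 * atan2(1, 1) }'
}

pi_awk      # prints 3.1415926536
pi_awk 2    # prints 3.14
```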

Monitor all the things

Posted Fri 06 March 2015 in Linux

gpgpu grep htop iftop iotop latencytop lspci mkdir nload powertop top uname wget

CPU and memory

On Linux, the basic way to monitor load is to use top. The only thing top really has going for it is that it is almost certainly available on any system you will ever use. Luckily, there's a better way: htop. htop supports colors and mouse clicks and lists the available key commands at the bottom of the terminal. It also can be customized to your liking. You can start by putting my htoprc in your ~/.config/htop/ directory:

$ mkdir -p ~/.config/htop/
$ cd ~/.config/htop/
$ wget https://gist.githubusercontent.com/dperelman/1e051f5705685cb41f31/raw/3ab9cf17b166120a805d5f76a71ce82452f553b4/htoprc

Or just explore the options yourself.

Hit F1 (or click Help in the bottom-left) to get an explanation of the colors used in the CPU and memory bars and a guide to keystrokes not listed at the bottom.

In my usage, I find insufficient memory is more often the problem than CPU, so I usually leave htop sorted by the MEM% column.

Other resources

While CPU and memory are the easiest resources to monitor, they are not the only ones. Linux offers a wide variety of system monitors, depending on what resource you want to monitor and what format you want to view it in. This post focuses on real-time viewing with human-friendly displays, but most of these have options or variants that support logging historical data in a more machine-friendly format as well.