A Weird Imagination

Generating specialized word lists

Posted in

The problem

I've been playing Codenames online a lot lately (using my fork of codenames.plus), and a friend suggested it might be fun to have themed word lists. Specifically, they suggested Star Trek as a theme as it's a fandom that's fairly widely known. They left it up to me to figure out what should be in a Star Trek themed word list.

The solution

If you just want to play Codenames with the list, go to my Codenames web app and select one or both of the Star Trek card packs. If you just want the word lists, you can download the Star Trek: The Next Generation words and the Star Trek: Deep Space 9 words.

To generate a word list yourself (I used this source for the Star Trek scripts), you will need a common words list like en_50k.txt which I mentioned in my previous post on anagram games, and then pipe the corpus through the following script (which you will likely have to modify for the idiosyncrasies of your data):

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
#!/bin/bash
set -euo pipefail

NUM_COMMON=2000 # Filter out the most common 2000 words
COMMON_WORDS="$(mktemp)"
<en_50k.txt head "-$NUM_COMMON" | cut -d' ' -f1 |\
    sort | tr '[:lower:]' '[:upper:]' >"$COMMON_WORDS"

# Select only dialogue lines (in Star Trek scripts)
grep -aP '^\t\t\t[^\t]' |\
    # Split words
    tr ' .,:()\[\]!?;"/\t[:cntrl:]' '[\n*]' |\
    sed 's/--/\n/' |\
    # Strip whitespace
    sed 's/^\s\+//' | sed 's/\s\+$//' |\
    grep -av '^\s*$' |\
    # Strip quotes
    sed "s/^'//" | sed "s/'$//" |\
    # Filter out numbers
    grep -av '^[[:digit:]]*$' |\
    tr '[:lower:]' '[:upper:]' |\
    # Fix for contractions not being in wordlist
    sed "s/'\(S\|RE\|VE\|LL\|M\|D\)$//" |\
    grep -av "'T$" |\
    # Remove some more non-words
    grep -avF '-' |\
    grep -avF '&' |\
    # Count
    sort | uniq -c |\
    # Only keep words with >25 occurrences
    awk '{ if ($1 > 25) { print } }' |\
    # Remove common words
    join -v2 -22 -o 2.1,2.2 "$COMMON_WORDS" - |\
    # Sort most common words first
    sort -rn

rm "$COMMON_WORDS"

The output of the script will require some manual effort to decide which words really belong in the final list, but it's a good start.

The details

Read more…

Nvidia GLX not working

The problem

I recently replaced my old Nvidia graphics card with a newer one. Upon booting up, I ran glxgears to test that 3D graphics were working properly and got an error like

X Error of failed request:  BadWindow (invalid Window parameter)
 Major opcode of failed request:  155 (NV-GLX)
 Minor opcode of failed request:  4 ()
 Resource id in failed request:  0x1200003
 Serial number of failed request:  34
 Current serial number in output stream:  34

The solution

Either delete /etc/X11/xorg.conf or edit it and remove (or comment out) the "Files" section; that is, the lines

Section "Files"
    ...
EndSection

Read more…

Volume via shell

Posted in

The problem

Sometimes a GUI is not the best way to control a computer's volume. Usually if you care about the volume of your computer, you're probably nearby but perhaps would rather be using a remote or other shortcut way of changing the volume. The specific use case that prompted this blog post was binding the volume up and volume down keys on my keyboard to the global volume control (as opposed to separately binding them in each application).

Read more…

Shadowrun's text compression

The problem

Several years ago, I was in a ROM hacking IRC room where another regular Alchemic was reverse engineering the text system of the SNES game Shadowrun. He figured it out and wrote a python script to decompress the text but had some questions about why it was designed the way it was. So we're going to walk through figuring out how the code works, with some help from his notes, and try to understand the design.

If you don't want spoilers and would rather try to reverse engineer it yourself, just read up to the end of the Trace format section and see how much you can figure out on your own.

Read more…

Read-only filesystem errors

Linux has a tendency to give very unhelpful error messages when it is unable to create a file. I previously blogged about a few different reasons Linux might report a disk is full, but all of the reasons included the disk actually not having space for more files. Yet another reason to get similar errors is if the partition is mounted readonly (ro):

$ mount | grep -F /usr
/dev/sdc2 on /usr type ext4 (ro,nodev,noatime,data=ordered)

mount without any options lists all of the mounted partitions along with their mount options.

Many programs will show a helpful error message:

$ touch test
touch: cannot touch ‘test’: Read-only file system

But some others won't:

rtorrent: Could not lock session directory: "./session/", held by "<error>".

That error is normally caused by ./session/rtorrent.lock not being writable due to being held by another process, but in this case it's not writable due to the filesystem being readonly. rtorrent doesn't distinguish the two.

For that reason, when running into weird behavior from a program on Linux, it's a good idea to check that the directories the program might try to write to are actually writable.

Listing files into a file

Posted in

The problem

$ ls > file

doesn't do what you expect:

$ touch foo
$ touch bar
$ ls > filelist
$ cat filelist
bar
filelist
foo

You probably didn't expect, or want, filelist to be listed in filelist.

The solution

$ filelist=$(ls); echo "$filelist" >filelist

Read more…

Pi in shell

Posted in

Calculating π the hard way

In honor of Pi Day, I was going to try to write a script that computed π in shell, but given the lack of floating point support, I decided it would be too messy. If you want to see hard to follow code to generate π, I highly recommend the IOCCC entry westley.c from 1998, the majority of which is an ASCII art circle which calculates its own area and radius in order to estimate π. The hint file suggests looking at the output of

$ cc -E westley.c

The 2012 entry, endoh2 is also a pretty amazing π calculator.

Getting π

Instead, I will just generate π the shell way: using another program.

$ python -c 'import math; print(math.pi)'
3.14159265359

Read more…

Monitor all the things

CPU and memory

On Linux, the basic way to monitor load is to use top. The only thing top really has going for it is that it is almost certainly available on any system you will ever use. Luckily, there's a better way: htop. htop supports colors and mouse clicks and lists the available key commands at the bottom of the terminal. It also can be customized to your liking. You can start by putting my htoprc in your ~/.config/htop/ directory:

$ mkdir -p ~/.config/htop/
$ cd ~/.config/htop/
$ wget https://gist.githubusercontent.com/dperelman/1e051f5705685cb41f31/raw/3ab9cf17b166120a805d5f76a71ce82452f553b4/htoprc

Or just explore the options yourself.

Hit F1 (or click Help in the bottom-left) to get an explanation of the colors used in the CPU and memory bars and a guide to keystrokes not listed at the bottom.

In my usage, I find insufficient memory is more often the problem than CPU, so I usually leave htop sorted by the MEM% column.

Other resources

While CPU and memory are the easiest to monitor resources, they are not the only ones. Linux offers a wide variety of system monitors, depending on what resource you want to monitor and what format you want to view it in. This post focuses on real-time viewing with human-friendly displays but most of these have options or variants that support logging historical data in a more machine-friendly format as well.

Read more…

Changing Pelican URL scheme

Posted in

The problem

I changed the URI scheme of this blog recently from /posts/YYYY/MM/slug/ to /YYYY/MM/DD/slug/. The latter looks better and makes the actual day of the post more visible.

But I already had posts using the old scheme and cool URIs don't change. Luckily, someone wrote a Pelican plugin called pelican-alias which allows articles to be tagged with additional URIs to redirect to their canonical location. All I had to do was add an Alias: /posts/2015/02/... line to the top of each of the posts I had already written and the plugin would take care of the rest.

Automating the aliasing

The non-trivial part of automating this is that the URIs include the article's slug, which may have been generated by Pelican from the title, so Pelican has to be involved in generating the correct redirects.

There are two ways I could have automated this process:

  1. Modify the plugin to add a redirect from the old scheme to the new scheme for every article. Unless somehow controlled, this would result in creating redirects for new articles which do not need them.
  2. Write a one-off script to get the slugs out of Pelican and write the Alias: lines into the blog posts.

I took the latter approach because it was simpler and involved no new code to maintain.

Read more…

Checking for unsafe shell constructs

Posted in

Filenames are troublesome

While shell programing lets you write very concise programs, it turns out that the primary use case of working with files is unfortunately much harder than it seems. That detailed article by David A. Wheeler does a good job of explaining all of the various problems that a naive shell script can run into due to various characters which are allowed in filenames which the shell treats specially in some way.

One surprising one is that filenames beginning with a dash (-) can be interpreted as options due to the way globbing works in the shell. Suppose we set up a directory as follows:

$ cat > -n
Some secret text.
$ cat > test
This is a test.
It has multiple lines.

Quick, what will cat * do here?

$ cat *
     1  This is a test.
     2  It has multiple lines.

Probably not what you wanted. The reason that happens is that the * is expanded by the shell before being fed to cat, so the command executed is cat -n test and -n gets interpreted not as a filename but as an option telling cat to number the lines of the output.

The workaround is to use ./* instead of *, so the - will not actually be the first character and therefore will not get misinterpreted as an option. But there are many other things that can go wrong with unexpected filenames and remembering to handle all of them everywhere is error-prone.

Warnings for unsafe shell code

The solution is shellcheck. shellcheck will warn you about mistakes like the cat * problem and many other issues you may not be aware of.

As I have many shellscripts around that I wrote before learning about shellcheck, I wanted to run it on all of the shell scripts (but not binaries or other language scripts) in my ~/bin directory, so naturally I wrote a script to do so:

#!/bin/sh

find -exec file {} \; \
    | grep -F 'shell script' \
    | sed s/:[^:]*$// \
    | xargs shellcheck

This uses the file command to identify shell scripts and then selects out their file names to run shellcheck on all of them using xargs.

Warnings in Vim

shellcheck is written to support integration into IDEs. I use Vim to edit shell scripts, so I installed the syntastic (using Vundle which makes installing Vim plugins off GitHub very easy). Note to follow the instructions on the Syntastic page, specifically the recommended settings: without any settings it won't do anything at all. Once set up, it automatically runs shellcheck on every save, identifies lines with warnings and shows a list of warnings that can be double-clicked to jump to the location of the warning.

If you use the other text editor, then the shellcheck website recommends the flycheck plugin.