A Weird Imagination

Generating specialized word lists

Posted in

The problem#

I've been playing Codenames online a lot lately (using my fork of codenames.plus), and a friend suggested it might be fun to have themed word lists. Specifically, they suggested Star Trek as a theme as it's a fandom that's fairly widely known. They left it up to me to figure out what should be in a Star Trek themed word list.

The solution#

If you just want to play Codenames with the list, go to my Codenames web app and select one or both of the Star Trek card packs. If you just want the word lists, you can download the Star Trek: The Next Generation words and the Star Trek: Deep Space 9 words.

To generate a word list yourself (I used this source for the Star Trek scripts), you will need a common words list like en_50k.txt which I mentioned in my previous post on anagram games, and then pipe the corpus through the following script (which you will likely have to modify for the idiosyncrasies of your data):

#!/bin/bash
set -euo pipefail

NUM_COMMON=2000 # Filter out the most common 2000 words
COMMON_WORDS="$(mktemp)"
<en_50k.txt head "-$NUM_COMMON" | cut -d' ' -f1 |\
    sort | tr '[:lower:]' '[:upper:]' >"$COMMON_WORDS"

# Select only dialogue lines (in Star Trek scripts)
grep -aP '^\t\t\t[^\t]' |\
    # Split words
    tr ' .,:()\[\]!?;"/\t[:cntrl:]' '[\n*]' |\
    sed 's/--/\n/' |\
    # Strip whitespace
    sed 's/^\s\+//' | sed 's/\s\+$//' |\
    grep -av '^\s*$' |\
    # Strip quotes
    sed "s/^'//" | sed "s/'$//" |\
    # Filter out numbers
    grep -av '^[[:digit:]]*$' |\
    tr '[:lower:]' '[:upper:]' |\
    # Fix for contractions not being in wordlist
    sed "s/'\(S\|RE\|VE\|LL\|M\|D\)$//" |\
    grep -av "'T$" |\
    # Remove some more non-words
    grep -avF '-' |\
    grep -avF '&' |\
    # Count
    sort | uniq -c |\
    # Only keep words with >25 occurrences
    awk '{ if ($1 > 25) { print } }' |\
    # Remove common words
    join -v2 -22 -o 2.1,2.2 "$COMMON_WORDS" - |\
    # Sort most common words first
    sort -rn

rm "$COMMON_WORDS"

The output of the script will require some manual effort to decide which words really belong in the final list, but it's a good start.

The details#

Read more…

Devlog: Anagram Bagels: Part 2

There were two non-trivial aspects of the design of Anagram Bagels: puzzle generation, which I discussed in my last post, and how to handle saving and sharing puzzles, which I will discuss in this post. I wanted an intuitive design that satisfied the following constraints:

  1. It should be possible to easily share a puzzle with another person in the form of a link.

  2. The difference between a link to the game and a link to a specific puzzle should be clear. (So the user doesn't accidentally bookmark a link to a specific puzzle when meaning to bookmark the game.)

  3. The game should gracefully handle the common mobile browser behavior of reloading the page if it hasn't been viewed in a while.

  4. Opening multiple instances of the game in separate tabs shouldn't break anything. (This is the default for web sites, so it's true unless doing something to actively break this assumption.)

Read more…

Devlog: Anagram Bagels: Part 1

Introduction#

I have a friend who plays a lot of simple puzzle games on their phone. One of them is this word puzzle, which is variant of Bagels (also known as Bulls and Cows or by the trademarked name Mastermind) where the secret is an English word and the guesses must be valid words. Additionally, the alphabet of the guesses is limited to a set selected for the puzzle, and the feedback is given for specific letters as opposed to giving just a count of the correct letters.

While playing the game, my friend would often find that it would be useful to type letters out of order. For example, once determing that the word ends in "ing", it would be easier to simply write that in at the end and then fill out the beginning. As the feedback means the player often knows exactly what they want to write in the middle of the word, typing each word in order from the start to end can be awkward.

As the game seems quite simple, I decided to reimplement it and improve upon the UI. My implementation is in HTML5/JavaScript and should work in any modern browser. Play Anagram Bagels or view the source.

Read more…