I have a friend who plays a lot of simple puzzle games on their phone. One of them is this word puzzle, which is a variant of Bagels (also known as Bulls and Cows, or by the trademarked name Mastermind) where the secret is an English word and the guesses must be valid words. Additionally, the alphabet of the guesses is limited to a set of letters selected for the puzzle, and feedback is given for specific letters rather than just a count of correct letters.
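The post doesn't spell out the exact feedback scheme, but "feedback for specific letters" suggests something Wordle-like. A minimal sketch under that assumption (the function name and the three feedback labels are illustrative, not the game's actual code):

```javascript
// Sketch of per-letter feedback, assuming a Wordle-like scheme:
// "hit"  = right letter in the right position,
// "near" = letter occurs elsewhere in the secret,
// "miss" = letter not in the secret (after accounting for hits).
// The game's actual scheme may differ.
function scoreGuess(secret, guess) {
  const result = new Array(guess.length).fill("miss");
  const remaining = {}; // counts of secret letters not matched exactly

  // First pass: mark exact matches, tally the leftover secret letters.
  for (let i = 0; i < secret.length; i++) {
    if (guess[i] === secret[i]) {
      result[i] = "hit";
    } else {
      remaining[secret[i]] = (remaining[secret[i]] || 0) + 1;
    }
  }

  // Second pass: mark right-letter-wrong-place, consuming leftovers so
  // duplicate guess letters aren't over-credited.
  for (let i = 0; i < guess.length; i++) {
    if (result[i] === "hit") continue;
    if (remaining[guess[i]] > 0) {
      result[i] = "near";
      remaining[guess[i]] -= 1;
    }
  }
  return result;
}
```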
While playing the game, my friend would often find it useful to type letters out of order. For example, once determining that the word ends in "ing", it is easier to simply write that in at the end and then fill out the beginning. Since the feedback means the player often knows exactly what they want to write in the middle of the word, typing each word in order from start to end can be awkward.
The game is implemented as a plain web page with no frameworks. As my target was modern phone browsers, I did not bother with many polyfills. The one polyfill the code does currently include is Array.flatMap(), which MDN has labeled as "experimental". I'll likely add more polyfills as people point out browsers where it doesn't work.
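For reference, a minimal flatMap polyfill looks roughly like this (a sketch, not necessarily the exact polyfill the project ships; production polyfills such as core-js handle more edge cases):

```javascript
// Install Array.prototype.flatMap only if the browser lacks it.
// flatMap(fn) maps each element and flattens the result one level deep;
// this sketch builds that from reduce + concat.
if (!Array.prototype.flatMap) {
  Array.prototype.flatMap = function (callback, thisArg) {
    return this.reduce(function (acc, value, index, array) {
      return acc.concat(callback.call(thisArg, value, index, array));
    }, []);
  };
}
```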
While implementing the game rules and basic interface was fairly straightforward, how to generate puzzles is less obvious. Or, more specifically, how to estimate the difficulty of a generated puzzle is non-obvious.
What makes a good Anagram Bagels puzzle?
A puzzle consists of the secret word and a set of letters which may be used in guesses. In order to be a good puzzle, the word must be one the player could plausibly come up with (that is, it can't be so obscure or archaic that the player does not even know it's a word), and it must also be possible for the player to come up with other words that use the letters in order to get hints. Playing the game, I've also observed that uncommon letter sequences tend to be more difficult to guess; for example, compound words like "weekday" can be tricky because few words contain the letter sequence "kd".
Puzzle difficulty metrics
In order to ensure the secret words are words the player is likely to know, they are selected from a list of the 50,000 most frequent English words from the FrequencyWords data, which is based on a corpus of subtitles. To avoid puzzles feeling too easy, the secret word is never one of the top thousand words.
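That selection rule is simple enough to sketch directly (the function name and the shape of the word list are assumptions; the game's actual code may differ):

```javascript
// Sketch of secret-word selection. wordsByFrequency is assumed to be the
// 50,000-word FrequencyWords list, ordered most frequent first. The top
// 1,000 words are skipped so puzzles aren't trivially easy.
function pickSecretWord(wordsByFrequency, rng = Math.random) {
  const candidates = wordsByFrequency.slice(1000);
  return candidates[Math.floor(rng() * candidates.length)];
}
```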
One idea I had for measuring the difficulty of a puzzle is how many common words are valid guesses (i.e. English words of the correct length that use only the provided letter set). The idea is that the more words that are valid guesses, the more likely the player is to come up with one and get a useful hint. (As a refinement, the word list also has frequency data, which we could use to weight more common valid guesses more strongly.) The current implementation of the game has an option to control the range of "# of matches in top 50k words"; the puzzle generation algorithm reselects the set of random letters until it finds a set that matches the requested range (or it gives up). Ideally, I would have presets for the ranges that feel "easy", "medium", and "hard".
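A sketch of that generation loop, with all names illustrative rather than the game's actual API (in particular, the letter-set sampler here is a hypothetical stand-in):

```javascript
const ALPHABET = "abcdefghijklmnopqrstuvwxyz";

// Count dictionary words of the secret's length that can be spelled
// using only letters from the allowed set.
function countValidGuesses(words, letterSet, length) {
  return words.filter(
    (w) => w.length === length && [...w].every((ch) => letterSet.has(ch))
  ).length;
}

// Hypothetical letter-set sampler: the secret's own letters plus random
// extras up to the requested set size.
function randomLetterSet(secret, size, rng = Math.random) {
  const letters = new Set(secret);
  while (letters.size < size) {
    letters.add(ALPHABET[Math.floor(rng() * ALPHABET.length)]);
  }
  return letters;
}

// Resample letter sets until the match count lands in the requested
// range, or give up after maxTries.
function generatePuzzle(words, secret, setSize, minMatches, maxMatches, maxTries = 100) {
  for (let i = 0; i < maxTries; i++) {
    const letterSet = randomLetterSet(secret, setSize);
    const n = countValidGuesses(words, letterSet, secret.length);
    if (n >= minMatches && n <= maxMatches) return { secret, letterSet };
  }
  return null; // give up; the caller can retry with a new secret word
}
```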
This also doesn't capture the fact that some letter sequences are harder to come up with than others. Maybe this is a spurious observation. Or maybe it would be worthwhile to score the secret word on how common its 2-grams and 3-grams are.
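If the n-gram idea turned out to be worthwhile, a 2-gram version could look something like this (a sketch; the scoring function and its averaging are my own illustration, not anything implemented):

```javascript
// Count how often each letter 2-gram appears across the dictionary.
function bigramCounts(words) {
  const counts = {};
  for (const w of words) {
    for (let i = 0; i + 1 < w.length; i++) {
      const bg = w.slice(i, i + 2);
      counts[bg] = (counts[bg] || 0) + 1;
    }
  }
  return counts;
}

// Score a secret word by the average count of its bigrams: rarer
// sequences ("kd" in "weekday") pull the score down, suggesting a
// harder word. A fuller version might add 3-grams and frequency weights.
function bigramScore(word, counts) {
  let total = 0;
  let n = 0;
  for (let i = 0; i + 1 < word.length; i++) {
    total += counts[word.slice(i, i + 2)] || 0;
    n++;
  }
  return n ? total / n : 0;
}
```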
Design Space Modeling
Another approach would be to take inspiration from Adam Smith's "Design Space Modeling" concept and attempt to model the human player. This is what I thought of first, before developing the simpler heuristic described above. The model I had in mind was something along the lines of using the word frequency data to estimate how much effort/time an "average" human player would take to come up with a word that matches the known constraints, repeating until the model guesses the right word, and averaging over multiple runs. This is complicated by a few issues:
Only guessing words that match the known constraints is not always the best strategy. If you think you need more hints, then it might be better to try to guess a word that is likely to give as much information as possible, even if you're certain it's not the final answer. The human model needs to model a strategy (or strategies?), making it more complicated.
In order to make this into a single difficulty score, we need some way to combine the individual difficulties of coming up with each guess into the difficulty of the overall puzzle. Does a puzzle that takes 6 guesses that each take the player 10 seconds to come up with really feel as difficult as a puzzle that takes a single guess that takes a minute to come up with? Probably not. If the combination is non-linear, what's the right function?
How do we even meaningfully come up with a number for "how difficult it is to think up a word"? Probably some function of the frequency or rank in the common words list. But, then again, if we have some of the letters, maybe words that sound more like how they are spelled are easier to guess.
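Setting those complications aside, even a toy version of the simulated player conveys the shape of the idea. In this sketch the model simply tries candidate words in frequency order and pays a time cost that grows with how deep in the list each guess sits; the cost function, the always-guess-consistent strategy, and the lack of any constraint tracking are all deliberate simplifications:

```javascript
// Toy player model: try words of the right length in frequency order
// until hitting the secret. Each attempt costs log2(rank + 2) -- a
// hypothetical stand-in for "time to think of the word". A fuller model
// would track feedback constraints, model strategy, and average runs.
function estimateDifficulty(secret, wordsByFrequency) {
  let effort = 0;
  for (let rank = 0; rank < wordsByFrequency.length; rank++) {
    const w = wordsByFrequency[rank];
    if (w.length !== secret.length) continue;
    effort += Math.log2(rank + 2); // cost of thinking up this guess
    if (w === secret) return effort;
  }
  return Infinity; // secret not in the list at all
}
```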
In the end, I just simplified this whole model to "puzzles are easier if there are more valid guesses in the common words list". But perhaps a more advanced difficulty algorithm would use more of these ideas.
Any other ideas on difficulty metrics for Anagram Bagels puzzles? Let me know. Or clone the code and try them out.
Related to generating puzzles of a given difficulty: games often offer a sequence of levels of increasing difficulty, which would require fairly high-fidelity difficulty estimation to do meaningfully. On the other hand, it's unclear that's really necessary for this game to be fun to play.