A Weird Imagination

Shell script over characters

Posted in

The problem#

I wanted to find a specific Unicode character I had used somewhere in some previous blog post. But by the nature of not knowing exactly what character it was, I wasn't sure how to search for it.

The solution#

Instead, I wrote a script based on this post to simply list all of the characters appearing in any file in a given directory:

cat * | sed 's/./&\n/g' | sort -u

Although the output is small, to further reduce the noise, this version strips out the English letters, numbers, and common symbols:

cat * \
    | tr -d 'a-zA-Z0-9!@#$%^&*()_+=`~,./?;:"[]{}<>|\\'"'-" \
    | sed 's/./&\n/g' | sort -u

The details#

Read more…

Checking for unsafe shell constructs

Posted in

Filenames are troublesome#

While shell programing lets you write very concise programs, it turns out that the primary use case of working with files is unfortunately much harder than it seems. That detailed article by David A. Wheeler does a good job of explaining all of the various problems that a naive shell script can run into due to various characters which are allowed in filenames which the shell treats specially in some way.

One surprising one is that filenames beginning with a dash (-) can be interpreted as options due to the way globbing works in the shell. Suppose we set up a directory as follows:

$ cat > -n
Some secret text.
$ cat > test
This is a test.
It has multiple lines.

Quick, what will cat * do here?

$ cat *
     1  This is a test.
     2  It has multiple lines.

Probably not what you wanted. The reason that happens is that the * is expanded by the shell before being fed to cat, so the command executed is cat -n test and -n gets interpreted not as a filename but as an option telling cat to number the lines of the output.

The workaround is to use ./* instead of *, so the - will not actually be the first character and therefore will not get misinterpreted as an option. But there are many other things that can go wrong with unexpected filenames and remembering to handle all of them everywhere is error-prone.

Warnings for unsafe shell code#

The solution is shellcheck. shellcheck will warn you about mistakes like the cat * problem and many other issues you may not be aware of.

As I have many shellscripts around that I wrote before learning about shellcheck, I wanted to run it on all of the shell scripts (but not binaries or other language scripts) in my ~/bin directory, so naturally I wrote a script to do so:

#!/bin/sh

find -exec file {} \; \
    | grep -F 'shell script' \
    | sed s/:[^:]*$// \
    | xargs shellcheck

This uses the file command to identify shell scripts and then selects out their file names to run shellcheck on all of them using xargs.

Warnings in Vim#

shellcheck is written to support integration into IDEs. I use Vim to edit shell scripts, so I installed the syntastic (using Vundle which makes installing Vim plugins off GitHub very easy). Note to follow the instructions on the Syntastic page, specifically the recommended settings: without any settings it won't do anything at all. Once set up, it automatically runs shellcheck on every save, identifies lines with warnings and shows a list of warnings that can be double-clicked to jump to the location of the warning.

If you use the other text editor, then the shellcheck website recommends the flycheck plugin.