A Weird Imagination

Better hash-based colors

The problem#

Yesterday, I proposed using a hash function to choose colors:

ps1_color="32;38;5;$((0x$(hostname | md5sum | cut -f1 -d' ' | tr -d '\n' | tail -c2)))"

I did not discuss two issues with this approach:

  1. Colors may be too similar to other colors, such that they are not useful.
  2. Some colors may be undesirable altogether, particularly very dark ones may be too similar to a black terminal background color.

Read more…

Hash-based hostname colors

Random color selection#

In my post about hostname-based prompt colors, I suggested a fallback color scheme that was obviously wrong in order to remind you to set a color for that host:

alice@unknown:~$ 

This carried with it an implicit assumption: you care what color each host is assigned. You may instead be happy to assign a random color to each host. We could use shuf to generate a random color:

ps1_color="32;38;5;$(shuf -i 0-255 -n 1)"

The problem with this solution is the goal of the recoloring the prompt was not simply to make it more colorful, but for that color to have meaning. We want the color to always be the same for each login to a given host.

One way to accomplish this would be to use that code to randomly generate colors, but save the results in a table like the one used before for manually-chosen colors. But it turns out we can do better.

Hash-based color selection#

Hash functions have a useful property called determinism, which means that hashing the same value will always get the same result. The consequence is that we can use a hash function like it's a lookup table of random numbers shared among all of our computers:

ps1_color="32;38;5;$(($(hostname | sum | cut -f1 -d' ' | sed s/^0*//) % 256))"

The $((...)) syntax is bash's replacement for expr which is less portable but easier to use. Here we use it to make sure the hash value we compute is a number between 0 and 255. [sum][sum] computes a hash of its input, in this case the result of hostname. Its output is not just a number so cut selects out the number and sed gets rid of any leading zeros so it isn't misinterpreted as octal.

The idea of using sum was suggested by a friend after reading my previous post on the topic.

But this turns out to not work great for hosts with similar names like rob.example.com and orb.example.com:

alice@rob:~$ 
alice@orb:~$ 

Similar colors on hosts with very different names would not be so bad, but because of how sum works, it will tend to give similar results on similar strings (although less often than I expected; it took some effort to find such an example).

Better hash functions#

While this is not a security-critical application, here cryptographic hash functions solve the problem. Cryptographic hash functions guarantee (in theory) that knowing that two inputs are similar tells you nothing about their hash values. In other words, the output of cryptographic hash functions are indistinguishable from random and, in fact, they can be used to build pseudorandom generators like Linux's /dev/urandom.

The cryptographic hash function utilities output hex instead of decimal, so they aren't quite a drop-in replacement for sum:

ps1_color="32;38;5;$((0x$(hostname | md5sum | cut -f1 -d' ' | tr -d '\n' | tail -c2)))"

Here we use cut and tr to select just the hex string of the hash. tail's -c option specifies the number of bytes to read from the end, where 2 bytes corresponds to 2 hex digits, which can have a value of 0 to 255, so the modulo operation is not needed. Instead the 0x prefix inside $((...)) interprets the string as a hex number and outputs it as a decimal number.

This code uses the md5sum utility to compute an MD5 hash of the hostname. This is recommended because md5sum is likely to be available on all hosts. Do be aware that MD5 is insecure and it is only okay to use here because coloring the prompt is not a security-critical application.

sha1sum and sha256sum are also likely available on modern systems and work as drop-in replacements for md5sum in the above command should you wish to use a different hash. Additionally, you could also get different values out of the hash by adding a salt:

salt="Some string."
ps1_color="32;38;5;$((0x$( (echo "$salt"; hostname) | sha256sum | cut -f1 -d' ' | tr -d '\n' | tail -c2)))"

Floats in shell

The problem#

Given a file which contains a list of floating point numbers in IEEE 754 single-precision format stored in big endian byte order, how do you view and manipulate this data using command-line tools? This is an actual problem one of my officemates had.

The solution#

$ od --endian=big -f file
0000000   1.7155696e-07   1.0432226e-08    4.563314e+30    6.162976e-33

Read more…

Type your SSH passphrase less often

Posted in

The problem#

SSH public key authenication can make SSH much more convenient because you do not need to type a password for every SSH connect. Instead, in common usage, the first time you connect, GNOME Keyring, KWallet, or your desktop environment's equivalent will pop up and offer to keep your decrypted private key securely in memory. Those programs will remember your key until the next time you reboot your computer (or possibly until you log out completely and log back in).

But those are tied to your desktop environment. If you are not at a GUI, either using a computer in text-mode using a console or connecting over SSH, then you do not have access to those programs.

Read more…

256 color hostnames

The problem#

When using multiple terminals on different hosts, it can sometimes be confusing to remember which host you are on. The hostname appears in the command prompt, but it's easy to skim past that if you are not paying attention.

One solution that works pretty well for me is recoloring the prompt based on what host I am on. This is in fact why I researched how to get 256 colors terminals working in the first place: in order to have enough colors to be able to make a good choice for each host I use frequently.

Read more…

256 color terminals

The problem#

By default, terminals on Linux only use 8 colors (or 16 if setup to use bright variants instead of bold text). Everything else on a modern computer uses 24-bit color, allowing for millions of colors. More colors in the terminal would allow for better syntax highlighting and color output of various commands to be more readable.

In practice, while a few terminals support full 24-bit RGB color (at least Konsole does), it is not widespread enough to be used much. On the other hand, most terminals support 256 colors, which is significantly better than just 8.

Read more…

Compile on save

Posted in

The problem#

When developing code or creating visual artifacts in non-WYSIWYG systems, it is very useful to constantly be aware of the output of the compiler and the appearance of the artifact you are creating, whether it is a GUI, a chart, a graph, or a paper. The common way of doing this is to have an IDE specialized for the system you are using; for example, LyX provides a WYSIWYG editor for LaTeX. Similarly, there may be plugins for your text editor to support whatever kind of development you are doing. On the other hand, we can use the shell to create a solution independent of the text editor and the availability of plugins for the particular system being developed for.

Read more…

Checking for unsafe shell constructs

Posted in

Filenames are troublesome#

While shell programing lets you write very concise programs, it turns out that the primary use case of working with files is unfortunately much harder than it seems. That detailed article by David A. Wheeler does a good job of explaining all of the various problems that a naive shell script can run into due to various characters which are allowed in filenames which the shell treats specially in some way.

One surprising one is that filenames beginning with a dash (-) can be interpreted as options due to the way globbing works in the shell. Suppose we set up a directory as follows:

$ cat > -n
Some secret text.
$ cat > test
This is a test.
It has multiple lines.

Quick, what will cat * do here?

$ cat *
     1  This is a test.
     2  It has multiple lines.

Probably not what you wanted. The reason that happens is that the * is expanded by the shell before being fed to cat, so the command executed is cat -n test and -n gets interpreted not as a filename but as an option telling cat to number the lines of the output.

The workaround is to use ./* instead of *, so the - will not actually be the first character and therefore will not get misinterpreted as an option. But there are many other things that can go wrong with unexpected filenames and remembering to handle all of them everywhere is error-prone.

Warnings for unsafe shell code#

The solution is shellcheck. shellcheck will warn you about mistakes like the cat * problem and many other issues you may not be aware of.

As I have many shellscripts around that I wrote before learning about shellcheck, I wanted to run it on all of the shell scripts (but not binaries or other language scripts) in my ~/bin directory, so naturally I wrote a script to do so:

#!/bin/sh

find -exec file {} \; \
    | grep -F 'shell script' \
    | sed s/:[^:]*$// \
    | xargs shellcheck

This uses the file command to identify shell scripts and then selects out their file names to run shellcheck on all of them using xargs.

Warnings in Vim#

shellcheck is written to support integration into IDEs. I use Vim to edit shell scripts, so I installed the syntastic (using Vundle which makes installing Vim plugins off GitHub very easy). Note to follow the instructions on the Syntastic page, specifically the recommended settings: without any settings it won't do anything at all. Once set up, it automatically runs shellcheck on every save, identifies lines with warnings and shows a list of warnings that can be double-clicked to jump to the location of the warning.

If you use the other text editor, then the shellcheck website recommends the flycheck plugin.

SSH multiplexing options

Posted in

In my previous post on SSH multiplexing, I gave the following to add to your ~/.ssh/config file without explaining what it actually means:

Host *
ControlMaster auto
ControlPath ~/.ssh/connections/%r_%h_%p

The documentation for ~/.ssh/config can be found at ssh_config(5). Four options are relevant to this post:

Host
The config file is broken up in to sections based on which hosts the configuration options apply to. Host * means these options apply to all connections. If you wanted the options to apply only when connecting to example.com, you could change that line to Host example.com.
Actually, you can also limit configuration options by things other than just the host using the Match directive. For example, this configuration has options for connecting with the username git, presumably due to having multiple git servers that use that username.
ControlMaster
Tells ssh to use multiplexing. Specifically, the default is no, which means it will look for an already open master connection. To actually open a master connection, yes or ask can be used, the latter means that a password prompt will appear when connecting to that master connection. The more useful options for a config file are the auto and autoask options which will use an already open connection if exists, but fall back to acting like yes and ask respectively otherwise.
ControlPath
In order to connect to the master connection, ssh needs a way to communicate with it. This is handled by the master creating a Unix socket which future ssh instances look for. Unix sockets are an IPC mechanism which allows two processes on the same machine to communicate via a connection initiated by one process creating a socket identified by a filename and another using that special file to connect. In comparison with TCP, every server needs its own port number that the client needs to know and any client can connect as long as it knows the port number.
Unix sockets are identified by filenames and ControlPath specifies the filename to use for the socket The %r, %h, %p parts mean the filename should include the remote username, hostname, and port number in order to identify which ssh session is which.
This should usually be enough, but if your home directory is shared among multiple computers, as is common in some university and other large organization setups, then you will also need %l to identify which host you are connecting from. Otherwise ssh may get confused by master connections created by a different host. Luckily, ssh provides a shortcut, which is the %C option which is a hash of all 4 (although it is not available on older versions of ssh):
ControlPath ~/.ssh/connections/%C

or, if you are a disto which does not have %C yet like the latest Ubuntu LTS:

ControlPath ~/.ssh/connections/%L_%h_%p_%r

I used %L for the short version of the local hostname (for example, if %l is foo.example.com, %L would be just foo) because when I used %l, my system complained the filename was too long.

Keep in mind that the socket file is security critical because it is used to piggyback on your existing ssh sessions without authenticating (unless you use the ask or autoask options for ControlMaster), so make sure your ~/.ssh/connections/ directory is readable only by you:

chmod 700 ~/.ssh/connections
ControlPersist
Not used above, the ControlPersist option lets you control when the master connection actually closed. When set to no, it closes with the initial connect. When set to yes it stays open until explicitly closed with ssh -O exit. It can also be set to a length of time to stay open after the last connection is closed.
While the default of ControlPersist is not clearly stated in the documentation, I checked the source code to confirm it does default to no: the default value is set to 0 here if it is still set to its initial value of -1, which is the same value it is given if the configuration file says no.

Twitter via RSS

Posted in

Twitter no longer offers an RSS feed. That thread offers a few workarounds which involve external or non-free services or require creating a Twitter account. One of those external services, TwitRSS.me is open-source with its code on GitHub. This code can be run locally to view Twitter streams in Liferea (or any other news aggregator) without relying on an external service.

Specifically, the Perl script twitter_user_to_rss.pl is the relevant part. It's intended to be used on a webserver, so the output includes HTTP headers:

Content-type: application/rss+xml
Cache-control: max-age=1800

<?xml version="1.0" encoding="UTF-8"?>
...

which can be cleaned out with tail in the script twitter_user_to_rss_file, which assumes it's in the same directory as twitter_user_to_rss.pl:

#!/bin/sh
"$(dirname "$0")/twitter_user_to_rss.pl" "user=$1&replies=1" \
    | tail -n +4

twitter_user_to_rss_file also handles the argument format of the script, so it just takes a single argument which is the Twitter username. The replies=1 part tells the script to use the Tweets & replies view which includes tweets that begin with @.

When creating a subscription in Liferea, the advanced options include a choice of source type. To use the script, set the source type to Command and the source to

/path/to/twitter_user_to_rss_file username

My version of twitter_user_to_rss.pl includes a few differences from the original that make it a bit more usable. Most importantly, links are made into actual links (based on this code), images are included in the feed content, tweets are marked with their creator to make it easier to follow retweets and combinations of tweets from multiple feeds together in a single stream.