A Weird Imagination

Floats in shell

The problem#

Given a file which contains a list of floating point numbers in IEEE 754 single-precision format stored in big endian byte order, how do you view and manipulate this data using command-line tools? This is an actual problem one of my officemates had.

The solution#

$ od --endian=big -f file
0000000   1.7155696e-07   1.0432226e-08    4.563314e+30    6.162976e-33

Read more…

Type your SSH passphrase less often

Posted in

The problem#

SSH public key authenication can make SSH much more convenient because you do not need to type a password for every SSH connect. Instead, in common usage, the first time you connect, GNOME Keyring, KWallet, or your desktop environment's equivalent will pop up and offer to keep your decrypted private key securely in memory. Those programs will remember your key until the next time you reboot your computer (or possibly until you log out completely and log back in).

But those are tied to your desktop environment. If you are not at a GUI, either using a computer in text-mode using a console or connecting over SSH, then you do not have access to those programs.

Read more…

256 color hostnames

The problem#

When using multiple terminals on different hosts, it can sometimes be confusing to remember which host you are on. The hostname appears in the command prompt, but it's easy to skim past that if you are not paying attention.

One solution that works pretty well for me is recoloring the prompt based on what host I am on. This is in fact why I researched how to get 256 colors terminals working in the first place: in order to have enough colors to be able to make a good choice for each host I use frequently.

Read more…

256 color terminals

The problem#

By default, terminals on Linux only use 8 colors (or 16 if setup to use bright variants instead of bold text). Everything else on a modern computer uses 24-bit color, allowing for millions of colors. More colors in the terminal would allow for better syntax highlighting and color output of various commands to be more readable.

In practice, while a few terminals support full 24-bit RGB color (at least Konsole does), it is not widespread enough to be used much. On the other hand, most terminals support 256 colors, which is significantly better than just 8.

Read more…

Compile on save

Posted in

The problem#

When developing code or creating visual artifacts in non-WYSIWYG systems, it is very useful to constantly be aware of the output of the compiler and the appearance of the artifact you are creating, whether it is a GUI, a chart, a graph, or a paper. The common way of doing this is to have an IDE specialized for the system you are using; for example, LyX provides a WYSIWYG editor for LaTeX. Similarly, there may be plugins for your text editor to support whatever kind of development you are doing. On the other hand, we can use the shell to create a solution independent of the text editor and the availability of plugins for the particular system being developed for.

Read more…

Checking for unsafe shell constructs

Posted in

Filenames are troublesome#

While shell programing lets you write very concise programs, it turns out that the primary use case of working with files is unfortunately much harder than it seems. That detailed article by David A. Wheeler does a good job of explaining all of the various problems that a naive shell script can run into due to various characters which are allowed in filenames which the shell treats specially in some way.

One surprising one is that filenames beginning with a dash (-) can be interpreted as options due to the way globbing works in the shell. Suppose we set up a directory as follows:

$ cat > -n
Some secret text.
$ cat > test
This is a test.
It has multiple lines.

Quick, what will cat * do here?

$ cat *
     1  This is a test.
     2  It has multiple lines.

Probably not what you wanted. The reason that happens is that the * is expanded by the shell before being fed to cat, so the command executed is cat -n test and -n gets interpreted not as a filename but as an option telling cat to number the lines of the output.

The workaround is to use ./* instead of *, so the - will not actually be the first character and therefore will not get misinterpreted as an option. But there are many other things that can go wrong with unexpected filenames and remembering to handle all of them everywhere is error-prone.

Warnings for unsafe shell code#

The solution is shellcheck. shellcheck will warn you about mistakes like the cat * problem and many other issues you may not be aware of.

As I have many shellscripts around that I wrote before learning about shellcheck, I wanted to run it on all of the shell scripts (but not binaries or other language scripts) in my ~/bin directory, so naturally I wrote a script to do so:

#!/bin/sh

find -exec file {} \; \
    | grep -F 'shell script' \
    | sed s/:[^:]*$// \
    | xargs shellcheck

This uses the file command to identify shell scripts and then selects out their file names to run shellcheck on all of them using xargs.

Warnings in Vim#

shellcheck is written to support integration into IDEs. I use Vim to edit shell scripts, so I installed the syntastic (using Vundle which makes installing Vim plugins off GitHub very easy). Note to follow the instructions on the Syntastic page, specifically the recommended settings: without any settings it won't do anything at all. Once set up, it automatically runs shellcheck on every save, identifies lines with warnings and shows a list of warnings that can be double-clicked to jump to the location of the warning.

If you use the other text editor, then the shellcheck website recommends the flycheck plugin.

SSH multiplexing options

Posted in

In my previous post on SSH multiplexing, I gave the following to add to your ~/.ssh/config file without explaining what it actually means:

Host *
ControlMaster auto
ControlPath ~/.ssh/connections/%r_%h_%p

The documentation for ~/.ssh/config can be found at ssh_config(5). Four options are relevant to this post:

Host
The config file is broken up in to sections based on which hosts the configuration options apply to. Host * means these options apply to all connections. If you wanted the options to apply only when connecting to example.com, you could change that line to Host example.com.
Actually, you can also limit configuration options by things other than just the host using the Match directive. For example, this configuration has options for connecting with the username git, presumably due to having multiple git servers that use that username.
ControlMaster
Tells ssh to use multiplexing. Specifically, the default is no, which means it will look for an already open master connection. To actually open a master connection, yes or ask can be used, the latter means that a password prompt will appear when connecting to that master connection. The more useful options for a config file are the auto and autoask options which will use an already open connection if exists, but fall back to acting like yes and ask respectively otherwise.
ControlPath
In order to connect to the master connection, ssh needs a way to communicate with it. This is handled by the master creating a Unix socket which future ssh instances look for. Unix sockets are an IPC mechanism which allows two processes on the same machine to communicate via a connection initiated by one process creating a socket identified by a filename and another using that special file to connect. In comparison with TCP, every server needs its own port number that the client needs to know and any client can connect as long as it knows the port number.
Unix sockets are identified by filenames and ControlPath specifies the filename to use for the socket The %r, %h, %p parts mean the filename should include the remote username, hostname, and port number in order to identify which ssh session is which.
This should usually be enough, but if your home directory is shared among multiple computers, as is common in some university and other large organization setups, then you will also need %l to identify which host you are connecting from. Otherwise ssh may get confused by master connections created by a different host. Luckily, ssh provides a shortcut, which is the %C option which is a hash of all 4 (although it is not available on older versions of ssh):
ControlPath ~/.ssh/connections/%C

or, if you are a disto which does not have %C yet like the latest Ubuntu LTS:

ControlPath ~/.ssh/connections/%L_%h_%p_%r

I used %L for the short version of the local hostname (for example, if %l is foo.example.com, %L would be just foo) because when I used %l, my system complained the filename was too long.

Keep in mind that the socket file is security critical because it is used to piggyback on your existing ssh sessions without authenticating (unless you use the ask or autoask options for ControlMaster), so make sure your ~/.ssh/connections/ directory is readable only by you:

chmod 700 ~/.ssh/connections
ControlPersist
Not used above, the ControlPersist option lets you control when the master connection actually closed. When set to no, it closes with the initial connect. When set to yes it stays open until explicitly closed with ssh -O exit. It can also be set to a length of time to stay open after the last connection is closed.
While the default of ControlPersist is not clearly stated in the documentation, I checked the source code to confirm it does default to no: the default value is set to 0 here if it is still set to its initial value of -1, which is the same value it is given if the configuration file says no.

Twitter via RSS

Posted in

Twitter no longer offers an RSS feed. That thread offers a few workarounds which involve external or non-free services or require creating a Twitter account. One of those external services, TwitRSS.me is open-source with its code on GitHub. This code can be run locally to view Twitter streams in Liferea (or any other news aggregator) without relying on an external service.

Specifically, the Perl script twitter_user_to_rss.pl is the relevant part. It's intended to be used on a webserver, so the output includes HTTP headers:

Content-type: application/rss+xml
Cache-control: max-age=1800

<?xml version="1.0" encoding="UTF-8"?>
...

which can be cleaned out with tail in the script twitter_user_to_rss_file, which assumes it's in the same directory as twitter_user_to_rss.pl:

#!/bin/sh
"$(dirname "$0")/twitter_user_to_rss.pl" "user=$1&replies=1" \
    | tail -n +4

twitter_user_to_rss_file also handles the argument format of the script, so it just takes a single argument which is the Twitter username. The replies=1 part tells the script to use the Tweets & replies view which includes tweets that begin with @.

When creating a subscription in Liferea, the advanced options include a choice of source type. To use the script, set the source type to Command and the source to

/path/to/twitter_user_to_rss_file username

My version of twitter_user_to_rss.pl includes a few differences from the original that make it a bit more usable. Most importantly, links are made into actual links (based on this code), images are included in the feed content, tweets are marked with their creator to make it easier to follow retweets and combinations of tweets from multiple feeds together in a single stream.

Title filtering for Liferea

Posted in

Liferea is a desktop news aggregator (sometimes called an RSS reader). Unlike the late Google Reader or most of its alternatives like the open-source Tiny Tiny RSS which are web-based and run on a server to be accessed via a web browser, Liferea is a separate desktop application and uses an embedded browser to view content.

The problem#

Sometimes you don't actually care about all of the items in a feed and the site provides no filtering mechanism. If the uninteresting items are rare enough, you can just ignore them, but a news aggregator is most useful if it only notifies you of news items you actually might want to read.

The solution#

Luckily, Liferea is very flexible. It supports running a command on a feed which it calls a conversion filter. I wrote some python scripts to filter feeds by title locally.

For instance, I wanted to follow only the changelog posts in the forum feed http://braceyourselfgames.com/forums/feed.php, but it includes changes to all forum topics, so I checked the Use conversion filter option and set the conversion filter to

/path/to/atom_filter_title.py --whitelist "Re: Change log"

Read more…

Setting up rTorrent

Posted in

rTorrent is a text-based BitTorrent client, which makes it convenient to leave running in a screen or tmux session, so you don't have to leave a terminal window open and you can access it remotely over ssh. It also has an API for web frontends if you don't like text.

Basic setup#

You can set it up to automatically start and stop downloads based on placing .torrent files into a watch/ directory by putting the following in your ~/.rtorrent.rc:

# Default session directory. Make sure you don't run multiple instance
# of rtorrent using the same session directory. Perhaps using a
# relative path?
session = ./session

# Watch a directory for new torrents, and stop those that have been
# deleted.
schedule = watch_directory,5,5,load_start=./watch/*.torrent

Those settings also use a session directory to keep track of torrents across runs of rTorrent, which is useful if you have a lot of torrents and want to be able to restart rTorrent, say, after rebooting your computer. Note rTorrent will complain if the session directory doesn't already exist, so your first run will look like

$ screen
$ mkdir session watch
$ rtorrent

That configuration uses relative paths for watch/ and session/ so you can have multiple instances of rTorrent in different directories.

magnet: links#

In additional to .torrent files, BitTorrent also supports magnet: links as a way to join a torrent without needing a file. There is built-in support for magnet: links in rTorrent, but it requires a little extra work to make clicking one in a web browser start the download in rTorrent. Here's a script for doing so along with instructions for having your web browser use it to handle magnet: links. I modified it to handle multiple watch/ directories:

#!/bin/bash

DEFAULT_WATCH='/path/to/your/watch'
if [[ $# -ge 2 ]]
then
    WATCH="$2"
else
    if [[ -z "$DISPLAY" ]]
    then
        WATCH="$DEFAULT_WATCH"
    else
        WATCH=$(zenity --file-selection --directory --title="Select rtorrent watch directory" --filename="$DEFAULT_WATCH")
        [[ "$(basename "$WATCH")" = watch ]] || exit;
    fi
fi
cd "$WATCH"
[[ $1 =~ xt=urn:btih:([^&/]+) ]] || exit;
echo "d10:magnet-uri${#1}:${1}e" > "meta-${BASH_REMATCH[1]}.torrent"

This script uses bash because it uses the bash-only =~ operator for regular expression matching.

This script has a hard-coded default directory to use, but supports either specifying a different directory as the second argument or will use zenity to show a dialog asking the user to select a watch/ directory. zenity is quite useful for easily adding interactivity to shell scripts, especially for something like a directory chooser which doesn't work as well in text.