A Weird Imagination

Title filtering for Liferea


Liferea is a desktop news aggregator (sometimes called an RSS reader). Unlike the late Google Reader and most of its alternatives, such as the open-source Tiny Tiny RSS, which are web-based and run on a server to be accessed through a web browser, Liferea is a standalone desktop application that uses an embedded browser to view content.

The problem

Sometimes you don't actually care about all of the items in a feed and the site provides no filtering mechanism. If the uninteresting items are rare enough, you can just ignore them, but a news aggregator is most useful if it only notifies you of news items you actually might want to read.

The solution

Luckily, Liferea is very flexible. It supports running a command on a feed, which it calls a conversion filter. I wrote some Python scripts to filter feeds by title locally.

For instance, I wanted to follow only the changelog posts in the forum feed http://braceyourselfgames.com/forums/feed.php, but it includes changes to all forum topics, so I checked the Use conversion filter option and set the conversion filter to

/path/to/atom_filter_title.py --whitelist "Re: Change log"
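The author's atom_filter_title.py is not shown in this excerpt. As a rough sketch of what such a conversion filter might look like, assuming Liferea pipes the downloaded feed to the command's standard input and reads the filtered feed back from its standard output, the following Python keeps only Atom entries whose title contains one of the whitelisted strings. The function name `filter_atom_by_title` and the substring-matching rule are assumptions made for illustration, not the original script's interface:

```python
import argparse
import sys
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def filter_atom_by_title(xml_text, whitelist):
    """Return the feed with only entries whose title contains a whitelist string."""
    # Keep Atom as the default namespace when serializing the result.
    ET.register_namespace("", ATOM_NS)
    root = ET.fromstring(xml_text)
    # findall() returns a list, so removing entries while looping is safe.
    for entry in root.findall(f"{{{ATOM_NS}}}entry"):
        title = entry.findtext(f"{{{ATOM_NS}}}title", default="")
        if not any(term in title for term in whitelist):
            root.remove(entry)
    return ET.tostring(root, encoding="unicode")


def main(argv):
    parser = argparse.ArgumentParser(description="Filter an Atom feed by title.")
    parser.add_argument("--whitelist", action="append", default=[],
                        help="keep entries whose title contains this text")
    args = parser.parse_args(argv)
    # The feed arrives on stdin; the filtered feed goes to stdout.
    feed = sys.stdin.buffer.read()
    sys.stdout.write(filter_atom_by_title(feed, args.whitelist))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Only act as a filter when options are given on the command line.
    main(sys.argv[1:])
```

Saved as an executable file, this sketch would be invoked the same way as the command above, with `--whitelist "Re: Change log"` as its arguments.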

Read more…

Setting up rTorrent


rTorrent is a text-based BitTorrent client, which makes it convenient to leave running in a screen or tmux session, so you don't have to leave a terminal window open and you can access it remotely over ssh. It also has an API for web frontends if you don't like text.

Basic setup

You can set it up to automatically start and stop downloads based on placing .torrent files into a watch/ directory by putting the following in your ~/.rtorrent.rc:

# Default session directory. Make sure you don't run multiple instances
# of rtorrent using the same session directory. Perhaps using a
# relative path?
session = ./session

# Watch a directory for new torrents, and stop those that have been
# deleted.
schedule = watch_directory,5,5,load_start=./watch/*.torrent
schedule = untied_directory,5,5,stop_untied=

Those settings also use a session directory to keep track of torrents across runs of rTorrent, which is useful if you have a lot of torrents and want to be able to restart rTorrent, say, after rebooting your computer. Note that rTorrent will complain if the session directory doesn't already exist, so your first run will look like

$ screen
$ mkdir session watch
$ rtorrent

That configuration uses relative paths for watch/ and session/ so you can have multiple instances of rTorrent in different directories.

magnet: links

In addition to .torrent files, BitTorrent also supports magnet: links as a way to join a torrent without needing a file. There is built-in support for magnet: links in rTorrent, but it requires a little extra work to make clicking one in a web browser start the download in rTorrent. Here's a script for doing so along with instructions for having your web browser use it to handle magnet: links. I modified it to handle multiple watch/ directories:

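#!/bin/bash

# Arguments: the magnet URI, then optionally a watch directory.
DEFAULT_WATCH='/path/to/your/watch'
if [[ $# -ge 2 ]]
then
    # Watch directory given explicitly as the second argument.
    WATCH="$2"
else
    if [[ -z "$DISPLAY" ]]
    then
        # No X display available, so fall back to the default.
        WATCH="$DEFAULT_WATCH"
    else
        # Ask the user to choose a watch directory.
        WATCH=$(zenity --file-selection --directory --title="Select rtorrent watch directory" --filename="$DEFAULT_WATCH")
        # Give up if the dialog was cancelled or the choice isn't named "watch".
        [[ "$(basename "$WATCH")" = watch ]] || exit
    fi
fi
# Don't write the file anywhere else if the directory is missing.
cd "$WATCH" || exit
# Extract the info hash to use in the file name.
[[ $1 =~ xt=urn:btih:([^&/]+) ]] || exit
# Write the magnet URI as a minimal bencoded .torrent file.
echo "d10:magnet-uri${#1}:${1}e" > "meta-${BASH_REMATCH[1]}.torrent"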

This script uses bash because it uses the bash-only =~ operator for regular expression matching.

This script has a hard-coded default directory, but it also accepts a different directory as the second argument; if none is given and a display is available, it uses zenity to show a dialog asking the user to select a watch/ directory. zenity is quite useful for easily adding interactivity to shell scripts, especially for something like a directory chooser which doesn't work as well in text.
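The last line of the script is the interesting part: it wraps the magnet URI in a tiny bencoded dictionary with a single magnet-uri key and saves it under a name derived from the info hash. A minimal Python sketch of just that step, where the function name `magnet_to_torrent` is made up for illustration and the length prefix is counted in bytes (which matches the script's character count for plain ASCII links), could look like this:

```python
import re


def magnet_to_torrent(magnet_uri):
    """Return (file name, file contents) for a magnet link placeholder torrent."""
    # Same pattern as the shell script: the info hash follows "xt=urn:btih:".
    match = re.search(r"xt=urn:btih:([^&/]+)", magnet_uri)
    if match is None:
        raise ValueError("not a BitTorrent magnet link")
    payload = magnet_uri.encode("utf-8")
    # Bencoded dictionary: d <10:magnet-uri> <length:uri> e
    body = b"d10:magnet-uri" + str(len(payload)).encode("ascii") + b":" + payload + b"e"
    return f"meta-{match.group(1)}.torrent", body
```

Writing the returned bytes to the returned file name inside a watch/ directory would have the same effect as the echo line in the script.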

SSH multiplexing


The problem

If you are making a lot of SSH connections, starting each connection can add noticeable overhead. Even worse, a firewall might start blocking the connections, as many SSH connections from the same source look a lot like an attacker trying to guess a password, as one of my officemates discovered recently.

The solution

SSH has a feature called multiplexing, which is described in this blog post, along with a few other useful SSH tips. Here's the relevant excerpt:

In a shell:

$ mkdir -p ~/.ssh/connections
$ chmod 700 ~/.ssh/connections

Add this to your ~/.ssh/config file:

Host *
ControlMaster auto
ControlPath ~/.ssh/connections/%r_%h_%p

The details

While ssh is often used as just a secure version of telnet, it's actually closer to being a VPN system, supporting many channels of communication over the same encrypted link, which is how port forwarding over SSH is implemented.

Normally SSH makes a connection and opens a single channel for the terminal. Multiplexing merely means keeping that connection open for additional terminal channels. The settings described tell SSH to keep track of open connections in ~/.ssh/connections/ and automatically reuse an open connection whenever possible.
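To make the ControlPath setting more concrete: ssh expands %r to the remote user name, %h to the remote host name, and %p to the remote port, so each user/host/port combination gets its own socket file. The following Python sketch mimics that expansion to predict where the socket would live and whether one currently exists. The helper names are hypothetical, and the simple string replacement is only an approximation of what ssh does internally:

```python
import os

# Same template as in the ~/.ssh/config excerpt above.
CONTROL_PATH = "~/.ssh/connections/%r_%h_%p"


def control_socket_path(user, host, port=22, template=CONTROL_PATH):
    """Approximate the socket path ssh would use for this connection."""
    expanded = (template.replace("%r", user)
                        .replace("%h", host)
                        .replace("%p", str(port)))
    return os.path.expanduser(expanded)


def has_open_master(user, host, port=22, template=CONTROL_PATH):
    """Guess whether a master connection is open by looking for its socket."""
    return os.path.exists(control_socket_path(user, host, port, template))
```

For a definitive answer, ssh itself can be asked with `ssh -O check host`, which reports whether a master connection is running.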

The firewall

The firewall that caused this post to get written was keeping track of how many new SSH connections were made to a host and only allowed a maximum of 3 new connections each minute. As the firewall was not paying attention to whether the connections were accepted, my officemate's script, which performed multiple copies and remote commands, was getting blocked.
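To see why the script tripped the limit, here is a small simulation of such a rule. It assumes a sliding 60-second window that counts only the attempts it lets through; the real firewall's bookkeeping is not described in this post, and the class name `NewConnectionLimiter` is invented for the sketch:

```python
from collections import deque


class NewConnectionLimiter:
    """Allow at most `limit` new connections per `window` seconds (assumed rule)."""

    def __init__(self, limit=3, window=60.0):
        self.limit = limit
        self.window = window
        self._times = deque()  # timestamps of recently allowed connections

    def allow(self, now):
        """Return True if a new connection at time `now` (seconds) is allowed."""
        # Forget connections that have aged out of the window.
        while self._times and now - self._times[0] >= self.window:
            self._times.popleft()
        if len(self._times) >= self.limit:
            return False
        self._times.append(now)
        return True
```

A script that opens a fresh connection for each of four quick commands has its fourth attempt refused under this rule, whereas with multiplexing only the first command creates a new connection at all.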

Logging online status


The problem

I used to have an occasionally unreliable internet connection. I wanted logs of exactly how unreliable it was and an easy way to notice when it was back up.

The solution

Use cron to check online status once a minute and write the result to a file. An easy way to check is to confirm that google.com will reply to a ping (this does give a false negative in the unlikely event that Google is down).

To run a script every minute, put a file in /etc/cron.d containing the line

* * * * * root /root/bin/online-check

where /root/bin/online-check is the following script:

#!/bin/sh

# Check if computer is online by attempting to ping google.com.
PING_RESULT="`ping -c 2 google.com 2>/dev/null`"
if [ $? -eq 0 ] && ! echo "$PING_RESULT" | grep -F '64 bytes from 192.168.' >/dev/null 2>/dev/null
then
    ONLINE="online"
else
    ONLINE="offline"
fi
echo "`date '+%Y-%m-%d %T%z'` $ONLINE" >> /var/log/online.log
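Each run of the script appends one line such as `2015-03-01 12:02:01-0800 offline` to the log (that timestamp is a made-up example). As a rough sketch of how the log could be summarized, and not necessarily how the rest of this post does it, the following Python reduces the minute-by-minute lines to just the moments when the status changed. The function names are hypothetical:

```python
from datetime import datetime


def parse_line(line):
    """Split a log line into (timezone-aware datetime, status string)."""
    stamp, status = line.rsplit(" ", 1)
    # Matches the script's date format '+%Y-%m-%d %T%z'.
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S%z"), status


def status_changes(lines):
    """Return (datetime, status) pairs for each line where the status flipped."""
    changes = []
    previous = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        when, status = parse_line(line)
        if status != previous:
            changes.append((when, status))
            previous = status
    return changes
```

Passing it the lines of /var/log/online.log, for example via `open("/var/log/online.log")`, gives a compact list of when the connection went down and came back.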

The details and pretty printing

Read more…

Child process not in ps?


A buggy program

Consider the following (contrived) program which starts a background process to create a file and then waits while the background process is still running before checking to see if the file exists:

#!/bin/sh

# Make sure file doesn't exist.
rm -f file

# Create file in a background process.
touch file &
# While there is a touch process running...
while ps -C "touch" > /dev/null
do
    # ... wait one second for it to complete.
    sleep 1
done
# Check if file was created.
if [ -f file ]
then
    echo "Of course it worked."
else
    echo "Huh? File wasn't created."
    # Wait for background tasks to complete.
    wait
    if [ -f file ]
    then
        echo "Now it's there!"
    else
        echo "File never created."
    fi
fi

# Clean up.
rm -f file

Naturally, it will always output "Of course it worked.", right? Run it in a terminal yourself to confirm this. But I claimed this program is buggy; there's more going on.

Read more…

Out of inodes, what now?


When you start getting disk full messages on Linux, there are a few different reasons why that might happen:

  1. The expected. Too many large files. You can track down large directories using WinDirStat or

    du -hx --max-depth=1 | sort -h
    where the -x option tells du not to cross filesystem boundaries and the -h option to both commands uses human-readable sizes like 11M or 1G.

  2. Deleted files aren't actually deleted if they are still open. You can use lsof to find open files. Give it the filesystem as an argument like lsof /home.

  3. By default 5% of each filesystem is reserved for writes by root. Depending on what the filesystem is being used for, this may be too much or simply unnecessary. See this Server Fault answer for how to deal with this.

  4. The files could be shadowed by a mount. If a filesystem is mounted over a non-empty directory, the files in that directory aren't visible.

  5. Last, the disk might not actually be out of space at all. It might instead be out of inodes. Some filesystems, notably the ext2/3/4 filesystems used by default on most Linux distributions, have a fixed number of inodes allocated at filesystem creation time. The default is high enough that it is unlikely to be an issue unless there are a very large number of empty files. df -i will show the number of inodes free on each filesystem to verify if a filesystem is indeed out of inodes.

    But how do you find those empty files? As described above, du will help find large files, but now we want to find large numbers of files. The following command acts like du -hx --max-depth=$depth | sort -h for inodes instead of file sizes:

    find -xdev | sed "s@\(\([^/]*/\)\{$depth\}[^/]*\).*@\1@" | uniq -c | sort -n
    

    find -xdev lists all of the files under the current directory on the same filesystem. The sed command finds the first $depth directories (ending in /) and discards the rest of the filename (the .* at the end), so each directory appears once for every file or directory anywhere under it. Then the end of the command counts the repeated lines and sorts by those counts, highlighting the directories with the most files.
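The find pipeline in the last item can be approximated in Python when sed is not at hand. This is only a sketch under a few assumptions: the function name `count_entries` is made up, each file or directory is counted as one entry (hard links are not deduplicated), and the starting directory itself is not counted:

```python
import os
from collections import Counter


def count_entries(root=".", depth=1):
    """Count files and directories under each path `depth` levels below root."""
    root_dev = os.lstat(root).st_dev
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Group by the first `depth` path components, like the sed expression.
            key = os.sep.join(rel.split(os.sep)[:depth])
            counts[key] += 1
        # Like find -xdev: list mount points but do not descend into them.
        dirnames[:] = [d for d in dirnames
                       if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev]
    return counts
```

Calling `count_entries("/home", depth=1).most_common(10)` would then list the ten first-level directories holding the most entries.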

Transferring many small files


The problem

Transferring many small files is much slower than you would expect given their total size.

The solution

tar cf - directory | pv -abrt | ssh target 'cd destination && tar xf -'

or

cd destination && ssh source 'tar cf - directory' | pv -abrt | tar xf -
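Both commands send the whole directory as one continuous tar stream rather than transferring it file by file. The same idea can be sketched with Python's tarfile module in its streaming modes ("w|" and "r|"), which write and read an archive sequentially without seeking. The function names here are hypothetical, and this only illustrates the packing and unpacking ends, not the ssh transport in between:

```python
import os
import tarfile


def pack_stream(directory, out_stream):
    """Write `directory` to out_stream as one uncompressed tar stream."""
    name = os.path.basename(os.path.normpath(directory))
    with tarfile.open(fileobj=out_stream, mode="w|") as archive:
        archive.add(directory, arcname=name)


def unpack_stream(in_stream, destination):
    """Extract a tar stream read sequentially from in_stream into destination."""
    with tarfile.open(fileobj=in_stream, mode="r|") as archive:
        archive.extractall(destination)
```

Here `out_stream` and `in_stream` can be any binary file-like objects, such as the two ends of a pipe or socket.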

The details

Read more…
