The problem#
The command line is an expressive interface that allows powerful
commands to be written concisely. Sometimes you want a longer, less
direct way of implementing a task. For example, merely writing wc -l
is far too straightforward for counting lines in a file. Surely we can
devise a more convoluted way to accomplish that task.
The solution#
cat "$file" |
expr $(od -t x1 |
sed 's/ /\n/g' |
grep '^0a$' |
sed -z 's/\n//g' |
wc -c) / 2
The details#
Inspired by the concept of Rube Goldberg machines, sh Rube Goldbergs are a silly game to explore the capabilities of commands that you might otherwise not encounter. I cannot take full credit/blame for the idea: it was suggested by one of my officemates who also came up with some of the examples in this post.
Simpler examples#
Before explaining the example above, I'll first cover a couple of shorter
Rube Goldbergs for wc -l, which inspired that longer one:
cat "$file" | tr -cd '\n' | wc -c
uses tr to delete (-d) every character that is not (-c) a newline and
then counts the remaining characters using wc.
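A quick way to convince yourself that this agrees with wc -l is to run both on a throwaway file. The sketch below is only an illustration: the temporary file and its three lines are made up for the purpose and are not from the examples above.

```shell
# Create a throwaway three-line file (contents are made up for illustration).
file=$(mktemp)
printf 'one\ntwo\nthree\n' > "$file"

# Delete (-d) the complement (-c) of the newline character, then count bytes.
tr_count=$(cat "$file" | tr -cd '\n' | wc -c)

# The straightforward answer, for comparison.
wc_count=$(wc -l < "$file")

echo "tr: $tr_count, wc -l: $wc_count"
rm -f "$file"
```

Both counts should match, since wc -l also counts newline characters rather than "lines" in any looser sense.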
cat "$file" | nl -ba | tail -1 | cut -f1
uses nl to label all (-ba) of the lines with line numbers. Then tail
selects the last line (-1) and cut selects just the line number (the
first tab-separated field) from that line.
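To see why -ba matters, it may help to try the pipeline on a file with an empty line in it. This is a minimal sketch with a made-up temporary file; it spells the tail option as -n 1 (the POSIX form of -1) and adds a tr -d ' ' step, which is not part of the original pipeline, to strip the padding nl puts in front of the number.

```shell
# A throwaway file whose second line is empty (contents are made up).
file=$(mktemp)
printf 'alpha\n\ngamma\n' > "$file"

# By default nl numbers only non-empty lines, so the last label is too small.
default_count=$(cat "$file" | nl | tail -n 1 | cut -f1 | tr -d ' ')

# With -ba every line gets a number, so the last label is the line count.
ba_count=$(cat "$file" | nl -ba | tail -n 1 | cut -f1 | tr -d ' ')

echo "nl: $default_count, nl -ba: $ba_count"
rm -f "$file"
```

On this file the default numbering stops at 2 while -ba reaches 3, which is what wc -l reports.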
The larger example#
To understand what's going on, it's helpful to chop off the end
of the pipeline and look at the output. Because this command uses
$(...), there's not an obvious end to chop off. The outer command just
feeds the input file in and uses expr to divide the inner result by 2,
so the following computes double the number of lines in $file:
cat "$file" |
od -t x1 |
sed 's/ /\n/g' |
grep '^0a$' |
sed -z 's/\n//g' |
wc -c
Look at the output of each step on a small file. The first step, od,
shows a view similar to a hex editor. The -t x1 flag prints each byte
as a two-digit hexadecimal value, with the bytes separated by spaces.
The output looks like
0000000 54 69 74 6c 65 3a 20 73 68 20 52 75 62 65 20 47
0000020 6f 6c 64 62 65 72 67 73 0a 44 61 74 65 3a 20 32
0000040 30 31 35 2d 30 32 2d 31 35 20 30 32 3a 30 30 0a
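A two-byte input makes the mapping from bytes to hex easy to read. In this small sketch (the input string is made up), the letter A is byte 41 and the newline is byte 0a; the leading column is the offset that od prints on every line.

```shell
# Dump the two bytes "A" and newline as hex, keeping only the first output line.
od_line=$(printf 'A\n' | od -t x1 | head -n 1)

# The line consists of the offset followed by the two bytes: 41 0a.
echo "$od_line"
```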
Next, sed is used to split this into lines at each space.
The grep command selects only those lines which consist of
the hex sequence for a newline (0a). The next sed command
joins all of the lines together, so the result is a line that looks like
0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a
Then wc -c is used to count the number of characters. Since each
newline is two hex digits, this results in counting double the number of
newlines.
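The stages can also be captured one at a time on a tiny file. The sketch below uses a made-up two-line file and stores each intermediate result in a variable, which is a convenience for inspection and not how the original pipeline is written. It assumes GNU sed, since both \n in the replacement text and the -z option are GNU extensions.

```shell
# A throwaway two-line file (contents are made up for illustration).
file=$(mktemp)
printf 'a\nb\n' > "$file"

# One token per line; the od offsets end up as tokens of their own.
tokens=$(od -t x1 "$file" | sed 's/ /\n/g')

# Keep only the tokens that are exactly "0a".
newlines=$(printf '%s\n' "$tokens" | grep '^0a$')

# -z makes sed read the whole input as one record, so \n can be matched and removed.
joined=$(printf '%s\n' "$newlines" | sed -z 's/\n//g')

# Two hex digits per newline, so this is double the line count.
doubled=$(printf '%s' "$joined" | wc -c)

echo "joined: $joined, doubled: $doubled"
rm -f "$file"
```

The offsets are harmless here because od prints them as seven octal digits, so they can never match ^0a$.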
That number is computed inside $(...), an sh feature called command
substitution: the outer command is executed as if the output of the
inner command were pasted in place of the inner command. Here we use
this to divide the result by 2 using expr.
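Command substitution is easier to see with a trivial inner command. In this sketch, echo 40 stands in for the inner pipeline (the number is arbitrary), and the second line shows shell arithmetic expansion as an alternative to expr that the original pipeline does not use.

```shell
# After substitution, expr receives three arguments: 40, / and 2.
half=$(expr $(echo 40) / 2)

# The same division using arithmetic expansion instead of expr.
half_arith=$(( $(echo 40) / 2 ))

echo "expr: $half, arithmetic expansion: $half_arith"
```

Note that expr needs the operator and operands as separate arguments, which is why the spaces around / matter.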
Further efforts#
Got any more convoluted sh Rube Goldbergs? Or other tasks to try to implement in the least efficient way possible? Have fun!