Floats in shell

The problem

Given a file which contains a list of floating point numbers in IEEE 754 single-precision format stored in big endian byte order, how do you view and manipulate this data using command-line tools? This is an actual problem one of my officemates had.

The solution

$ od --endian=big -f file
0000000   1.7155696e-07   1.0432226e-08    4.563314e+30    6.162976e-33

The details

Binary in shell

The shell has a lot of useful utilities centered around processing text, often line-by-line. As previously discussed, shell programs can run into trouble when the text they are dealing with has weird characters in it. And there, "weird" meant characters like -, space, and newline.

So what happens when we want to process binary data? For the most part, the shell is simply the wrong tool for the job, but sometimes it's nice to be able to use the shell for quick tasks anyway.

There are three programs for printing binary data in a text format: hexdump/hd, od, and xxd, all with different options and different output formats.[1]

If you want to edit binary files, xxd's -r option reads text formatted like xxd's own output and converts it back to binary, so you can dump a file, edit the dump, and write the changes back. Alternatively, hexer is an interactive command-line editor for binary files, designed for users familiar with vi (or vim).

Handling floats

The options for od include -f, short for -t fF, which tells od to interpret the file as a sequence of floats. The F selects single precision, as opposed to D for double precision.

For handling byte order, newer versions of od also have an --endian=big option, which is necessary because nearly all modern systems are natively little-endian. The version in Ubuntu's April 2014 release (14.04) does not have that flag, so it may not be available on your system.
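
To see why the byte order matters, here is a small sketch using Python's standard struct module (an illustration only, not something od does internally). It packs 1.0 as a big-endian single-precision float and then decodes the same four bytes both ways. The variable names are made up for this example.

```python
import struct

# The four bytes of 1.0 as a big-endian IEEE 754 single-precision float.
raw = struct.pack('>f', 1.0)
print(raw.hex())  # 3f800000

# Decoding with the matching byte order recovers the original value.
as_big = struct.unpack('>f', raw)[0]

# Decoding the same bytes as little-endian reads them in reverse order
# and yields a tiny subnormal number instead of 1.0.
as_little = struct.unpack('<f', raw)[0]

print(as_big)
print(as_little)
```

This is the same mismatch you would get from running od -f without --endian=big on a little-endian machine: the numbers print without complaint, they are just wrong.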

As a work-around, Python's struct module offers an easy way to deal with binary data. As an example, the following script takes a filename as an argument, interprets its first 4 bytes as a big-endian float, and prints out that float:

#!/usr/bin/env python

import struct
import sys

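# Open in binary mode so the bytes are read unmodified.
with open(sys.argv[1], 'rb') as infile:
    fmt = '>f'  # big-endian single-precision float
    # unpack always returns a tuple, so pull out its single element
    (value,) = struct.unpack(fmt, infile.read(struct.calcsize(fmt)))
    print(value)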
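
Since the original problem was a whole file of floats rather than a single one, the same idea can be extended to decode everything in the file. The following is a minimal sketch, assuming Python 3.4 or later (for struct.iter_unpack). The function name read_big_endian_floats is hypothetical, and silently ignoring trailing bytes that do not make up a full float is my own assumption, not a description of what od does.

```python
import struct


def read_big_endian_floats(path):
    """Return every complete 4-byte big-endian float in the file as a list."""
    fmt = '>f'  # big-endian single-precision float
    size = struct.calcsize(fmt)
    with open(path, 'rb') as infile:
        data = infile.read()
    # Assumption: drop any trailing bytes that do not fill a whole float,
    # because iter_unpack requires a length that is a multiple of the size.
    usable = len(data) - len(data) % size
    # iter_unpack yields one-element tuples, so unpack each into a float.
    return [value for (value,) in struct.iter_unpack(fmt, data[:usable])]
```

Calling read_big_endian_floats on a filename and printing each element of the result, one per line, gives output that is easy to feed into the usual line-oriented shell tools.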

  1. Curious why there are so many tools for a single task, I checked which package each came from using wajig's whichpkg subcommand:

    $ wajig whichpkg $(which od)
    coreutils: /usr/bin/od
    $ wajig whichpkg $(which hexdump)
    bsdmainutils: /usr/bin/hexdump
    $ wajig whichpkg $(which xxd)
    vim-common: /usr/bin/xxd

    So it appears that the GNU and BSD projects each have their own, and xxd is part of the Vim text editor. 


