The problem#
I mostly use Gnumeric for my personal spreadsheets, which
are generally quite simple. But recently I wanted do some analysis
that required writing a formula more complicated than just using the
SUM()
function and found Gnumeric's function documentation
quite lacking in sufficiently detailed explanations and examples to be
able to figure out how to use any of the more complicated functions.
The specific problem I was trying to solve was that I had some data on payments over time labeled with categories and wanted to summarize payments by category for each time period.
The solution#
The short version is that Gnumeric's formula language is nearly the same as Microsoft Excel's formula language, so there's no need to look for Gnumeric-specific help on writing formulas. There's plenty of advice online on writing formulas for Excel, and the information transfers without modification to writing formulas for Gnumeric.
For my specific problem, my layout was that each column was a time
period and each row was a payee with column A
being the name of the
payee and column B
being the category. I made entries further down in
column A
of the categories and wanted the cells in those rows to be
the sum of the payments for the corresponding column's time period but
only for the payees matching the category for the row. The function to
do that is called SUMIF()
, but that documentation
page from Gnumeric is very unhelpful except for the line
This function is Excel compatible.
The Excel help page on SUMIF()
has examples and a lot
more explanation. Here's what the formula looks like in cell P50
:
=SUMIF($B$2:$B$40,$A50,P$2:P$40)
where B2:B40
is the range where the categories are given for each
payee (named in A2:A40
), A50
contains the category being summed, and
P2:P40
contains the actual values being conditionally summed (i.e.,
the values for the time period named in P1
).
The details#
Excel compatibility#
The main observation is that for most spreadsheet functions, most spreadsheet programs, including Gnumeric, act the same as Microsoft Excel. Since it's the standard spreadsheet program, all of the others aim for compatibility which includes copying the core function set. Gnumeric claims to support
all of the functions from the North American edition of Microsoft Excel 2000 (TM).
The same applies to LibreOffice Calc, although its documentation is much more detailed than Gnumeric's, more like Excel's. While LibreOffice Calc is significantly more popular then Gnumeric (after all, Gnumeric only runs on Linux and similar operating systems) so LibreOffice-specific help is likely easier to find, in general when trying to get help designing a spreadsheet formula, it may be worth looking for Excel-specific help even when not using Excel.
Formula details#
The formula is a little unintuitive. The main issue is that when
thinking of SUMIF()
as a modified SUM()
, the range
being summed is both the source of the values and the cells being
conditioned on. But for this usage, the cells being conditioned on
are actually the categories in column B
and there's no way in
the condition to refer to a different cell. In a more expressive
programming language, you might expect to be able to write a function
that gets passed the value and index of each cell, but the "criteria"
in spreadsheet formulas are much more limited than that. So, instead,
SUMIF()
lets you optionally specify two separate ranges: the first
range is the cell to condition on, and the second parallel range is the
cells with the values to actually sum if the condition is true on the
corresponding cell. The middle argument is the condition, for which a
cell reference means to check for string equality between that cell and
the cell being checked.
Making it draggable#
That means, for every sum, the condition will be on the category names
in column B
, so that range B2:B40
in the first argument will be
identical for every sum. Adding $
before a column/row referent
in a formula makes it not change when dragging the formula, so that's
written $B$2:$B$40
. The data does change depending on which column
the sum is in, so we leave off the $
before the P
since that's the
column being summed, but keep it on the rows because which row the sum
is just changes which category is being summed, not where the data is
coming from: P$2:P$40
. And that last detail is why the middle argument
looks like $A50
: the category to sum is in the first column so the A
doesn't change when dragging, but which category does so there's no $
before the 50
. To repeat the above, all together that gives us the
formula
=SUMIF($B$2:$B$40,$A50,P$2:P$40)
Pivot tables#
The observant reader may have noticed the problem I wanted to solve is already solved by pivot tables. Or you may just be correctly assuming that most simple summarization tasks can be expressed using pivot tables. And I have used a pivot table to do a very similar task in a different spreadsheet.
Mainly, I didn't use a pivot table here because I wanted the summary to be displayed alongside the data.
Comments
Have something to add? Post a comment by sending an email to comments@aweirdimagination.net. You may use Markdown for formatting.
There are no comments yet.