incubator-ooo-dev mailing list archives

From Rob Weir <>
Subject Re: [From] Parallel functions in OpenOffice spreadsheet
Date Fri, 19 Aug 2011 15:56:55 GMT
> ---------------------------------------
>    From: dennis <>
>    To:
>    Subject: [dev] Parallel functions in OpenOffice spreadsheet
>    Date: Fri, 12 Aug 2011 21:23:42 +0300
> Hello,
> My name is Dennis Groisman and I am a student at Ben Gurion University
> in Be'er Sheva, Israel.

Hi Dennis -- As you've probably noticed, I've forwarded your note to
the ooo-dev list at Apache.  We'll try to keep you copied on this
thread.  You are also welcome to subscribe to the list by sending an
email to :

> I'm studying electronics and computer engineering. Me and my
> colleague, Tal Benach, are starting a project that involves speeding
> up spreadsheet functions with long computation time by parallel run on
> more than one core or even on a computing cloud/grid.
> We would be very glad if you could assist us by answering some of our questions:
> * Do you have any information about projects similar to ours?

I'm not aware of similar work.  But there may be three reasonable approaches:

1) Parallelize the pre-defined spreadsheet functions, e.g. SUM(),
AVERAGE(), etc.

2) Parallelize the calculation of the formulas in individual cells,
e.g. determine a topological sort of the cell dependencies and then
dispatch calculation of the un-calculated leaf cells to multiple

3) Decompose a spreadsheet into multiple graphs of cells that are part
of the same calculation chain, and then calculate the independent
graphs in parallel.
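
To make approach 2 concrete, here is a minimal sketch in Python (chosen for brevity; OOo itself is C++). The sheet layout, the cell names and `recalculate` are all made up for illustration, not taken from the Calc code. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to hand out cells whose inputs are already calculated, and a thread pool to evaluate each such batch.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical sheet: cell name -> (cells it depends on, formula over their values).
SHEET = {
    "A1": ([], lambda: 2),
    "A2": ([], lambda: 3),
    "B1": (["A1", "A2"], lambda a1, a2: a1 + a2),
    "B2": (["A1"], lambda a1: a1 * 10),
    "C1": (["B1", "B2"], lambda b1, b2: b1 * b2),
}


def recalculate(sheet, workers=4):
    """Evaluate every cell, running mutually independent cells concurrently."""
    sorter = TopologicalSorter({name: deps for name, (deps, _) in sheet.items()})
    sorter.prepare()  # raises graphlib.CycleError on a circular reference
    values = {}

    def evaluate(name):
        deps, formula = sheet[name]
        # Every dependency was finished in an earlier batch, so reads are safe.
        return name, formula(*(values[d] for d in deps))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()  # cells whose inputs are all calculated
            for name, value in pool.map(evaluate, ready):
                values[name] = value
                sorter.done(name)  # may unlock cells for the next batch
    return values


values = recalculate(SHEET)
```

This only illustrates the scheduling. In CPython, threads will not speed up CPU-bound formulas, and a real implementation would also have to weigh the dispatch overhead against the cost of each cell.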

> * Where can we get a list of functions implemented for OpenOffice?

You can find a list here:

> * Do you know about "slow" functions that need a speed-up in
> spreadsheet OpenOffice?

I don't know of any such analysis.  But you can see the wiki page on
the Calc performance work here:

Generally, a spreadsheet is intended to be interactive from an
end-user's perspective.  So any calculation that has a delay of more
than a second or two is annoying.  The hard part is to maintain that
responsiveness as users scale from hundreds of rows of data to
hundreds of thousands of rows.

So, there are a few ways of looking at this:

1) Look at the most used functions, like SUM, AVERAGE, IF, etc.
These will be frequent in large spreadsheets as well, so they will
have a large influence on overall calc time in those cases.

2) Look at the functions that are more specialized, but which lend
themselves to large speedups from parallel execution, e.g., SUMSQ:

3) Find a large spreadsheet file that is particularly slow, and optimize that.
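
As a rough illustration of why something like SUMSQ parallelizes well: the range can be cut into chunks, each chunk reduced on its own, and the partial results added at the end. The sketch below is in Python for brevity. `sumsq_chunk` and `parallel_sumsq` are hypothetical names, not OOo functions, and the pool can be any `concurrent.futures.Executor`.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def sumsq_chunk(chunk):
    """Sum of squares over one slice of the range."""
    return sum(x * x for x in chunk)


def parallel_sumsq(values, pool: Executor, chunks=4):
    """Split values into at most `chunks` slices and reduce them on the pool."""
    size = max(1, -(-len(values) // chunks))  # ceiling division, at least 1
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    # Partial sums are independent, so the order they finish in does not matter.
    return sum(pool.map(sumsq_chunk, parts))


with ThreadPoolExecutor(max_workers=4) as pool:
    total = parallel_sumsq(list(range(1, 101)), pool)  # 1^2 + ... + 100^2
```

A thread pool is used here only to keep the example self-contained. In CPython a real speedup on CPU-bound work would need a process pool or native code, and for floating-point data the chunked sum can differ from the sequential one in the last bits because the additions happen in a different order.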

The formulas that you might think are intrinsically expensive to
calculate, like the Bessel functions, etc., are rare in spreadsheets.
And they don't operate on large ranges of cells.  So my guess is that
the most common functions, used in large spreadsheets, would benefit
most from taking advantage of multi-core.

The grid/cloud opportunity is less clear.  In most cases, calculations
are interactive, 1-2 seconds tops.  There are some less-common cases,
such as doing linear or non-linear programming and other constrained
optimization problems, typically done via add-ins.  These can take
several minutes or even hours to run on large models. These might be
good candidates for cloud/grid.  But this is not the typical case for
the core calculation code.

> * Which OS is better when working in OpenOffice? Windows or Linux?

It is easier to build OOo on Linux or Mac than on Windows.

> * Where can we get the most up-to-date version of a spreadsheet source code?

OOo is stored in Mercurial here:

You might also look at these instructions, which allow you to add a
new spreadsheet function to OOo.  This might be good for

> We both thank you for your assistance,
> Dennis Groisman and Tal Benach
