commons-dev mailing list archives

From Luc Maisonobe <Luc.Maison...@free.fr>
Subject [all][nabla] proposition for a new project in sandbox
Date Sun, 13 Apr 2008 19:22:56 GMT
Hello,

I have been playing with an idea for a new project for a few months. I
asked for advice both at ApacheCon Europe and through direct contact;
all the responses I received were quite positive and suggested that I
set up a component in the sandbox. This message is the first public
announcement and is intended to collect the opinion of the whole commons
community about this project. In short: can I play in the sandbox with
this, or should I find another place for it? Another possibility would
be to put it inside [math], but that would be really strange.

The project already has a name: Nabla, which is an operator used in
mathematics and physics for differentiation. Its symbol is a simple
triangle pointing downwards (see
http://mathworld.wolfram.com/Nabla.html). Let's call the component I
want to develop [nabla] from now on, to match our local habits here.
There is some code for it, but it was developed only by myself, in my
spare time, on my personal computer, and was never distributed to
anyone. So I can consider that I developed it under the Apache umbrella
and put it in the sandbox with the Apache headers and license. I am
already a commons committer and have filed an Individual Contributor
License Agreement with Apache.

[nabla] will be a mathematics/physics library aimed at building the
symbolic derivative of any function provided as compiled bytecode.

Here is a typical use case for such a library. For some simulation 
purposes, suppose I use a class with a method computing the consumption 
of performing an action as a function of its start time:

public class DifficultComputation {
   public double f(double t) {
     // some lengthy equations here
   }
}

Now, in addition to computing the consumption itself, I want to be
able to compute the sensitivity of this consumption to changes in the
start time. This would allow me to say: if the action is started at
t = 10 seconds, then consumption will be 1.2 kilograms, and this
consumption will increase by 10 grams for each second I delay the start.
The value of 10 grams per second of delay is computed by differentiating
the original equation. There are several ways to do that.

The first way relies on mathematical transformations of the equations
implemented in the function f. It implies mathematical analysis and
new development, which is very error-prone (computing the differential
of a function is much more complex than computing the function itself).
It is only feasible if you know the equations or have the source code of
the function. This approach may be used with symbolic computation
packages like Mathematica or Axiom, where you develop your equations
using these programs and have them generate the implementation for you.
However, the produced code is available only for some languages
(typically Fortran and C), it is awful and cannot be maintained (it is
not intended to be), and it needs to be integrated with the rest of the
application, which is already a difficult task.

The second way is to use numerical finite-difference schemes. These
algorithms basically compute several values by changing the start time
by a small known amount and looking at the various results. This implies
choosing the step, which may be difficult if you don't already know
the behavior of the function (should I use one microsecond or one
century here? It depends on the problem). This approach is also either
quite computation-intensive if you use high-order schemes with 4, 6 or 8
points, or inaccurate if you don't. It is also impossible to use too
close to the function's domain boundaries, which are often the locations
we really want to explore.
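
To make the step-size and boundary problems concrete, here is a minimal
sketch of a two-point centered difference. The class name, the method
name and the test functions (exp(t) and t^1.5) are assumptions chosen
only for illustration; they are not part of [nabla] or of any existing
library.

```java
import java.util.function.DoubleUnaryOperator;

final class FiniteDifferenceDemo {

    // Two-point centered difference: (f(t + h) - f(t - h)) / (2h).
    static double centered(DoubleUnaryOperator f, double t, double h) {
        return (f.applyAsDouble(t + h) - f.applyAsDouble(t - h)) / (2.0 * h);
    }

    public static void main(String[] args) {
        // Stand-in function (an assumption): f(t) = exp(t), so f'(1) = e.
        DoubleUnaryOperator f = Math::exp;
        double exact = Math.E;

        // Too large a step suffers from truncation error, too small a step
        // from floating-point cancellation; only the middle one is accurate.
        for (double h : new double[] { 1.0, 1.0e-5, 1.0e-15 }) {
            double error = Math.abs(centered(f, 1.0, h) - exact);
            System.out.println("h = " + h + ", error = " + error);
        }

        // A function defined only for t >= 0: the point t - h leaves the
        // domain, so the estimate at the boundary t = 0 is NaN.
        DoubleUnaryOperator g = t -> Math.pow(t, 1.5);
        System.out.println("at boundary: " + centered(g, 0.0, 1.0e-5));
    }
}
```

The right step here happens to be around 1.0e-5, but that value depends
entirely on the function and on the units of t, which is exactly the
tuning problem described above.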

[nabla] provides a third way to get this result. It analyses the
bytecode of the function at run time, performs the exact symbolic
mathematical transforms, and generates a new class implementing the
differentiated function. There is still a computation cost, but it is
the same as you would get from manually differentiated code, plus a
one-time bytecode differentiation overhead (and the results can be
cached).
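
As a rough illustration of what "manually differentiated code" looks
like, here is a hand-written sketch for a small stand-in function. It
is not output produced by [nabla] and does not show its API: the class
name, the method names and the body f(t) = t * sin(t) are all
assumptions. Each statement of the original function is paired with
the exact derivative of that statement, which is the kind of
transformation that would otherwise have to be written and maintained
by hand.

```java
final class DifferentiationSketch {

    // Original function, standing in for DifficultComputation.f
    // (the body t * sin(t) is an assumption for this example).
    static double f(double t) {
        double s = Math.sin(t);
        return t * s;
    }

    // Hand-written counterpart returning { f(t), f'(t) }.
    // Every intermediate value carries its derivative with respect to t.
    static double[] fWithDerivative(double t) {
        double dt = 1.0;               // derivative of t with respect to itself
        double s = Math.sin(t);
        double ds = Math.cos(t) * dt;  // chain rule applied to sin(t)
        double r = t * s;
        double dr = dt * s + t * ds;   // product rule applied to t * s
        return new double[] { r, dr };
    }

    public static void main(String[] args) {
        double[] result = fWithDerivative(10.0);
        System.out.println("f(10)  = " + result[0]);
        System.out.println("f'(10) = " + result[1]);
    }
}
```

No step size appears anywhere, and the derivative is evaluated at
exactly the point requested, never at a neighbouring one.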

This approach has the following benefits:
  - the derivative is exact
  - there is no problem-dependent step size to handle
  - the derivative can be computed even at domain boundaries
  - there is no special handling of source
    (no symbolic package with its own language, no source code
     generation, no integration with the rest of the application)
  - one writes and maintains only the basic equation and gets the
    derivative for free
  - it is effective even when source code is not available (but there
    are licensing issues in this case of course, since what I do
    automatically is really ... derived work)

The only drawback I see is that functions calling native code cannot be 
handled. In this case, we have a fallback available with 
finite-differences schemes.

The existing implementation is not yet ready for production. A lot of
work has been done, but many features are still missing. [nabla] can
handle simple functions from end to end (i.e. up to creating a fully
functional instance of the differentiated class). Making this code
available in the sandbox would let people look at it, comment on it,
participate if they are interested, and help make it go live.

What do you think about it?
Luc

