commons-dev mailing list archives

From "Phil Steitz" <p...@steitz.com>
Subject Re: [math] log representation of sums was:Re: [math] Priorities, help needed
Date Sat, 24 May 2003 05:13:10 GMT
Brent Worden wrote:
>>One more thing.  Before deciding to change implementation, it would be
>>nice to run some benchmarks (or get some definitive references) to see
>>what the performance difference will be. I suspect that the sum of logs
>>approach may actually be slower, but I have no idea by how much.
>>
>>Phil
> 
> 
> Agreed.  I would like to add that I think we're a little overly concerned
> about the actual implementation of the algorithm.  In these early stages of
> the project, I think it's wiser to spend time discussing the evolving design
> and API.  In the end, that is how people will judge the value of this
> project.  People will care far less about how rock-solid the geometric mean
> algorithm is than about how many features it provides and how easy it is
> to use.

I could not agree more.  I have been using (and sharing) the original, 
no-storage, no-rolling version of Univariate for a couple of years now 
and have found it to be simple, lightweight and easy to use.  That is 
why I contributed it.  The only thing that I think we really need to 
worry about as we get the initial release together is that we carefully 
document the interfaces and the contracts -- otherwise the stuff will 
not be usable -- and maintain implementation quality.  We should try to 
avoid stupid things and really bad numerical algorithms, but I agree 
that our focus should be on getting basic, easy to use, frequently 
demanded functionality into the package.  Regarding Univariate in 
particular, my feeling is that the most important things to get in there 
are percentiles and confidence intervals.  These are what people 
actually use (beyond the arithmetic mean and variance).
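
For concreteness, here is a minimal sketch of what a no-storage, no-rolling univariate accumulator might look like. This is not the actual Univariate code: the class name RunningUnivariate and its method names are hypothetical, Welford's update is one common choice for the running variance, and the geometric mean is kept as a running sum of logs, as discussed earlier in this thread.

```java
/**
 * Hypothetical no-storage univariate accumulator (illustration only,
 * not the commons-math Univariate class). Values are folded into a few
 * running totals and never stored.
 */
final class RunningUnivariate {
    private long n = 0;          // number of values seen so far
    private double mean = 0.0;   // running arithmetic mean
    private double m2 = 0.0;     // running sum of squared deviations from the mean
    private double sumLog = 0.0; // running sum of natural logs

    /** Adds one value, updating all running totals in a single pass. */
    void addValue(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);  // Welford's update
        sumLog += Math.log(x);     // NaN for x < 0, -Infinity for x == 0
    }

    long getN() {
        return n;
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Sample variance (divides by n - 1); NaN with fewer than two values. */
    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }

    /** Geometric mean computed as exp(mean of logs); NaN if no values. */
    double getGeometricMean() {
        return n == 0 ? Double.NaN : Math.exp(sumLog / n);
    }
}
```

The sum-of-logs form avoids overflow or underflow in a long running product, at the cost of one Math.log call per value, which is the performance question raised above.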

Have you looked at the task list here:
http://jakarta.apache.org/commons/sandbox/math/tasks.html?

Do you have a) comments on these or alternative suggestions, b) code to 
contribute, or c) time to spend helping with implementation?

I am completing testing of a simple "one pass" bivariate regression 
implementation -- another lightweight thingy that I have found very 
useful as it has followed me around (through 5 languages) over the 
years. I was planning to circle back to the RealMatrix implementation 
next, but if you want to take a stab at that or anything else, please do.
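
To illustrate what "one pass" could mean here, the following is a rough sketch of a simple least-squares fit that keeps only running totals and never stores the data. It is not the implementation under test: the class name OnePassRegression and its methods are hypothetical, and the centered updating formulas are an assumption about one reasonable way to do it.

```java
/**
 * Hypothetical one-pass bivariate regression (illustration only).
 * Fits y = intercept + slope * x by least squares from running totals.
 */
final class OnePassRegression {
    private long n = 0;         // number of (x, y) pairs seen
    private double meanX = 0.0; // running mean of x
    private double meanY = 0.0; // running mean of y
    private double sxx = 0.0;   // running sum of (x - meanX)^2
    private double sxy = 0.0;   // running sum of (x - meanX) * (y - meanY)

    /** Adds one observation and updates the running totals. */
    void addData(double x, double y) {
        n++;
        double dx = x - meanX;   // deviation from the previous mean of x
        double dy = y - meanY;   // deviation from the previous mean of y
        meanX += dx / n;
        meanY += dy / n;
        sxx += dx * (x - meanX); // old deviation times new deviation
        sxy += dx * (y - meanY);
    }

    long getN() {
        return n;
    }

    /** Least-squares slope; NaN with fewer than two points or when all x are equal. */
    double getSlope() {
        return (n < 2 || sxx == 0.0) ? Double.NaN : sxy / sxx;
    }

    /** Least-squares intercept; NaN whenever the slope is NaN. */
    double getIntercept() {
        return meanY - getSlope() * meanX;
    }
}
```

Feeding it points that lie exactly on y = 1 + 2x, for example, should give back a slope of 2 and an intercept of 1, up to rounding.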

Obviously, any additional feedback that you have on what is already out 
there would be appreciated.


Phil

> 
> These discussions will eventually need to take place, but I don't think now
> is the time.  The geometric mean works now for almost all valid data sets and
> works well enough in terms of accuracy.  The only reason now to change it is
> if the Univariate implementation design changes, requiring rework of all the
> statistics.
> 
> Brent Worden
> http://www.brent.worden.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
> 





