commons-dev mailing list archives

From "Mark R. Diggory"
Subject Re: [math] statistics performance boost
Date Thu, 13 May 2004 12:00:41 GMT
Most definitely post any patches. It would be best to post them to our 
Bugzilla so that they can be properly tracked.

Ken Geis wrote:
> As I explained, I am using commons-math to enable data mining algorithms 
> I am writing.  I am using a lot of SummaryStatistics and TTest.  Through 
> some profiling, I was able to find places to optimize code and I ended 
> up getting a 15x performance boost within my application.  This was from 
> three changes:
> 1. Add clone() to SummaryStatisticsImpl.  This implies adding clone() to 
> SecondMoment, Sum, SumOfSquares, Min, Max, SumOfLogs, GeometricMean, 
> Mean, and Variance.  To Mark, I think that the behavior of clone() is 
> well implied by the Javadoc for java.lang.Object.  I was surprised that 
> I obviously had not read that before yesterday.  To Phil, your suggested 
> getSummary() method/bean would indeed solve my problem and give me even 
> better performance.  (clone() was ~20x faster than the 
> serialize/deserialize hack I was using.  This probably accounts for 2x 
> of my overall 15x.)

I think we should work on improvements to both clone() and getSummary().
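
As a rough illustration of why adding clone() to a summary class implies 
adding it to each statistic it holds, here is a minimal sketch. The class 
names (SumStat, MaxStat, SummarySketch) are hypothetical stand-ins, not the 
commons-math classes, and the code uses modern Java syntax (covariant 
return types, @Override) rather than what was available in 2004.

```java
// Hypothetical storeless statistic: a running sum.
final class SumStat implements Cloneable {
    private double value;

    void increment(double x) { value += x; }
    double getResult() { return value; }

    @Override
    public SumStat clone() {
        try {
            // Only primitive state, so Object.clone()'s field copy is enough.
            return (SumStat) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: we implement Cloneable
        }
    }
}

// Hypothetical storeless statistic: a running maximum.
final class MaxStat implements Cloneable {
    private double value = Double.NEGATIVE_INFINITY;

    void increment(double x) { value = Math.max(value, x); }
    double getResult() { return value; }

    @Override
    public MaxStat clone() {
        try {
            return (MaxStat) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}

// Hypothetical aggregate that delegates to the statistics above.
final class SummarySketch implements Cloneable {
    private long n;
    private SumStat sum = new SumStat();
    private MaxStat max = new MaxStat();

    void addValue(double x) {
        n++;
        sum.increment(x);
        max.increment(x);
    }

    long getN() { return n; }
    double getSum() { return sum.getResult(); }
    double getMax() { return max.getResult(); }

    @Override
    public SummarySketch clone() {
        try {
            SummarySketch copy = (SummarySketch) super.clone();
            // super.clone() copies references only; each mutable statistic
            // must be cloned too, or the copy would share state with the original.
            copy.sum = sum.clone();
            copy.max = max.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

With this shape, a caller can snapshot the accumulated state, keep adding 
values to the copy, and leave the original untouched, without a 
serialize/deserialize round trip.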

> 2. Change TTestImpl; the commons-discovery DiscoverClass.newInstance() 
> was being called for every call to tTest.  This is not a cheap method. 
> After #1, this method was taking up something like 17% of the runtime of 
> my synthetic benchmark.  I created a method to lazily get the 
> DistributionFactory and store it (transient) as a class attribute.
> 3. Make ContinuedFraction.evaluate(...) iterative instead of recursive. 
>  This gave me a 125% (2.25x) improvement in performance of this method. 
>  I think I can optimize it further, hopefully not at the cost of 
> readability.
> Patches available on request.  Should I just start posting them when I 
> have patches like this?
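
The lazy lookup described in change 2 might look roughly like the sketch 
below. SlowFactory and TTestSketch are hypothetical names; SlowFactory 
merely stands in for an expensive factory lookup and does not use the 
commons-discovery API. The compute() method is a placeholder, not a real 
t-test.

```java
import java.io.Serializable;

// Hypothetical stand-in for a factory that is costly to discover.
final class SlowFactory {
    static int lookups = 0; // counts how often the expensive lookup runs

    static SlowFactory discover() {
        lookups++;
        return new SlowFactory();
    }

    // Placeholder for whatever the factory's product would compute.
    double apply(double x) { return x; }
}

final class TTestSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: the cached factory is not serialized and is simply
    // looked up again on first use after deserialization.
    private transient SlowFactory factory;

    // Lazily obtain the factory once and reuse it on later calls.
    // Assumption: single-threaded use; this is not synchronized.
    private SlowFactory getFactory() {
        if (factory == null) {
            factory = SlowFactory.discover();
        }
        return factory;
    }

    double compute(double x) {
        return getFactory().apply(x);
    }
}
```

Repeated calls to compute() on one instance trigger a single lookup. If 
instances are shared across threads, the getter would need synchronization 
or a similar safeguard.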

All of your efforts are greatly appreciated. We will gladly acknowledge 
you as a contributor in the project documentation.
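
For readers curious what an iterative continued fraction evaluation (change 
3 above) can look like, here is a sketch using the modified Lentz 
algorithm. This is one well-known iterative technique and not necessarily 
the approach in Ken's patch or in commons-math. The class name, the 
coefficient callbacks, and the parameters are all assumptions made for 
illustration.

```java
import java.util.function.IntToDoubleFunction;

final class ContinuedFractionSketch {
    // Small value substituted for exact zeros to avoid division by zero.
    private static final double TINY = 1e-300;

    /**
     * Evaluates b(0) + a(1)/(b(1) + a(2)/(b(2) + ...)) with a loop
     * instead of recursion (modified Lentz algorithm).
     */
    static double evaluate(IntToDoubleFunction a, IntToDoubleFunction b,
                           double epsilon, int maxIterations) {
        double f = b.applyAsDouble(0);
        if (f == 0.0) {
            f = TINY;
        }
        double c = f;
        double d = 0.0;
        for (int n = 1; n <= maxIterations; n++) {
            double an = a.applyAsDouble(n);
            double bn = b.applyAsDouble(n);
            d = bn + an * d;
            if (d == 0.0) {
                d = TINY;
            }
            c = bn + an / c;
            if (c == 0.0) {
                c = TINY;
            }
            d = 1.0 / d;
            double delta = c * d;
            f *= delta;
            // Stop once the multiplicative correction is close enough to 1.
            if (Math.abs(delta - 1.0) < epsilon) {
                return f;
            }
        }
        throw new ArithmeticException("continued fraction did not converge");
    }
}
```

For example, with every a(n) and b(n) equal to 1 the fraction converges to 
the golden ratio, so 
ContinuedFractionSketch.evaluate(n -> 1.0, n -> 1.0, 1e-12, 100) should 
agree with (1 + sqrt(5)) / 2 to within rounding error. The loop uses 
constant stack space, which a recursive formulation does not.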

Mark Diggory
Software Developer
Harvard MIT Data Center
