commons-dev mailing list archives

From "Mark R. Diggory" <>
Subject Re: [math] Apples or Oranges
Date Mon, 16 Jun 2003 19:40:00 GMT
Tim O'Brien wrote:

>On Mon, 16 Jun 2003, Mark R. Diggory wrote:
>>(1) Bean etiquette suggests "getters" are for bean properties; it's 
>>usually recommended that they do nothing more than 
>>return the value for a property. This is beneficial in our Univariate 
>>case when calling a getter many times without adding a new value (let's 
>>say you use "getKurtosis" a lot in a calculation before adding another 
>>value): then it's more logical to have the kurtosis calculated only once 
>>and put the code for calculating it in the addValue method.
>These objects are not JavaBeans, but using getXXX naming standards does
>provide some benefits (say create a Univariate instance and reference it
>from EL, Velocity, etc...).  I don't see any problems violating the 
>standard for bean properties as these are not really "properties".
Yes, just as long as we all agree that these are not really Java Beans, 
then I'm ok with it too.

>>(2) However, if calling addValue many times (more likely the case) with 
>>only the interest of getting the "getMean" back, it's wasted 
>>computational time to calculate all the other Stats (like kurtosis) in 
>>addValue when you just want the results of "getMean" back after each 
>It is important to remember that in some of the stored univariate 
>instances the storage medium is external to the Univariate instance.  In 
>those cases, I don't see us being able to consolidate any of our 
>calculations in addValue().  In other words, ListUnivariateImpl is simply 
>attached to an external List - a user can go ahead and add 100 values to 
>that list without ListUnivariateImpl's involvement.
I'm talking strictly about UnivariateImpl at this time; I'm not quite 
ready to delve into the Storage Implementations. I understand and value 
the benefit of what you're pointing out. Storage-based Univariate 
implementations have different requirements than "UnivariateImpl" from 
this standpoint. But I do think some aspects of what Andreou is pointing 
out could optimize those implementations in the future too. It could be 
possible to establish a sort of "concurrentModification" style check in 
addValue such that if the underlying List or Array was modified, it 
could be detected by the Univariate implementation and such a 
"caching" mechanism could be updated (I'm not sure though, this may not 
be something to explore before reaching release).
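
A rough sketch of the detection idea, under stated assumptions: the class
name SizeCheckedListMean is hypothetical and is not ListUnivariateImpl,
and the only signal used is the size of the external list. An in-place
replacement of an element would go unnoticed, so a real implementation
would need something stronger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, not part of commons-math: caches the mean of an
// externally owned List and recomputes only when the list size changed.
class SizeCheckedListMean {
    private final List<Double> values;   // owned and modified by the caller
    private int sizeAtLastCompute = -1;  // -1 means "nothing cached yet"
    private double cachedMean = Double.NaN;
    int recomputeCount = 0;              // exposed only to observe the caching

    SizeCheckedListMean(List<Double> values) {
        this.values = values;
    }

    double getMean() {
        // Assumption: a size change is the only signal available here.
        // List.set(...) keeps the size the same and is NOT detected.
        if (values.size() != sizeAtLastCompute) {
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            cachedMean = values.isEmpty() ? Double.NaN : sum / values.size();
            sizeAtLastCompute = values.size();
            recomputeCount++;
        }
        return cachedMean;
    }
}
```

Calling getMean() twice in a row does the summation once; adding a value
to the external list between calls triggers a recomputation.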

Andreou Andreas wrote:

> Mark, I would go for the latter approach (the one on the p.s.) cause 
> it doesn't seem that complex to me...
> Why not add a CachableUnivariateImpl class
> that extends UnivariateImpl
> and also keeps track in a cache the results of the getters (getMean, 
> getKurtosis, etc.).
> In this way, whenever a new value is added, the cache will be cleared, 
> and on calling the getters, each corresponding statistic will be
> recalculated.
> If no new values have been added, this new subclass will just return 
> the cached results... 
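
The proposal above might look roughly like the following. This is only a
sketch: BasicStats is a hypothetical stand-in for UnivariateImpl (whose
real API may differ), CachableStats is an invented name, and only the
mean is cached to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for UnivariateImpl; the real class may differ.
class BasicStats {
    protected final List<Double> values = new ArrayList<>();

    void addValue(double v) {
        values.add(v);
    }

    // Recomputes from scratch on every call.
    double getMean() {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}

// Sketch of the caching subclass: clear on add, fill on first read.
class CachableStats extends BasicStats {
    private Double cachedMean = null; // null means "not computed since last add"
    int computeCount = 0;             // exposed only to observe the caching

    @Override
    void addValue(double v) {
        super.addValue(v);
        cachedMean = null; // any new value invalidates the cache
    }

    @Override
    double getMean() {
        if (cachedMean == null) {
            cachedMean = super.getMean();
            computeCount++;
        }
        return cachedMean;
    }
}
```

With this shape, addValue stays cheap and each statistic is computed at
most once between additions, at the cost of one cache slot per getter.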

Yes, I think this is a novel idea to explore in the future. It's 
difficult to draw the lines on what to store in it because at this time 
we are calculating the mean/variance in addValue with Al's new 
2-pass algorithm, while the more complex kurt and skew calculations are 
in the getter methods. But I like the idea of it. I'm working on 2-pass 
style algorithms for skew and kurt now, which may unfortunately require 
more calculation to occur in addValue than I want to see happening.
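
To make the trade-off concrete, here is a sketch of what "mean/variance
in addValue" can look like. It uses Welford's running update, which is
an assumption on my part and not necessarily the algorithm referred to
above; the class name RunningMoments is hypothetical. Every addValue
call pays a few arithmetic operations so that the getters only return
stored state.

```java
// Hypothetical class: keeps mean and variance up to date on every add,
// using Welford's update (not necessarily what commons-math uses).
class RunningMoments {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of squared deviations from the current mean

    void addValue(double x) {
        n++;
        double delta = x - mean;   // deviation from the old mean
        mean += delta / n;         // updated mean
        m2 += delta * (x - mean);  // uses old and new deviation
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    // Sample variance (divides by n - 1); undefined for fewer than 2 values.
    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }
}
```

Extending this to skew and kurtosis means carrying higher-order sums in
addValue as well, which is the extra per-add cost in question.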

