commons-dev mailing list archives

From "Mark R. Diggory" <mdigg...@latte.harvard.edu>
Subject Re: [math] UnivariateImpl statistical computation strategies
Date Tue, 17 Jun 2003 14:15:48 GMT
Phil Steitz wrote:
> --- "Mark R. Diggory" <mdiggory@latte.harvard.edu> wrote:
>>(1) Bean etiquette suggests "getters" are for bean properties; it's 
>>usually recommended that they do nothing more than 
>>return the value for a property. 
> 
> 
> This is certainly not specified anywhere in the Javabeans spec.  In fact, the
> spec explicitly states (sect 7.1) "So properties need not just be simple data
> fields, they can actually be computed values. Updates
> may have various programmatic side effects."  If the "etiquette" above were in
> fact standard, entity EJBs, for example, would be impossible.  The power of the
> javabeans specification is that it is an interface specification, not an
> implementation specification.  Beans can and should manage their internal state
> and the mapping between their internals and their publicly exposed properties
> in the most convenient and efficient way possible.  
> 
>>This is beneficial in our Univariate 
> 

I'm sorry, I'm not really talking about the spec, just a general trend 
in the design of Java Beans that I've observed and been "trained" to 
follow. So if it even goes against the spec, I suspect I should change my 
viewpoint.


>>case when calling a getter many times without adding a new value (let's 
>>say you use "getKurtosis" a lot in a calculation before adding another 
>>value); then it's more logical to have the kurtosis calculated only once 
>>and put the code for calculating it in the addValue method.
>>
> 
> Huh?  Kurtosis is only defined for the versions that store all values.  If and
> when we implement the corrected two-pass formulas, these may benefit from some
> running sum computations; but for now, all computations should be performed on
> demand, using the vector of stored values.  There is no reason to keep updating
> as the values are added for the stored case.
> 

You should really review UnivariateImpl; I implemented memory-free 
versions of Kurtosis and Skew quite some time ago. Now I'm working on 
improving their accuracy by applying an approach similar to West's 
algorithm to them. These are just "moments"; there is no reason that they 
can't benefit from the same approach as variance.
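
For illustration only, here is a minimal sketch of how the third and
fourth central moments can be updated one value at a time without
storing the data, in the same spirit as the updating approach used for
variance. It is not the UnivariateImpl code: the class name
RunningMomentsSketch and its methods are hypothetical, the update is the
standard one-pass central-moment recurrence rather than anything
confirmed in this thread, and the skewness and kurtosis returned are the
simple population (biased) forms, which may differ from the
normalization used in [math].

```java
// Hypothetical sketch; NOT the actual UnivariateImpl implementation.
// Keeps running central-moment sums so no values need to be stored.
final class RunningMomentsSketch {
    private long n = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // sum of (x - mean)^2
    private double m3 = 0.0; // sum of (x - mean)^3
    private double m4 = 0.0; // sum of (x - mean)^4

    void addValue(double x) {
        long n1 = n;
        n++;
        double delta = x - mean;
        double deltaN = delta / n;
        double deltaN2 = deltaN * deltaN;
        double term1 = delta * deltaN * n1;
        mean += deltaN;
        // Order matters: m4 uses the old m2 and m3, m3 uses the old m2.
        m4 += term1 * deltaN2 * (n * n - 3 * n + 3)
                + 6.0 * deltaN2 * m2 - 4.0 * deltaN * m3;
        m3 += term1 * deltaN * (n - 2) - 3.0 * deltaN * m2;
        m2 += term1;
    }

    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    // Sample variance (divides by n - 1).
    double getVariance() {
        return n < 2 ? Double.NaN : m2 / (n - 1);
    }

    // Population (biased) skewness: sqrt(n) * m3 / m2^(3/2).
    double getSkewness() {
        if (n < 2 || m2 == 0.0) return Double.NaN;
        return Math.sqrt(n) * m3 / Math.pow(m2, 1.5);
    }

    // Population (biased) excess kurtosis: n * m4 / m2^2 - 3.
    double getKurtosis() {
        if (n < 2 || m2 == 0.0) return Double.NaN;
        return n * m4 / (m2 * m2) - 3.0;
    }
}
```

With this shape the getters are cheap reads of already-updated state,
and every addValue call pays for all four moments, which is exactly the
trade-off raised in point (2).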

> 
>>(2) However, if calling addValue many times (more likely the case) with 
>>only the interest of getting the "getMean" back, it's wasted 
>>computational time to calculate all the other stats (like kurtosis) in 
>>addValue when you just want the results of "getMean" back after each 
>>"addValue".
> 
> 
> Yes.  The stored versions should use array-based computations, computing
> statistics on demand in the getters.
> 

+1 and I've made these changes.
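
As a rough sketch of the stored-values strategy (not the committed
code), addValue below only appends to an array and each getter computes
its statistic on demand from the stored values. The class name
StoredValuesSketch is hypothetical, and the use of the corrected
two-pass formula for the variance is an assumption made for the example.

```java
// Hypothetical sketch; NOT the actual stored-values implementation.
// addValue only stores; statistics are computed on demand in the getters.
final class StoredValuesSketch {
    private double[] values = new double[16];
    private int n = 0;

    void addValue(double x) {
        if (n == values.length) {
            // Grow the backing array when it is full.
            values = java.util.Arrays.copyOf(values, 2 * n);
        }
        values[n++] = x;
    }

    double getMean() {
        if (n == 0) return Double.NaN;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += values[i];
        }
        return sum / n;
    }

    // Sample variance via the corrected two-pass formula:
    // (sum(d^2) - (sum(d))^2 / n) / (n - 1), with d = x - mean.
    double getVariance() {
        if (n < 2) return Double.NaN;
        double mean = getMean();
        double sumSq = 0.0;
        double sumDev = 0.0;
        for (int i = 0; i < n; i++) {
            double d = values[i] - mean;
            sumSq += d * d;
            sumDev += d;
        }
        return (sumSq - sumDev * sumDev / n) / (n - 1);
    }
}
```

Here a caller who only ever asks for getMean never pays for the higher
moments, at the cost of a pass over the array on each getter call.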

>>
>>p.s. In a more complex approach the user might be able to tune the 
>>calculations given their specific need. But this would require the 
>>creation of a delegation framework and boolean switching to control the 
>>behavior of the implementation, a lot of added complexity that would 
>>need to be maintained; it could create more work than it's worth.
> 
> 
> -1

