commons-dev mailing list archives

From: Kim van der Linde
Subject: Re: [math] API changes for RC2
Date: Mon, 27 Sep 2004 16:19:40 GMT

Al Chou wrote:

> So someone please lay out a real(istic) use case.  Would you ever make two
> successive (or closely separated, anyway) method calls to get both the sample
> and population result for the same dataset?  Or do you usually just use one and
> not the other? 

Take a simple example in which a simulation is running. For the
simulation itself one would prefer an error-reduced estimate of the
variance, so that small errors do not multiply over successive
generations. To test at the same time whether this generation really
differs from the previous one, or from the base generation, you would
use bias-reduced estimates in the statistical test. Sure, often you use
only one variant. The problem is that the number of (co-)variance
classes is considerable, and all of them come in the two(+?) versions,
including regressions, covariance matrices, PCA...

> In pseudo-code, do you ever need to do this:
> StandardDeviation sd = new StandardDeviation( ... ) ;
> sd.getResult() ;
> sd.getPopulationResult() ;
> or is it sufficient functionality if you have to say something like:
> StandardDeviation sd = new StandardDeviation( ... ) ;
> sd.Result() ;
> sd.populationResult = true ;
> sd.Result() ;

I would argue against a boolean for this and instead use an int, or
perhaps preferably a double (which would also let us handle weighted
calculations in the future). The underlying second moment does not need
to change; the only place where it matters is the division, literally
just before the return. Because of that, I suggested the following
simple code before (example getResult()):

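     public double getResult(final int varianceType) {
         // Divisor is n - varianceType; no estimate if it is not positive.
         if ((moment.n - varianceType) <= 0) {
             return Double.NaN;
         } else if (moment.n == 1) {
             // A single value has no spread.
             return 0d;
         } else {
             return moment.m2 / ((double) moment.n - varianceType);
         }
     }

     public double getResult() {
         // Default to the sample variance.
         return getResult(VarianceTypes.SAMPLE);
     }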
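
To make the snippet above runnable on its own, here is a minimal sketch.
The class names VarianceTypes, SecondMomentSketch and VarianceSketch,
the constant values (SAMPLE = 1, POPULATION = 0) and the running-update
accumulator are assumptions made for illustration; they are not the
actual [math] classes.

```java
// Hypothetical constants: the values 1 and 0 are assumed, not taken from the thread.
final class VarianceTypes {
    static final int SAMPLE = 1;      // divide by n - 1 (bias-reduced)
    static final int POPULATION = 0;  // divide by n (error-reduced)

    private VarianceTypes() { }
}

// Stand-in for the second-moment accumulator that the snippet calls "moment".
final class SecondMomentSketch {
    long n;       // number of values seen so far
    double mean;  // running mean
    double m2;    // running sum of squared deviations from the mean

    void increment(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }
}

final class VarianceSketch {
    private final SecondMomentSketch moment = new SecondMomentSketch();

    void increment(double x) {
        moment.increment(x);
    }

    // The variance type only changes the divisor; the accumulated moment is untouched.
    double getResult(int varianceType) {
        if (moment.n - varianceType <= 0) {
            return Double.NaN;
        }
        // For n == 1 with POPULATION, m2 is 0, so this already returns 0.
        return moment.m2 / ((double) moment.n - varianceType);
    }

    // Default: sample variance.
    double getResult() {
        return getResult(VarianceTypes.SAMPLE);
    }
}
```

With this shape, one object fed the values once can hand out either
estimate, since the choice is deferred to the final division.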

So anyone NOT interested in anything beyond sample variances can just
use the defaults, while the rest get the flexibility to pull the
different estimates out of the same object. The downside I can see is
that it increases the number of methods within a class. If that is a
concern, I would set a double variable biasReduction to 1 as the
default, and allow it to be set to any value the user chooses. That
would be similar to Phil's 4).
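
A rough sketch of that alternative, under stated assumptions: the class
name AdjustableVariance, the setter name, and the two-pass computation
over an array are all hypothetical, chosen only to show a settable
double divisor adjustment that defaults to 1.

```java
// Hypothetical class: illustrates a settable double "biasReduction" defaulting to 1.
final class AdjustableVariance {
    private final long n;                // number of values
    private final double m2;             // sum of squared deviations from the mean
    private double biasReduction = 1.0;  // default gives the sample variance

    AdjustableVariance(double[] values) {
        n = values.length;
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = n > 0 ? sum / n : 0.0;
        double acc = 0.0;
        for (double v : values) {
            acc += (v - mean) * (v - mean);
        }
        m2 = acc;
    }

    // Any value is accepted; 0 gives the population variance.
    void setBiasReduction(double biasReduction) {
        this.biasReduction = biasReduction;
    }

    // Only the divisor depends on biasReduction.
    double getResult() {
        double divisor = n - biasReduction;
        return divisor > 0 ? m2 / divisor : Double.NaN;
    }
}
```

This keeps a single getResult() method, at the cost of making the
result depend on mutable state in the object.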


