commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mark R. Diggory" <mdigg...@latte.harvard.edu>
Subject Re: Math.pow usage was: Re: cvs commit: ...
Date Wed, 18 Jun 2003 21:43:58 GMT
Thanks for entertaining my sometimes naive questioning,

J.Pietschmann wrote:

> Mark R. Diggory wrote:
>
>> (1) Does it seem logical that when working with "n" (or 
>> values.length) to use Math.pow(n, x), as positive integers, the risk 
>> is actually 'integer overflow' when the array representing the number 
>> of cases gets very large, for which the log implementation of 
>> Math.pow would help retain greater numerical accuracy?
>
>
> No. If you cast the base into a double there is not much risk of
> overflow: double x = n;  y=x*x; or y=((double)n)*((double)n);
> or even y=n*(double)n; (but avoid y=(double)n*n).
> Double mantissa has IIRC 52 bits, this should be good for integers
> up to 2^26=67108864 without loss of precision. 

Wow, thats a great clarification, I understand your defense now. It 
would be best to cast n to double asap and always have * operating on 
doubles, so doing things like consolidating the casting of n like this

((double) (n*(n - 1)))

is a poor choice, if I understand correctly, its wiser to do at least

((double) n) * (n - 1)

but unnecessary to go the extreme of:

((double)n) * (((double)n) - 1.0)

> OTOH the log is calculated
> for x=m*2^e with 0.5<=m<1 as e*log(2)+log(m), where log(m) can be
> calculated to full precision by a process called pseudodivision,
> see http://www.math.uni-bremen.de/~thielema/Research/arithmetics.pdf,
> however, for 2^26 you'll loose log(26*log(2))/log(2)~4 bits due to
> the first summand (the same effect happens for very small bases,
> like 1E-16). The exponentiation amplifies the error, although I'm not
> sure by how much.
> Well, with some luck the processor uses its guarding bits to cover
> the precision loss mentioned above (on i386, FP registers have a 64 bit
> denormalized mantissa in contrast to the 52 bit normalized mantissa for
> double). Strict math and IEEE 854 conformant hardware IIRC will discard
> them though.
>
>> (2) In the opposite case from below, where values[i] are very large, 
>> doesn't the log based Math.pow(values[i], 2.0) again work in our 
>> favor to reduce overflow? Seems a catch22.
>
>
> If you are dealing with floating point numbers, your concern is
> loss of precision, not overflow. Apart from this, I don't understand
> in what sense the log based Math.pow(values[i], 2.0) should be
> favorable. If ther's precision loss for x*x, there will be at least
> the same presicion loss for Math.pow(values[i], 2.0), because at least
> the same number of bits will be missing from the mantissa. 

Again, its starting be obvious that I had some bad assumptions about 
floating point arith. and any benefits from "e*log(2)+log(m)" 
calculations in Math.pow. Your discussion has convinced me that its 
usage isn't that great a benefit in terms of any numerical stability in 
terms of working with (doubles).

But, now, I'm a little more confused, on a different subject I(we) took 
on this exp*log strategy for calculating the geometric mean as:

    /**
     * Returns the sum of the natural logs for this collection of values
     * @param values Is a double[] containing the values
     * @return the sumLog value or Double.NaN if the array is empty
     */
    public static double sumLog(double[] values) {
        double sumLog = Double.NaN;
        if (values.length > 0) {
            sumLog = 0.0;
            for (int i = 0; i < values.length; i++) {
                sumLog += Math.log(values[i]);
            }
        }
        return sumLog;
    }

    /**
     * Returns the geometric mean for this collection of values
     * @param values Is a double[] containing the values
     * @return the geometric mean or Double.NaN if the array is empty or
     * any of the values are &lt;= 0.
     */
    public static double geometricMean(double[] values) {
        return Math.exp(sumLog(values) / (double) values.length);
    }

I'm not sure, but isn't this applying a similar exp*log approach found 
in math.pow.  How would you interpret this solution over the what we had 
before? We approached the above because of concerns that (in the below 
example) as the product in "product *= values[i]" gets very large, that 
Math.pow(product(values),(1.0/values.length)); would then introduce a 
loss of precision as "1.0/values.length" gets smaller and smaller. But, 
doesn't the above algorithm introduce a  loss of precision when 
values[i] are very small values as well (similar to what you are 
describing in Math.pow(...))?

    /**
     * Returns the product for this collection of values
     * @param values Is a double[] containing the values
     * @return the product values or Double.NaN if the array is empty
     */
    public static double product(double[] values) {
        double product = Double.NaN;
        if( values.length > 0 ) {
            product = 1.0;
            for( int i = 0; i < values.length; i++) {
                product *= values[i];
            }
        }
        return product;
    }
   
    /**
     * Returns the geometric mean for this collection of values
     * @param values Is a double[] containing the values
     * @return the geometric mean or Double.NaN if the array is empty or
     * any of the values are &lt;= 0.
     */
    public static double geometricMean(double[] values) {
        return Math.pow(product(values),(1.0/values.length));
    }


-Mark



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Mime
View raw message