commons-user mailing list archives

From Something Something <mailinglist...@gmail.com>
Subject Re: Why not BigDecimal?
Date Sat, 13 Feb 2010 00:15:52 GMT
Okay... Let's not worry about R, BigDecimal & precision for the time being.  I
might have been looking at the wrong values.  So let's hold that thought.

Let's take a simple example for getting Y-Hat values using Multiple
Regression given in this PDF:
http://www.utdallas.edu/~herve/abdi-prc-pretty.pdf

I created a small CSV file called students.csv that contains the following data:

s1 14 4 1
s2 23 4 2
s3 30 7 2
s4 50 7 4
s5 39 10 3
s6 67 10 6

Column headers:  student id, memory span (Y), age (X1), speech rate (X2)

Now the expected results are:

yHat[0]:15.166666666666668
yHat[1]:24.666666666666668
yHat[2]:27.666666666666664
yHat[3]:46.666666666666664
yHat[4]:40.166666666666664
yHat[5]:68.66666666666667

This is based on the following equation (given in the PDF):
Y = 1.67 + X1 + 9.50 X2
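
As a sanity check on the expected values, here is a minimal sketch that just
evaluates the equation above on the six rows. It assumes the intercept printed
as 1.67 is really 5/3 rounded to two decimals (that is what the expected values
imply); the class and method names are made up for illustration.

```java
// Sketch: evaluate the fitted equation from the PDF on the rows of students.csv.
// Assumption: the intercept shown as 1.67 is 5/3 rounded to two decimals.
class YHatFromEquation {
    // Columns: age (X1), speech rate (X2), in the same row order as students.csv
    static final double[][] X = {
        {4, 1}, {4, 2}, {7, 2}, {7, 4}, {10, 3}, {10, 6}
    };

    // Returns intercept + b1 * X1 + b2 * X2 for every row.
    static double[] predict(double intercept, double b1, double b2) {
        double[] yHat = new double[X.length];
        for (int i = 0; i < X.length; i++) {
            yHat[i] = intercept + b1 * X[i][0] + b2 * X[i][1];
        }
        return yHat;
    }

    public static void main(String[] args) {
        double[] yHat = predict(5.0 / 3.0, 1.0, 9.5);
        for (int i = 0; i < yHat.length; i++) {
            System.out.println("yHat[" + i + "]:" + yHat[i]);
        }
    }
}
```

The values should agree with the expected results listed above up to
floating-point rounding in the last digit or so.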

I wrote the following quick-and-dirty code to try out
OLSMultipleLinearRegression.  The 'calculateHat()' method returns a
RealMatrix, but I can't see the above results in it.  Am I using this
class correctly?  Please let me know.  Thanks.



// Needs commons-math on the classpath; the package names below assume the 2.x release.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

import org.apache.commons.math.linear.RealMatrix;
import org.apache.commons.math.stat.regression.OLSMultipleLinearRegression;

private static void regression1() {
    double[][] X = new double[6][2];  // predictors: age (X1), speech rate (X2)
    double[] Y = new double[6];       // response: memory span
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader in = new BufferedReader(new FileReader("C:\\students.csv"))) {
        String line;
        int count = 0;
        while ((line = in.readLine()) != null) {
            // Columns are space-separated: id, Y, X1, X2
            String[] cols = line.trim().split(" ");
            Y[count] = Double.parseDouble(cols[1]);
            X[count][0] = Double.parseDouble(cols[2]);
            X[count][1] = Double.parseDouble(cols[3]);
            count++;
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    OLSMultipleLinearRegression regression = new OLSMultipleLinearRegression();
    regression.newSampleData(Y, X);
    RealMatrix matrix = regression.calculateHat();
    System.out.println("matrix:" + matrix.getColumnDimension());
}
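
For comparison, here is a plain-Java sketch (no commons-math) of what a hat
matrix conventionally is: H = X (X'X)^-1 X', a 6x6 projection for this data
set, with the fitted values given by H times Y rather than stored in H itself.
Whether calculateHat() returns exactly this matrix is an assumption on my part,
as is the leading column of ones for the intercept; the class and helper names
are hypothetical.

```java
// Sketch: textbook hat matrix H = X (X'X)^-1 X' and fitted values yHat = H * Y.
// Assumption: the design matrix carries a leading column of ones for the intercept.
class HatMatrixSketch {
    static final double[] Y = {14, 23, 30, 50, 39, 67};
    static final double[][] X = {   // columns: 1 (intercept), age, speech rate
        {1, 4, 1}, {1, 4, 2}, {1, 7, 2}, {1, 7, 4}, {1, 10, 3}, {1, 10, 6}
    };

    static double[][] multiply(double[][] a, double[][] b) {
        double[][] c = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b[0].length; j++)
                for (int t = 0; t < b.length; t++)
                    c[i][j] += a[i][t] * b[t][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    // Gauss-Jordan inverse with partial pivoting; adequate for a tiny, well-conditioned matrix.
    static double[][] invert(double[][] a) {
        int n = a.length;
        double[][] aug = new double[n][2 * n];
        for (int i = 0; i < n; i++) {
            System.arraycopy(a[i], 0, aug[i], 0, n);
            aug[i][n + i] = 1.0;
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++)
                if (Math.abs(aug[r][col]) > Math.abs(aug[pivot][col])) pivot = r;
            double[] tmp = aug[col]; aug[col] = aug[pivot]; aug[pivot] = tmp;
            double p = aug[col][col];
            for (int j = 0; j < 2 * n; j++) aug[col][j] /= p;
            for (int r = 0; r < n; r++) {
                if (r == col) continue;
                double f = aug[r][col];
                for (int j = 0; j < 2 * n; j++) aug[r][j] -= f * aug[col][j];
            }
        }
        double[][] inv = new double[n][n];
        for (int i = 0; i < n; i++) System.arraycopy(aug[i], n, inv[i], 0, n);
        return inv;
    }

    // 6x6 hat matrix for the students data.
    static double[][] hat() {
        double[][] xt = transpose(X);
        return multiply(multiply(X, invert(multiply(xt, X))), xt);
    }

    // Fitted values: each yHat[i] is row i of H dotted with Y.
    static double[] fitted() {
        double[][] h = hat();
        double[] yHat = new double[Y.length];
        for (int i = 0; i < Y.length; i++)
            for (int j = 0; j < Y.length; j++)
                yHat[i] += h[i][j] * Y[j];
        return yHat;
    }

    public static void main(String[] args) {
        double[] yHat = fitted();
        for (int i = 0; i < yHat.length; i++) {
            System.out.println("yHat[" + i + "]:" + yHat[i]);
        }
    }
}
```

If that reading is right, the RealMatrix from calculateHat() would need to be
multiplied by the Y vector before the numbers can be compared with the expected
results; the printed column dimension alone (6) says nothing about the fit.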


On Fri, Feb 12, 2010 at 12:08 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> It is not a precision issue.  R and commons-math use different algorithms
> with the same underlying numerical implementation.
>
> It is even an open question which result is better.  R has lots of
> credibility, but I have found cases where it lacked precision (and I coded
> up a patch that was accepted).
>
> Unbounded precision integers and rationals are very useful, but not usually
> for large scale numerical programming.  Except in a very few cases, if you
> need more than 17 digits of precision, you have other very serious problems
> that precision won't help.
>
> On Fri, Feb 12, 2010 at 1:40 AM, Andy Turner <A.G.D.Turner@leeds.ac.uk> wrote:
>
> > Interesting that this is a precision issue. I'm not surprised depending on
> > what you are doing, double precision may not be enough. It depends a lot on
> > how the calculations are broken into smaller parts. BigDecimal is
> > fantastically useful...
> >
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
