commons-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 26772] New: - [patch] addition of load(double[]) initialization to the EmpiricalDistribution
Date Sun, 08 Feb 2004 16:21:01 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26772>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26772

           Summary: [patch] addition of load(double[]) initialization to the
                    EmpiricalDistribution
           Product: Commons
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: Math
        AssignedTo: commons-dev@jakarta.apache.org
        ReportedBy: pi@uw.edu.pl


I propose adding a load(double[]) initialization method to
EmpiricalDistribution. The submission contains three patches:
for EmpiricalDistribution, EmpiricalDistributionImpl and
EmpiricalDistributionTest.

The implementation: internally, all sources of data were represented by
java.io.BufferedReader. I added a private abstract inner class
(DataAdapter), which now abstracts the source of data. There is also a
simple factory class which instantiates the proper DataAdapter (one for
streams and one for arrays of doubles).
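
The adapter-plus-factory arrangement described above can be sketched
roughly as follows. This is a simplified, standalone illustration and not
the patch itself: the Stats class is a hypothetical stand-in for
SummaryStatistics, adapterFor is a hypothetical name for the factory
method, and the sketch wraps IOException in UncheckedIOException for
brevity, whereas the patch declares checked exceptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class DataAdapterSketch {

    /** Hypothetical stand-in for SummaryStatistics: count, sum, min, max. */
    static final class Stats {
        long n;
        double sum;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        void addValue(double v) {
            n++;
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }

        double mean() {
            return sum / n;
        }
    }

    /** Hides where the numbers come from, as DataAdapter does in the patch. */
    abstract static class DataAdapter {
        abstract Stats computeStats();
    }

    /** Reads one number per line from a reader. */
    static final class StreamDataAdapter extends DataAdapter {
        private final BufferedReader in;

        StreamDataAdapter(BufferedReader in) {
            this.in = in;
        }

        @Override
        Stats computeStats() {
            Stats stats = new Stats();
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    stats.addValue(Double.parseDouble(line));
                }
            } catch (IOException e) {
                // Simplification for this sketch; the patch uses checked exceptions.
                throw new UncheckedIOException(e);
            }
            return stats;
        }
    }

    /** Iterates over an in-memory array; no parsing or I/O involved. */
    static final class ArrayDataAdapter extends DataAdapter {
        private final double[] data;

        ArrayDataAdapter(double[] data) {
            this.data = data;
        }

        @Override
        Stats computeStats() {
            Stats stats = new Stats();
            for (double v : data) {
                stats.addValue(v);
            }
            return stats;
        }
    }

    /** Picks an adapter by runtime type, in the spirit of DataAdapterFactory. */
    static DataAdapter adapterFor(Object source) {
        if (source instanceof BufferedReader) {
            return new StreamDataAdapter((BufferedReader) source);
        } else if (source instanceof double[]) {
            return new ArrayDataAdapter((double[]) source);
        }
        throw new IllegalArgumentException("Unsupported data source");
    }

    public static void main(String[] args) {
        Stats fromArray = adapterFor(new double[] {1.0, 2.0, 3.0}).computeStats();
        Stats fromStream = adapterFor(
            new BufferedReader(new StringReader("1.0\n2.0\n3.0\n"))).computeStats();
        System.out.println(fromArray.mean() + " " + fromStream.mean());
    }
}
```

Both sources yield the same statistics, so the calling code does not need
to know which kind of input it was given.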

I wondered whether it would be better to transform the
double[] into a BufferedReader, so that the rest of the code could be
left untouched, but it seems that this is not that simple (though possible)
and has some disadvantages, so I gave up on it.
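
For comparison, the rejected alternative might look something like the
sketch below. The class and method names (ArrayAsReaderSketch, toReader,
sumViaReader) are hypothetical, and the costs it illustrates are an
assumption about what the disadvantages could be, since they are not
spelled out above: every value is formatted to text and parsed back, and
the whole array is held a second time as a string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ArrayAsReaderSketch {

    /** Formats each value on its own line so line-based code can consume it. */
    static BufferedReader toReader(double[] data) {
        StringBuilder sb = new StringBuilder();
        for (double v : data) {
            sb.append(Double.toString(v)).append('\n');
        }
        return new BufferedReader(new StringReader(sb.toString()));
    }

    /** Parses the lines back, the way existing stream-based code would. */
    static double sumViaReader(double[] data) {
        double sum = 0.0;
        try (BufferedReader in = toReader(data)) {
            String line;
            while ((line = in.readLine()) != null) {
                sum += Double.parseDouble(line);
            }
        } catch (IOException e) {
            // Simplification for this sketch.
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumViaReader(new double[] {1.5, 2.5}));
    }
}
```

Because load() makes two passes over the data (computeStats, then
fillBinStats), such a reader would also have to be rebuilt or reset
between passes, much as the file and URL variants reopen their streams.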

The unit test simply repeats the already implemented tests using the same
data, this time represented as a double[].

I don't claim, however, that the proposed implementation is perfect, so
I would be very glad to receive comments.

Piotr Kochanski

Index: EmpiricalDistribution.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons/math/src/java/org/apache/commons/math/random/EmpiricalDistribution.java,v
retrieving revision 1.13
diff -u -r1.13 EmpiricalDistribution.java
--- EmpiricalDistribution.java	25 Jan 2004 21:30:41 -0000	1.13
+++ EmpiricalDistribution.java	8 Feb 2004 16:07:21 -0000
@@ -84,7 +84,14 @@
  * @version $Revision: 1.13 $ $Date: 2004/01/25 21:30:41 $
  */
 public interface EmpiricalDistribution {
-    
+ 
+    /**
+     * Computes the empirical distribution from the provided
+     * array of numbers.
+     * @param dataArray the data array
+     */
+    void load(double[] dataArray); 
+        
     /**
      * Computes the empirical distribution from the input file.
      * @param filePath fully qualified name of a file in the local file system



Index: EmpiricalDistributionImpl.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons/math/src/java/org/apache/commons/math/random/EmpiricalDistributionImpl.java,v
retrieving revision 1.15
diff -u -r1.15 EmpiricalDistributionImpl.java
--- EmpiricalDistributionImpl.java	29 Jan 2004 06:26:14 -0000	1.15
+++ EmpiricalDistributionImpl.java	8 Feb 2004 16:07:57 -0000
@@ -64,7 +64,6 @@
 import java.io.InputStreamReader;
 import java.net.URL;
 
-import org.apache.commons.math.stat.DescriptiveStatistics;
 import org.apache.commons.math.stat.SummaryStatistics;
 
 /**
@@ -130,12 +129,32 @@
         this.binCount = binCount;
         binStats = new ArrayList();
     }
+
+    /**
+     * @see org.apache.commons.math.random.EmpiricalDistribution#load(double[])
+     */
+    public void load(double[] in){
+        DataAdapter da = new ArrayDataAdapter(in);
+        try {
+            da.computeStats();
+            fillBinStats(in);
+        } catch (Exception e) {
+            throw new RuntimeException(e.getMessage());
+        }
+        loaded = true;
+        
+    }
     
     public void load(String filePath) throws IOException {
         BufferedReader in = 
             new BufferedReader(new InputStreamReader(new FileInputStream(filePath)));
         try {
-            computeStats(in);
+            DataAdapter da = new StreamDataAdapter(in);
+            try {
+                da.computeStats();
+            } catch (Exception e) {
+                throw new IOException(e.getMessage());
+            }
             in = new BufferedReader(new InputStreamReader(new FileInputStream(filePath)));
             fillBinStats(in);
             loaded = true;
@@ -148,7 +167,12 @@
         BufferedReader in = 
             new BufferedReader(new InputStreamReader(url.openStream()));
         try {
-            computeStats(in);
+            DataAdapter da = new StreamDataAdapter(in);
+            try {
+                da.computeStats();
+            } catch (Exception e) {
+                throw new IOException(e.getMessage());
+            }
             in = new BufferedReader(new InputStreamReader(url.openStream()));
             fillBinStats(in);
             loaded = true;
@@ -160,35 +184,129 @@
     public void load(File file) throws IOException {
         BufferedReader in = new BufferedReader(new FileReader(file));
         try {
-            computeStats(in);
+            DataAdapter da = new StreamDataAdapter(in);
+            try {
+                da.computeStats();
+            } catch (Exception e) {
+                throw new IOException(e.getMessage());
+            }
             in = new BufferedReader(new FileReader(file));
             fillBinStats(in);
             loaded = true;
         } finally {
-           if (in != null) try {in.close();} catch (Exception ex) {};
+            if (in != null)
+                try {
+                    in.close();
+                } catch (Exception ex) {
+                };
         }
     }
     
     /**
-     * Computes sampleStats (first pass through data file).
+     * Provides methods for computing <code>sampleStats</code> and 
+     * <code>binStats</code> abstracting the source of data.
+     */
+    private abstract class DataAdapter{
+        public abstract void computeBinStats(double min, double delta) throws Exception;
+        public abstract void computeStats() throws Exception;
+    }
+    /**
+     * Factory of <code>DataAdapter</code> objects. For every supported source
+     * of data (array of doubles, file, etc.) an instance of the proper object
+     * is returned. 
      */
-    private void computeStats(BufferedReader in) throws IOException {
-        String str = null;
-        double val = 0.0;
-        sampleStats = SummaryStatistics.newInstance();
-        while ((str = in.readLine()) != null) {
-            val = new Double(str).doubleValue();
-            sampleStats.addValue(val);
+    private class DataAdapterFactory{
+        public DataAdapter getAdapter(Object in) {
+            if (in instanceof BufferedReader) {
+                BufferedReader inputStream = (BufferedReader) in;
+                return new StreamDataAdapter(inputStream);
+            } else if (in instanceof double[]) {
+                double[] inputArray = (double[]) in;
+                return new ArrayDataAdapter(inputArray);
+            } else {
+                throw new IllegalArgumentException(
+                    "Input data comes from the" + " unsupported source");
+            }
         }
-        in.close();
-        in = null;
     }
-    
+    /**
+     * <code>DataAdapter</code> for data provided through some input stream
+     */
+    private class StreamDataAdapter extends DataAdapter{
+        BufferedReader inputStream;
+        public StreamDataAdapter(BufferedReader in){
+            super();
+            inputStream = in;
+        }
+        /**
+         * Computes binStats
+         */
+        public void computeBinStats(double min, double delta) throws IOException {
+            String str = null;
+            double val = 0.0d;
+            while ((str = inputStream.readLine()) != null) {
+                val = Double.parseDouble(str);
+                SummaryStatistics stats =
+                    (SummaryStatistics) binStats.get(
+                        Math.max((int) Math.ceil((val - min) / delta) - 1, 0));
+                stats.addValue(val);
+            }
+
+            inputStream.close();
+            inputStream = null;
+        }
+        /**
+         * Computes sampleStats
+         */
+        public void computeStats() throws IOException {
+            String str = null;
+            double val = 0.0;
+            sampleStats = SummaryStatistics.newInstance();
+            while ((str = inputStream.readLine()) != null) {
+                val = new Double(str).doubleValue();
+                sampleStats.addValue(val);
+            }
+            inputStream.close();
+            inputStream = null;
+        }
+    }
+
+    /**
+     * <code>DataAdapter</code> for data provided as array of doubles.
+     */
+    private class ArrayDataAdapter extends DataAdapter{
+        private double[] inputArray;
+        public ArrayDataAdapter(double[] in){
+            super();
+            inputArray = in;
+        }
+        /**
+         * Computes sampleStats
+         */
+        public void computeStats() throws IOException {
+            sampleStats = SummaryStatistics.newInstance();
+            for (int i = 0; i < inputArray.length; i++) {
+                sampleStats.addValue(inputArray[i]);
+            }
+        }
+        /**
+         * Computes binStats
+         */
+        public void computeBinStats(double min, double delta)
+            throws IOException {
+            for (int i = 0; i < inputArray.length; i++) {
+                SummaryStatistics stats =
+                    (SummaryStatistics) binStats.get(
+                        Math.max((int) Math.ceil((inputArray[i] - min) / delta) - 1, 0));
+                stats.addValue(inputArray[i]);
+            }
+        }    
+    }
+
     /**
      * Fills binStats array (second pass through data file).
      */
-    private void fillBinStats(BufferedReader in) throws IOException {
-        
+    private void fillBinStats(Object in) throws IOException {    
         // Load array of bin upper bounds -- evenly spaced from min - max
         double min = sampleStats.getMin();
         double max = sampleStats.getMax();
@@ -209,19 +327,19 @@
             binStats.add(i,stats);
         }
         
-        // Pass the data again, filling data in binStats Array
-        String str = null;
-        double val = 0.0d;
-        while ((str = in.readLine()) != null) {
-           val = Double.parseDouble(str);
-           SummaryStatistics stats = 
-            (SummaryStatistics) binStats.get(Math.max((int)Math.ceil((val - min) / delta) - 1, 0));
-           stats.addValue(val);        
+        // Filling data in binStats Array
+        DataAdapterFactory aFactory = new DataAdapterFactory();
+        DataAdapter da = aFactory.getAdapter(in);
+        try {
+            da.computeBinStats(min, delta);
+        } catch (Exception e) {
+            if(e instanceof RuntimeException){
+                throw new RuntimeException(e.getMessage());
+            }else{
+                throw new IOException(e.getMessage());
+            }
         }
         
-        in.close();
-        in = null;
-        
         // Assign upperBounds based on bin counts
         upperBounds = new double[binCount];
         upperBounds[0] =
@@ -303,5 +421,7 @@
     public boolean isLoaded() {
         return loaded;
     }
+
+
     
 }



Index: EmpiricalDistributionTest.java
===================================================================
RCS file: /home/cvspublic/jakarta-commons/math/src/test/org/apache/commons/math/random/EmpiricalDistributionTest.java,v
retrieving revision 1.12
diff -u -r1.12 EmpiricalDistributionTest.java
--- EmpiricalDistributionTest.java	29 Jan 2004 05:27:54 -0000	1.12
+++ EmpiricalDistributionTest.java	8 Feb 2004 15:58:05 -0000
@@ -56,9 +56,14 @@
 import junit.framework.Test;
 import junit.framework.TestCase;
 import junit.framework.TestSuite;
+
+import java.io.BufferedReader;
 import java.io.File;
+import java.io.IOException;
+import java.io.InputStreamReader;
 import java.net.URL;
-import java.net.URLDecoder;
+import java.util.ArrayList;
+import java.util.Iterator;
 
 import org.apache.commons.math.stat.SummaryStatistics;
 
@@ -71,16 +76,37 @@
 public final class EmpiricalDistributionTest extends TestCase {
 
     protected EmpiricalDistribution empiricalDistribution = null;
+    protected EmpiricalDistribution empiricalDistribution2 = null;
     protected File file = null;
     protected URL url = null; 
+    protected double[] dataArray = null;
     
     public EmpiricalDistributionTest(String name) {
         super(name);
     }
 
-    public void setUp() {
+    public void setUp() throws IOException {
         empiricalDistribution = new EmpiricalDistributionImpl(100);
         url = getClass().getResource("testData.txt");
+        
+        empiricalDistribution2 = new EmpiricalDistributionImpl(100);
+        BufferedReader in = 
+                new BufferedReader(new InputStreamReader(
+                        url.openStream()));
+        String str = null;
+        ArrayList list = new ArrayList();
+        while ((str = in.readLine()) != null) {
+            list.add(Double.valueOf(str));
+        }
+        in.close();
+        in = null;
+        
+        dataArray = new double[list.size()];
+        int i = 0;
+        for (Iterator iter = list.iterator(); iter.hasNext();) {
+            dataArray[i] = ((Double)iter.next()).doubleValue();
+            i++;
+        }                 
     }
 
     public static Test suite() {
@@ -107,7 +133,27 @@
           (empiricalDistribution.getSampleStats().getStandardDeviation(),
                 1.0173699343977738,10E-7);
     }
-    
+
+    /**
+     * Test EmpiricalDistribution.load(double[]) using data taken from
+     * sample data file.<br> 
+     * Check that the sampleCount, mu and sigma match data in 
+     * the sample data file.
+     */
+    public void testDoubleLoad() throws Exception {
+        empiricalDistribution2.load(dataArray);   
+        // testData File has 10000 values, with mean ~ 5.0, std dev ~ 1
+        // Make sure that loaded distribution matches this
+        assertEquals(empiricalDistribution2.getSampleStats().getN(),1000,10E-7);
+        //TODO: replace with statistical tests
+        assertEquals
+            (empiricalDistribution2.getSampleStats().getMean(),
+                5.069831575018909,10E-7);
+        assertEquals
+          (empiricalDistribution2.getSampleStats().getStandardDeviation(),
+                1.0173699343977738,10E-7);
+    }
+   
     /** 
       * Generate 1000 random values and make sure they look OK.<br>
       * Note that there is a non-zero (but very small) probability that
@@ -115,6 +161,7 @@
       */
     public void testNext() throws Exception {
         tstGen(0.1);
+        tstDoubleGen(0.1);
     }
     
     /**
@@ -124,6 +171,7 @@
     public void testNexFail() {
         try {
             empiricalDistribution.getNextValue();
+            empiricalDistribution2.getNextValue();
             fail("Expecting IllegalStateException");
         } catch (IllegalStateException ex) {;}
     }
@@ -134,6 +182,8 @@
     public void testGridTooFine() throws Exception {
         empiricalDistribution = new EmpiricalDistributionImpl(1001);
         tstGen(0.1);    
+        empiricalDistribution2 = new EmpiricalDistributionImpl(1001);           
+        tstDoubleGen(0.1);
     }
     
     /**
@@ -143,6 +193,8 @@
         empiricalDistribution = new EmpiricalDistributionImpl(1);
         tstGen(5); // ridiculous tolerance; but ridiculous grid size
                    // really just checking to make sure we do not bomb
+        empiricalDistribution2 = new EmpiricalDistributionImpl(1);           
+        tstDoubleGen(5);           
     }
     
     private void tstGen(double tolerance)throws Exception {
@@ -150,6 +202,17 @@
         SummaryStatistics stats = SummaryStatistics.newInstance();
         for (int i = 1; i < 1000; i++) {
             stats.addValue(empiricalDistribution.getNextValue());
+        }
+        assertEquals("mean", stats.getMean(),5.069831575018909,tolerance);
+        assertEquals
+         ("std dev", stats.getStandardDeviation(),1.0173699343977738,tolerance);
+    }
+
+    private void tstDoubleGen(double tolerance)throws Exception {
+        empiricalDistribution2.load(dataArray);   
+        SummaryStatistics stats = SummaryStatistics.newInstance();
+        for (int i = 1; i < 1000; i++) {
+            stats.addValue(empiricalDistribution2.getNextValue());
         }
         assertEquals("mean", stats.getMean(),5.069831575018909,tolerance);
         assertEquals

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org

