commons-dev mailing list archives

Subject cvs commit: jakarta-commons/math/xdocs/userguide stat.xml
Date Wed, 03 Mar 2004 02:32:25 GMT
psteitz     2004/03/02 18:32:25

  Modified:    math/xdocs/userguide stat.xml
  Filled in missing content in univariate statistics section.
  Revision  Changes    Path
  1.10      +92 -11    jakarta-commons/math/xdocs/userguide/stat.xml
  Index: stat.xml
  RCS file: /home/cvs/jakarta-commons/math/xdocs/userguide/stat.xml,v
  retrieving revision 1.9
  retrieving revision 1.10
  diff -u -r1.9 -r1.10
  --- stat.xml	29 Feb 2004 21:25:08 -0000	1.9
  +++ stat.xml	3 Mar 2004 02:32:25 -0000	1.10
  @@ -57,7 +57,7 @@
             all statistics, consists of <code>evaluate()</code> methods that
take double[] arrays as arguments and return 
             the value of the statistic.   This interface is extended by 
             <a href="../apidocs/org/apache/commons/math/stat/univariate/StorelessUnivariateStatistic.html">
  -          org.apache.commons.math.stat.univariate.StorelessUnivariateStatistic,</a>
which adds <code>increment(),</code>
  +          StorelessUnivariateStatistic,</a> which adds <code>increment(),</code>
             <code>getResult()</code> and associated methods to support "storageless"
implementations that
             maintain counters, sums or other state information as values are added using
the <code>increment()</code>
  @@ -65,29 +65,110 @@
             Abstract implementations of the top level interfaces are provided in 
             <a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractUnivariateStatistic.html">
  -          org.apache.commons.math.stat.univariate.AbstractUnivariateStatistic</a>
  +          AbstractUnivariateStatistic</a> and
             <a href="../apidocs/org/apache/commons/math/stat/univariate/AbstractStorelessUnivariateStatistic.html">
  -          org.apache.commons.math.stat.univariate.AbstractStorelessUnivariateStatistic</a>
  +          AbstractStorelessUnivariateStatistic</a> respectively.
             Each statistic is implemented as a separate class, in one of the subpackages
(moment, rank, summary) and
             each extends one of the abstract classes above (depending on whether or not value
storage is required to 
             compute the statistic).
             There are several ways to instantiate and use statistics.  Statistics can be
instantiated and used directly,  but it is
  -          generally more convenient to access them using the provided aggregates: 
  +          generally more convenient (and efficient) to access them using the provided aggregates,
<a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
  +            DescriptiveStatistics</a> and <a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
  +            SummaryStatistics.</a>  <code>DescriptiveStatistics</code>
maintains the input data in memory and has the capability
  +            of producing "rolling" statistics computed from a "window" consisting of the
most recently added values.  <code>SummaryStatistics</code>
  +            does not store the input data values in memory, so the statistics included
in this aggregate are limited to those that can be
  +            computed in one pass through the data without access to the full array of values.
  +        </p>
  +        <p>
  -            <tr><th>Aggregate</th><th>Statistics Included</th><th>Values
  +            <tr><th>Aggregate</th><th>Statistics Included</th><th>Values
stored?</th><th>"Rolling" capability?</th></tr>
               <tr><td><a href="../apidocs/org/apache/commons/math/stat/DescriptiveStatistics.html">
  -            org.apache.commons.math.stat.DescriptiveStatistics</a></td><td>All</td><td>Yes</td></tr>
  +            DescriptiveStatistics</a></td><td>min, max, mean, geometric
mean, n, sum, sum of squares, standard deviation, variance, percentiles, skewness, kurtosis,
               <tr><td><a href="../apidocs/org/apache/commons/math/stat/SummaryStatistics.html">
  -            org.apache.commons.math.stat.SummaryStatistics</a></td><td>min,
max, mean, geometric mean, n, sum, sum of squares, standard deviation, variance</td><td>No</td></tr>
  +            SummaryStatistics</a></td><td>min, max, mean, geometric mean,
n, sum, sum of squares, standard deviation, variance</td><td>No</td><td>No</td></tr>
  -          TODO: add code sample
  +        </p>
  +        <p>
             There is also a utility class, <a href="../apidocs/org/apache/commons/math/stat/StatUtils.html">
  -           org.apache.commons.math.stat.StatUtils,</a> that provides static methods
for computing statistics
  -           from double[] arrays. 
  +           StatUtils,</a> that provides static methods for computing statistics
  +           directly from double[] arrays. 
  +        <p>
  +          Here are some examples showing how to compute univariate statistics.
  +          <dl>
  +          <dt>Compute summary statistics for a list of double values</dt>
  +          <br></br>
  +          <dd>Using the <code>DescriptiveStatistics</code> aggregate
(values are stored in memory):
  +        <source>
  +// Get a DescriptiveStatistics instance using factory method
  +DescriptiveStatistics stats = DescriptiveStatistics.newInstance(); 
  +// Add the data from the array
  +for (int i = 0; i &lt; inputArray.length; i++) {
  +        stats.addValue(inputArray[i]);
  +}
  +// Compute some statistics 
  +double mean = stats.getMean();
  +double std = stats.getStandardDeviation();
  +double median = stats.getMedian();
  +  	  	</source>
  +  	    </dd>
  +  	    <dd>Using the <code>SummaryStatistics</code> aggregate (values
are <strong>not</strong> stored in memory):
  +       <source>
  +// Get a SummaryStatistics instance using factory method
  +SummaryStatistics stats = SummaryStatistics.newInstance(); 
  +// Read data from an input stream, adding values and updating the sums, counters, etc.
  +// needed for the statistics -- assume in is a BufferedReader
  +String line = in.readLine();
  +while (line != null) {
  +        stats.addValue(Double.parseDouble(line.trim()));
  +        line = in.readLine();
  +}
  +// Compute the statistics 
  +double mean = stats.getMean();
  +double std = stats.getStandardDeviation();
  +//double median = stats.getMedian(); &lt;-- NOT AVAILABLE in SummaryStatistics
  +  	  	</source>
  +  	    </dd>	
  +  	     <dd>Using the <code>StatUtils</code> utility class:
  +       <source>
  +// Compute statistics directly from the array -- assume values is a double[] array
  +double mean = StatUtils.mean(values);
  +double std = Math.sqrt(StatUtils.variance(values));
  +double median = StatUtils.percentile(values, 50);
  +// Compute the mean of the first three values in the array 
  +mean = StatUtils.mean(values, 0, 3); 
  +  	  	</source>
  +  	    </dd>  
  +  	    <dt>Maintain a "rolling mean" of the most recent 100 values from an input stream</dt>
  +  	    <br></br>
  +  	    <dd>Use a <code>DescriptiveStatistics</code> instance with window
size set to 100
  +  	    <source>
  +// Create a DescriptiveStatistics instance and set the window size to 100
  +DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
  +stats.setWindowSize(100);
  +// Read data from an input stream, displaying the mean of the most recent 100 observations
  +// after every 100 observations -- assume in is a BufferedReader
  +long nLines = 0;
  +String line = in.readLine();
  +while (line != null) {
  +        stats.addValue(Double.parseDouble(line.trim()));
  +        nLines++;
  +        if (nLines == 100) {
  +                nLines = 0;
  +                System.out.println(stats.getMean());  // "rolling" mean of most recent 100 values
  +        }
  +        line = in.readLine();
  +}
  +  	    </source>
  +  	    </dd>  	    
  +  	    </dl>
  +  	   </p>
         <subsection name="1.3 Frequency distributions" href="frequency">
           <p>This is yet to be written. Any contributions will be gratefully
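
For readers who want to see why the two aggregates differ in what they can report, here is a minimal, self-contained sketch of the two storage strategies the text describes. It does not use Commons Math and is not how the library implements them: the class names AggregateSketch, StorelessSummary and RollingWindow are hypothetical, the one-pass variance uses Welford's update as an assumed stand-in, and the window is a plain ArrayDeque.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical illustration only; none of these classes are part of Commons Math.
final class AggregateSketch {

    /** One-pass ("storeless") accumulator: keeps counters and sums, never the data. */
    static final class StorelessSummary {
        private long n = 0;
        private double mean = 0.0;
        private double m2 = 0.0; // running sum of squared deviations from the mean

        void addValue(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean); // Welford's update, assumed here for illustration
        }

        long getN() { return n; }
        double getMean() { return n == 0 ? Double.NaN : mean; }
        /** Sample variance (n - 1 in the denominator). */
        double getVariance() { return n < 2 ? Double.NaN : m2 / (n - 1); }
        double getStandardDeviation() { return Math.sqrt(getVariance()); }
        // No getMedian(): an exact median needs the stored values.
    }

    /** Stored-values aggregate limited to a window of the most recent values. */
    static final class RollingWindow {
        private final int windowSize;
        private final Deque<Double> values = new ArrayDeque<>();

        RollingWindow(int windowSize) {
            if (windowSize <= 0) {
                throw new IllegalArgumentException("windowSize must be positive");
            }
            this.windowSize = windowSize;
        }

        void addValue(double x) {
            if (values.size() == windowSize) {
                values.removeFirst(); // discard the oldest value
            }
            values.addLast(x);
        }

        double getMean() {
            if (values.isEmpty()) return Double.NaN;
            double sum = 0.0;
            for (double v : values) sum += v;
            return sum / values.size();
        }

        /** The median is available because the window's values are kept in memory. */
        double getMedian() {
            if (values.isEmpty()) return Double.NaN;
            double[] sorted = new double[values.size()];
            int i = 0;
            for (double v : values) sorted[i++] = v;
            Arrays.sort(sorted);
            int mid = sorted.length / 2;
            return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0, 5.0};
        StorelessSummary summary = new StorelessSummary();
        RollingWindow window = new RollingWindow(3);
        for (double v : data) {
            summary.addValue(v);
            window.addValue(v);
        }
        System.out.println("one-pass mean     = " + summary.getMean());
        System.out.println("one-pass variance = " + summary.getVariance());
        System.out.println("window mean       = " + window.getMean());
        System.out.println("window median     = " + window.getMedian());
    }
}
```

The storeless accumulator uses constant memory regardless of how many values are added, which is the trade-off the table summarizes: it cannot offer percentiles or a rolling window, while the stored-values version can offer both at the cost of holding the window in memory.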
