commons-dev mailing list archives

From "Phil Steitz"
Subject Re: [math] Tasks remaining for initial release
Date Mon, 23 Jun 2003 14:08:54 GMT
Phil Steitz wrote:
> Here are the remaining open tasks. I think we are getting close to the 
> time when we can develop a release plan. I will submit a patch to 
> Bugzilla updating tasks.xml, but I wanted to post the open tasks and my 
> own short term plan.
> * RealMatrixImpl is missing one method implementation -- getRank(). The 
> most accurate way to implement this would be to add Singular Value 
> Decomposition and use this to compute effective numerical rank. If 
> someone wants to volunteer to do this, we can leave this in; otherwise I 
> suggest dropping the rank computation from the interface.
> * Write the User's guide.  Now that the package structure is in place, I 
> suggest that we structure the guide following the subpackages.  I have 
> started on the random and stat sections. I will submit a patch with the 
> overview and all sections stubbed out tomorrow AM. I will work on 
> sections in the following order: random, stat, linear, analysis, 
> special, util, posting patches as I have drafts.  If anyone else wants 
> to start on any of these, please post patches/plans.  One sort of dodgy 
> issue related to this is the "MathML dilemma".  So far, I have been able 
> to avoid the need for real math notation by using extra words, etc; but 
> this is going to be a pain.  I would like to either just use MathML 
> (forcing users to download the free plugin if their browser does not 
> support it) or use LaTeX and the maven tex plugin (have not tried this, 
> but it is advertised) or just generate pdf from the LaTeX.  In any 
> case, I think that it is essential that we keep the source text in plain 
> ASCII in CVS.  The MathML approach is best, IMHO, because it just 
> amounts to additional markup in the xdocs.
> * Rootfinding framework.  We need to get this rectified or just decide 
> to stay with the simple solution in place now. My vote would be to 
> rectify J's framework.
> * Interpolation.  Al is working on cubic spline interpolation. Right?
> * Extend distribution framework to support discrete distributions and 
> implement binomial and hypergeometric distributions.  Volunteers?  I 
> will do this if no one else wants to; but I will wait until the end of 
> next week to start.
> * Continued work on javadoc, checkstyle, and test coverage.  We need to 
> look at test data coverage as well as path coverage.  In some cases, we 
> have good path coverage, but we have not tested all of the data boundary 
> conditions.  A good example is the problem that Al pointed out re: 
> RealMatrixImpl dimension verification.
> * Additional performance and accuracy testing. If anyone is interested 
> in helping out here, what we could really use is a wider selection of 
> test cases for the core numerical functions and validation against 
> either other packages (e.g. R for the statistical stuff), verified 
> datasets, or experiments comparing float-based and double-based implementations.
> * Additional code review. I am planning to review the current state of 
> all of the code to verify that the code matches the documentation and to 
> identify obvious inefficiencies or numerical problems.  It would be a 
> good idea for others to do the same. All feedback/suggestions for 
> improvement are welcome -- especially if accompanied by patches :-)
> * Finalize the contents of MathUtils and StatUtils. Now would be a good 
> time to suggest any additions -- again, ideally with patches --  to 
> these utility classes.
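
On the getRank() item above: once singular values are available, the
"effective numerical rank" is just a count of the singular values that
exceed a tolerance. Here is a minimal sketch of that counting step only.
It assumes the singular values have already been computed by some SVD
routine (which we do not have yet), and the class name, method name and
tolerance rule (max(rows, cols) * machine epsilon * largest singular
value, a common convention) are my assumptions, not existing code.

```java
// Hypothetical sketch: NumericalRankSketch and effectiveRank are
// illustrative names, not part of RealMatrixImpl.
final class NumericalRankSketch {

    /**
     * Counts singular values above a tolerance scaled by the matrix
     * dimensions and the largest singular value (assumed convention).
     */
    static int effectiveRank(double[] singularValues, int rows, int cols) {
        double largest = 0.0;
        for (double s : singularValues) {
            largest = Math.max(largest, s);
        }
        // Math.ulp(1.0) is the double-precision machine epsilon.
        double tolerance = Math.max(rows, cols) * Math.ulp(1.0) * largest;
        int rank = 0;
        for (double s : singularValues) {
            if (s > tolerance) {
                rank++;
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        // Third singular value is far below the tolerance, so it is
        // treated as zero.
        double[] sv = {3.0, 1.0, 1.0e-17};
        System.out.println(effectiveRank(sv, 3, 3)); // prints 2
    }
}
```

The hard part (the decomposition itself) is deliberately left out; this
only shows why SVD makes the rank computation straightforward.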


Two more small items that I forgot to add to the list above:

* Add confidence intervals for the mean.  Originally, I proposed 
nonparametric bootstrap confidence intervals only for the 
StoredUnivariates; but I now think that it would be better to include 
t-based confidence intervals in Univariate.  The interface and 
implementation could be similar to what is implemented as 
getSlopeConfidenceInterval() in BivariateRegression -- 
getMeanConfidenceInterval() returns the half-width of a 95% t-based 
confidence interval for the mean, getMeanConfidenceInterval(alpha) 
returns the 100(1-alpha)% interval half-width.  This can be easily 
implemented using the distribution framework (cf. the 
BivariateRegression example and the t-test methods recently 
added to TestStatisticImpl). Bootstrap confidence intervals could also 
be added to StoredUnivariate, using the sampling methods in 
RandomDataImpl, if someone wants to do this. EmpiricalDistributionImpl 
could also be extended to support generation of bootstrap confidence 
intervals based on the distribution digest without storing the full set 
of values.

* Add double[] |-> double methods in StatUtils to take start and end 
array indexes as parameters and delegate the current "full array" 
versions to these.
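
To make the delegation idea concrete, here is a rough sketch for sum and
mean. The class name, the signatures and the index convention (begin
inclusive, end exclusive) are assumptions for illustration; the actual
StatUtils signatures are still open.

```java
// Hypothetical sketch: StatUtilsRangeSketch is an illustrative name,
// and the [begin, end) convention is an assumption.
final class StatUtilsRangeSketch {

    /** Sums values[begin] .. values[end - 1]. */
    static double sum(double[] values, int begin, int end) {
        if (begin < 0 || end > values.length || begin > end) {
            throw new IllegalArgumentException("invalid index range");
        }
        double total = 0.0;
        for (int i = begin; i < end; i++) {
            total += values[i];
        }
        return total;
    }

    /** "Full array" version delegates to the range version. */
    static double sum(double[] values) {
        return sum(values, 0, values.length);
    }

    /** Mean over [begin, end); NaN for an empty range. */
    static double mean(double[] values, int begin, int end) {
        if (begin == end) {
            // Still run the range check before reporting NaN.
            sum(values, begin, end);
            return Double.NaN;
        }
        return sum(values, begin, end) / (end - begin);
    }

    /** "Full array" version delegates to the range version. */
    static double mean(double[] values) {
        return mean(values, 0, values.length);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0, 4.0};
        System.out.println(sum(x));        // prints 10.0
        System.out.println(sum(x, 1, 3));  // prints 5.0
        System.out.println(mean(x, 1, 3)); // prints 2.5
    }
}
```

Keeping all of the arithmetic in the range versions means the boundary
checks and tests only need to be written once.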

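For the confidence interval item, the half-width is just the t critical
value times the standard error of the mean, s / sqrt(n). A minimal
sketch follows. To avoid guessing at the distribution framework API, the
critical value is passed in by the caller here; in the real method it
would come from the t distribution with n - 1 degrees of freedom. The
class and method names are hypothetical.

```java
// Hypothetical sketch: MeanConfidenceSketch is an illustrative name.
// tCritical is assumed to be the upper (1 - alpha/2) quantile of the
// t distribution with n - 1 degrees of freedom, supplied by the caller.
final class MeanConfidenceSketch {

    /** Half-width of a t-based confidence interval for the mean. */
    static double meanConfidenceHalfWidth(double[] values, double tCritical) {
        int n = values.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        // First pass: sample mean.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        // Second pass: unbiased sample variance (divide by n - 1).
        double sumSq = 0.0;
        for (double v : values) {
            double d = v - mean;
            sumSq += d * d;
        }
        double variance = sumSq / (n - 1);
        // Half-width = t * s / sqrt(n).
        return tCritical * Math.sqrt(variance / n);
    }

    public static void main(String[] args) {
        double[] data = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        // 2.365 is the tabulated 0.975 quantile for 7 degrees of freedom.
        System.out.println(meanConfidenceHalfWidth(data, 2.365));
    }
}
```

The interval itself is then mean +/- half-width, which is why returning
only the half-width is enough.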