commons-dev mailing list archives

From "Phil Steitz" <p...@steitz.com>
Subject Re: [math] Greetings from a newcomer (but not to numerics)
Date Sun, 25 May 2003 22:14:54 GMT
Al Chou wrote:
> --- Phil Steitz <phil@steitz.com> wrote:
> 
>>Al Chou wrote:
>>
>>>Greetings to the commons-math community!  I emailed Robert Donkin privately
>>>a few days ago asking, as I take it others have, about the reasoning behind
>>>creating a numerics library from scratch rather than incorporating an
>>>existing one.  I think I understand that reasoning now, but despite not
>>>wanting to dampen anyone's enthusiasm, I do want to caution those who have
>>>not done a lot of numerical computing that it can very easily be done wrong.
>>
>>No question about that.  As I have stated a couple of times, however, I 
>>do not personally see commons-math as a numerics library.  There will 
>>certainly be numerical considerations to worry about, though, and your 
>>point is well taken.
> 
> 
> I'll have to catch up on your posts, Phil, as I don't immediately understand
> how a math library can be not a numerics library.
> 

Of course, what commons-math becomes will be what the contributors 
choose to make it; but I don't think that it *has* to evolve into a 
primarily numerics package or to aspire to a broad numerics scope. The 
random data stuff, for example, is not very numerically intensive, nor 
is most of the basic stats material, nor any combinatorics or other 
discrete math or logic that we might decide to implement. There is no 
question, however, that *some* of the things that we are already doing 
involve numerical methods and we need to be careful about this stuff.

> 
> 
>>>A big reason why there's a lot of legacy code in scientific computing is
>>>that it's hard to get numerical algorithms right, so once you do, there's
>>>great inertia against changing anything (there are of course other big
>>>reasons, such as the fact that many computational scientists are not
>>>actually as computer-savvy as they are science-savvy, so they're not very
>>>facile at creating new code).
>>
>>Whence the wonderful proliferation of Fortran code in Java ;-)
> 
> 
> Remind me sometime to tell you the story my former supervisor told me about an
> "expert" he once worked with whose personal conception of OOP was completely
> orthogonal to the facilities provided for it in Java, which was the
> implementation language they were working in.
> 
> 
> 
>>>On a more positive note, let me recommend _Numerical Recipes_, _Numerical
>>>Methods that [Usually] Work_ (which incidentally presents an alternate
>>>quadratic formula for use in the regime where the traditional one fails),
>>>and _Real Computing Made REAL_ as easy-to-read and down-to-earth references
>>>that show how easy it is for the naive implementation to be woefully
>>>inadequate as well as teach how to do numerical math correctly.
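
For readers who have not met the alternate quadratic formula: one commonly
cited form first computes q = -(b + sign(b) * sqrt(b^2 - 4ac)) / 2 and then
takes the roots as q/a and c/q, so that two nearly equal quantities are never
subtracted. The following is only a minimal Java sketch of that idea, not
necessarily the presentation in the book; the class and method names are
hypothetical, and it assumes real coefficients, a != 0, and a non-negative
discriminant.

```java
final class StableQuadratic {
    /** Returns the two real roots of a*x^2 + b*x + c = 0 (a != 0, b*b >= 4*a*c). */
    static double[] roots(double a, double b, double c) {
        double disc = b * b - 4.0 * a * c;
        if (a == 0.0 || disc < 0.0) {
            throw new IllegalArgumentException("sketch handles only real roots with a != 0");
        }
        double sqrtDisc = Math.sqrt(disc);
        // Add two quantities of the same sign, so no cancellation can occur.
        double q = -0.5 * (b + (b >= 0.0 ? sqrtDisc : -sqrtDisc));
        if (q == 0.0) {
            // Only possible when b == 0 and c == 0: a double root at zero.
            return new double[] {0.0, 0.0};
        }
        return new double[] {q / a, c / q};
    }

    public static void main(String[] args) {
        // x^2 - 1e8*x + 1 = 0 has one root near 1e8 and one near 1e-8.
        double[] r = roots(1.0, -1.0e8, 1.0);
        System.out.println(r[0] + " " + r[1]);
    }
}
```

The textbook formula would compute the small root as the difference of two
numbers that agree to about sixteen digits, which is the regime where it
fails.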
>>
>>No question that the first of these at least and Knuth are classics to 
>>refer to. I am not familiar with the second two -- thanks for the tip. I 
> 
> 
> In the preface to the first editions of _Numerical Recipes_, the authors
> acknowledge the influence of Forman Acton's _Numerical Methods that [Usually]
> Work_ on their presentation style.  I discovered the first edition of it by
> accident when I was looking in the library for a copy of _NR_, which they were
> all out of at the time, much to my benefit.  It was reissued in 1990, and he
> followed up with _Real Computing..._, essentially a practitioner's coursebook
> in avoiding errors before they occur, in the late '90s.  Amazon has excerpts
> at
> http://www.amazon.com/exec/obidos/tg/detail/-/0691036632/104-7924268-4227944?vi=glance
> 
> 
> 
>>also refer to Burden and Faires and Atkinson's _Numerical Analysis_ 
>>texts.  Do you know of any decent web numerical analysis references?  I 
>>have not been able to find much.  It would be nice to have some of these 
>>as well so we could refer to them in the docs and discussion.  In any 
>>case, it is probably a good idea to add a bibliography section to the 
>>web site.
> 
> 
> Not general references (I confess I haven't read a lot of them -- they're often
> hard reading and not very helpful in terms of actually implementing anything,
> e.g., Bulirsch and Stoer); we should make a point of looking for some. 
> _Numerical Recipes_ is available online at http://www.nr.com/ .  Herbert Wilf
> has made several of his books available for download, though they're
> sufficiently advanced and specialized (though quite readable, from my skimming)
> that one probably wouldn't need to dive into them right away.  The
> sci.math.num-analysis FAQ is at
> http://www.mathcom.com/corpdir/techinfo.mdir/index.html and lists books (not
> necessarily online) at
> http://www.mathcom.com/corpdir/techinfo.mdir/scifaq/q165.html#q165, including
> this review of Acton's first book:
> 
> [Daniel Pick] This book is almost worth its price just for the cathartic
> interlude in the middle of the book on what not to compute.  You should require
> your students to read it, learn it, live it.  You may find just giving them the
> railroad problem found at the beginning of the book a worthwhile exercise.
> [Bill Frensley] Amen, brother! The only complaint that I have about Acton's
> interlude is that after demolishing the notion of "fitting exponential data,"
> he fails to point out that this is the inverse Laplace transform problem. 
> Perhaps if everyone read this and made the connection, we would be spared the
> monthly "is there any good algorithm for the inverse Laplace transform?"
> 
> For the curious, the railroad rail problem is paraphrased by
> http://krietz.org/AC/Spring03/MAT182/Calc1.pdf as follows:
> 
> "This is a famous problem in numerical analysis adapted from Numerical Methods
> That . . . Work, by Forman Acton. You don't need to know any numerical analysis
> to solve it. I give you all required information.
> 
> "A straight railroad rail 1 mile = 5280 feet long is firmly fixed at both ends
> along a flat piece of ground. During the next day, the rail heats up and
> expands by one additional foot, to 5281 feet long. The ends of the rail are
> still fixed at 5280 feet apart (since the ground is not going to stretch the
> same way), so this causes the rail to bow up into an arc of a circle.
> 
> "What is the maximum height of the rail above its former position?"
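
A rough way to check an answer to the rail problem numerically: with half
angle t and radius R, the arc length is 2*R*t and the chord is 2*R*sin(t), so
sin(t)/t = chord/arc, and the height is R*(1 - cos(t)). The Java sketch below
(class and method names are hypothetical) finds t by bisection and evaluates
the height as 2*R*sin^2(t/2) to avoid subtracting two nearly equal numbers.
It is one possible approach under these assumptions, not the solution given
in the book.

```java
final class RailProblem {
    /** Height of a circular arc of length arc above its chord (requires arc > chord > 0). */
    static double height(double chord, double arc) {
        if (!(chord > 0.0 && arc > chord)) {
            throw new IllegalArgumentException("need arc > chord > 0");
        }
        double ratio = chord / arc;       // equals sin(t)/t for the half angle t
        double lo = 1.0e-9;               // sin(t)/t is essentially 1 here
        double hi = Math.PI;              // sin(t)/t is essentially 0 here
        for (int i = 0; i < 200; i++) {   // bisection; sin(t)/t decreases on (0, pi]
            double mid = 0.5 * (lo + hi);
            if (Math.sin(mid) / mid > ratio) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        double t = 0.5 * (lo + hi);
        double radius = 0.5 * arc / t;
        double s = Math.sin(0.5 * t);
        return 2.0 * radius * s * s;      // R*(1 - cos t) without cancellation
    }

    public static void main(String[] args) {
        System.out.println(height(5280.0, 5281.0));
    }
}
```

For the 5280/5281 foot rail this comes out at roughly 44.5 feet, which is
usually far more than people guess.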
> 
> 
> 
>>>I'd like to participate in commons-math in an advisory capacity of some
>>>sort, as I can't in good faith commit to being able to contribute code to
>>>the project.  Robert indicated that such a role would be useful to the
>>>project, so I hope you all feel the same!
>>
>>Please, please, please!  Whatever time you have to review 
>>implementations, comment on strategies, and patiently point out our 
>>numerical blunders will be greatly appreciated.
> 
> 
> Thanks!  Having been a computational physicist and not a numerical
> mathematician, I've been a user rather than a producer of numerical algorithms
> (though computational science makes one a producer of programs that always seem
> to require work to make existing code play together correctly), so I'm heavy on
> practice and light on theory, though I audited every numerical computing class
> offered, both undergrad and grad, when I was at UCLA.  So, I won't be of much
> use proving things, but I hope my experience in evaluating algorithms'
> appropriateness for specific problems based on careful reading and thought will
> help out the project.
> 

I am certain that it will. I also have a fair amount of applied 
experience, but most of it in other languages (probably obvious from my 
code -- he he) and a few years back (like my mathematical career). This 
will be interesting.

> 
> 
>>Your frank assessment of project direction and the whole question of 
>>whether or not we should implement the things on the task list
>>(http://jakarta.apache.org/commons/sandbox/math/tasks.html) would also 
>>be appreciated.
>>
>>Phil
>>
>>>Al
>>
> 
> I'll briefly comment on these now, prefacing with my initials [ADC].  I'm not
> at all strong in probability and statistics, so I won't have much to say about
> those components unless I read up fast.
> 
> * Add quantiles (1,5,10,25,50,75,90,95,99) to all Univariate implementations
> and bootstrap confidence intervals to StoredUnivariate implementations.
> 
> * Add higher order moments to StoredUnivariate (UnivariateImpl if possible).
> 
> * t-test statistic needs to be added and we should probably add the capability
> of actually performing t- and chi-square tests at fixed significance levels
> (.1, .05, .01, .001).
> 
> * numerical approximation of the t- and chi-square distributions to enable
> user-supplied significance levels.
> 
> * The RealMatrixImpl class is missing some key method implementations. The
> critical thing is inversion.  We need to implement a numerically sound
> inversion algorithm.  This will enable solve() and also support general linear
> regression.
> [ADC] Solution of linear systems is essentially never done via matrix
> inversion.  Commonly an LU (lower-upper triangular) decomposition is performed,
> followed by an iteration through the variables, backsolving for each one in
> terms of the preceding ones (the first one being trivially solved as x_i = v_i
> via the LU decomposition, where v_i is a known value).  However, for those
> (rare, if I remember correctly) occasions when a matrix inverse itself is
> actually needed, there are good and bad ways -- the least efficient being the
> recursive method taught in undergraduate linear algebra.
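
To make the LU remark concrete, here is a minimal Java sketch that solves
A x = b by LU factorization with partial pivoting followed by back
substitution, without ever forming the inverse. The class and method names
are hypothetical and this is not the RealMatrixImpl API; it assumes a square,
nonsingular matrix stored as double[][].

```java
final class LuSolve {
    /** Solves A x = b via LU with partial pivoting; a and b are left unmodified. */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[][] lu = new double[n][];
        for (int i = 0; i < n; i++) {
            lu[i] = a[i].clone();
        }
        double[] x = b.clone();
        for (int k = 0; k < n; k++) {
            // Choose the largest pivot in column k to limit error growth.
            int p = k;
            for (int i = k + 1; i < n; i++) {
                if (Math.abs(lu[i][k]) > Math.abs(lu[p][k])) {
                    p = i;
                }
            }
            if (lu[p][k] == 0.0) {
                throw new ArithmeticException("matrix is singular");
            }
            double[] tmpRow = lu[k]; lu[k] = lu[p]; lu[p] = tmpRow;
            double tmp = x[k]; x[k] = x[p]; x[p] = tmp;
            for (int i = k + 1; i < n; i++) {
                double m = lu[i][k] / lu[k][k];
                lu[i][k] = m;                 // multiplier, i.e. an entry of L
                for (int j = k + 1; j < n; j++) {
                    lu[i][j] -= m * lu[k][j];
                }
                x[i] -= m * x[k];             // forward substitution done on the fly
            }
        }
        // Back substitution with the upper triangle U.
        for (int i = n - 1; i >= 0; i--) {
            for (int j = i + 1; j < n; j++) {
                x[i] -= lu[i][j] * x[j];
            }
            x[i] /= lu[i][i];
        }
        return x;
    }
}
```

A real implementation would normally keep the factorization and the pivot
order so that further right-hand sides can be solved without refactoring.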
> 
> * ComplexNumber interface and implementation.  The only tricky thing here is
> making division numerically sound and what extended value topology to adopt.
> [ADC] I should have something to say here, having had some exposure to special
> function evaluation, but sadly will have to brush up a lot before having
> anything to contribute.
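
On the division point, one well-known remedy (usually attributed to Smith) is
to scale by the larger of the divisor's two components, so that c*c + d*d is
never formed and cannot overflow or underflow prematurely. The sketch below
is only an illustration under that assumption; the class name, the method
name and the {re, im} array return type are hypothetical, and it does not
address the question of which extended value topology to adopt.

```java
final class ComplexDivision {
    /** Returns {re, im} of (a + ib) / (c + id) using scaled (Smith-style) division. */
    static double[] divide(double a, double b, double c, double d) {
        if (c == 0.0 && d == 0.0) {
            throw new ArithmeticException("division by complex zero");
        }
        if (Math.abs(c) >= Math.abs(d)) {
            double r = d / c;              // |r| <= 1
            double den = c + d * r;
            return new double[] {(a + b * r) / den, (b - a * r) / den};
        } else {
            double r = c / d;              // |r| < 1
            double den = c * r + d;
            return new double[] {(a * r + b) / den, (b * r - a) / den};
        }
    }

    public static void main(String[] args) {
        // The naive formula overflows in c*c + d*d here; the scaled one does not.
        double[] q = divide(1.0e300, 1.0e300, 1.0e300, 1.0e300);
        System.out.println(q[0] + " + " + q[1] + "i");
    }
}
```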
> 
> * Bivariate Regression, correlation.  This could be done with simple formulas
> manipulating arrays and this is probably what we should aim for in an initial
> release.  Down the road, we should use the RealMatrixImpl solve() to support
> general linear regression.
> 
> * Binomial coefficients.  An "exact" implementation that is limited to what
> can be stored in a long exists.  This should be extended to use BigIntegers
> and potentially to support logarithmic representations.
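
A minimal sketch of what the BigInteger and logarithmic variants might look
like. The class and method names are hypothetical and this is not the
existing implementation. The exact version uses the multiplicative formula,
where every intermediate value is itself a binomial coefficient, so each
division is exact.

```java
import java.math.BigInteger;

final class Binomial {
    /** Exact C(n, k) for 0 <= k <= n. */
    static BigInteger exact(int n, int k) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("need 0 <= k <= n");
        }
        k = Math.min(k, n - k);   // C(n, k) == C(n, n - k)
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i the value is C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Natural logarithm of C(n, k), accumulated as a sum of logs. */
    static double logBinomial(int n, int k) {
        if (k < 0 || k > n) {
            throw new IllegalArgumentException("need 0 <= k <= n");
        }
        k = Math.min(k, n - k);
        double sum = 0.0;
        for (int i = 1; i <= k; i++) {
            sum += Math.log(n - k + i) - Math.log(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(exact(100, 50));   // far too large for a long
        System.out.println(logBinomial(100, 50));
    }
}
```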
> 
> * Newton's method for finding roots
> [ADC] It would also be good to provide the contextual infrastructure
> (bracketing a root) as well as methods that don't rely on being able to compute
> the derivative of the function.  Someday I would also like to re-create a
> complex equation solver based on inverse quadratic interpolation (sadly, I
> can't remember the original author's name) I once had in my toolkit.

Yes.  One thing that I am struggling with here is how to do something 
simple and useful without function pointers.  How should we think about 
the interface for rootfinding (however we do it) in Java?
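
One possible shape for this, offered only as a sketch: let a one-method
interface stand in for the function pointer, and have each solver take an
instance of it plus a bracketing interval. All names below (RealFunction,
Bisection, solve) are hypothetical. Bisection is used because it needs only
function values and a bracket; a Newton solver could accept a second
RealFunction for the derivative.

```java
/** Hypothetical stand-in for a function pointer: a real function of one variable. */
interface RealFunction {
    double value(double x);
}

final class Bisection {
    /** Finds a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign. */
    static double solve(RealFunction f, double lo, double hi, double tol) {
        double fLo = f.value(lo);
        double fHi = f.value(hi);
        if (fLo == 0.0) return lo;
        if (fHi == 0.0) return hi;
        if ((fLo < 0.0) == (fHi < 0.0)) {
            throw new IllegalArgumentException("root is not bracketed");
        }
        // The iteration cap guards against a tolerance below machine resolution.
        for (int i = 0; i < 200 && hi - lo > tol; i++) {
            double mid = lo + 0.5 * (hi - lo);
            double fMid = f.value(mid);
            if (fMid == 0.0) return mid;
            if ((fMid < 0.0) == (fLo < 0.0)) {
                lo = mid;       // root lies in the upper half
                fLo = fMid;
            } else {
                hi = mid;       // root lies in the lower half
            }
        }
        return lo + 0.5 * (hi - lo);
    }

    public static void main(String[] args) {
        // An anonymous class plays the role of the function pointer.
        RealFunction f = new RealFunction() {
            public double value(double x) {
                return x * x - 2.0;
            }
        };
        System.out.println(solve(f, 0.0, 2.0, 1.0e-12));
    }
}
```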

> 
> * Exponential growth and decay
> [ADC] Not sure what is supposed to be provided here.

Just simple computations to support mostly financial applications.

> 
> * Polynomial Interpolation (curve fitting)
> [ADC] Rational function interpolation and cubic spline, too.

Yes.  Here again, step 0 is how to think about the interface/domain in Java.

> 
> * Sampling from Collections
> [ADC] Probably brings up the thorny topic of (good) random number generation,
> about which Hamming said, "The generation of random numbers is too important to
> be left to chance."

This is a place where we could spend *lots* of time, recruit some number 
theorists and end up with something marginally better than what ships 
with the JDK. If someone wants to do this, great, but I think we can 
probably assemble some useful stuff just leveraging the JDK generators 
(a la RandomData).  Just writing the relatively trivial code to generate 
a random sub-collection from a collection using the JDK PRNG would 
probably be useful to a lot of applications. It could be, however, that 
this really belongs in commons-collections.
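
As a sketch of how little code this needs: copy the collection into a list,
run the first k steps of a Fisher-Yates shuffle using java.util.Random, and
return the first k elements. The class and method names are hypothetical,
and the quality of the sample is only as good as the generator passed in.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Random;

final class CollectionSampler {
    /** Returns k distinct elements of source, chosen uniformly without replacement. */
    static <T> List<T> sample(Collection<T> source, int k, Random rng) {
        if (k < 0 || k > source.size()) {
            throw new IllegalArgumentException("need 0 <= k <= source.size()");
        }
        List<T> pool = new ArrayList<>(source);   // the source is not modified
        for (int i = 0; i < k; i++) {
            // Swap a uniformly chosen element of the unpicked tail into slot i.
            int j = i + rng.nextInt(pool.size() - i);
            T tmp = pool.get(i);
            pool.set(i, pool.get(j));
            pool.set(j, tmp);
        }
        return new ArrayList<>(pool.subList(0, k));
    }

    public static void main(String[] args) {
        List<String> names = List.of("a", "b", "c", "d", "e");
        System.out.println(sample(names, 2, new Random()));
    }
}
```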

> 
> * Addition of an Arithmetic-Geometric Mean
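
For reference, the arithmetic-geometric mean of two positive numbers is the
common limit obtained by repeatedly replacing the pair (a, b) with
((a + b)/2, sqrt(a*b)). A minimal Java sketch follows; the class and method
names are hypothetical. It assumes positive inputs and makes no attempt to
guard against overflow in a*b.

```java
final class ArithmeticGeometricMean {
    /** Arithmetic-geometric mean of two positive numbers. */
    static double agm(double a, double b) {
        if (!(a > 0.0 && b > 0.0)) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        // Convergence is quadratic, so a handful of iterations is enough;
        // the cap is only a safety net.
        for (int i = 0; i < 100 && Math.abs(a - b) > 1.0e-15 * Math.max(a, b); i++) {
            double arith = 0.5 * (a + b);
            double geom = Math.sqrt(a * b);
            a = arith;
            b = geom;
        }
        return 0.5 * (a + b);
    }

    public static void main(String[] args) {
        System.out.println(agm(24.0, 6.0));
    }
}
```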
> 
> 
> 
> Al
> 
> =====
> Albert Davidson Chou
> 
>     Get answers to Mac questions at http://www.Mac-Mgrs.org/ .
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
> 



