 Phil Steitz <phil@steitz.com> wrote:
> > (2) Considerations
> >
> > a.) Is consistent library design important? Can all these models
> > interact effectively? Are all these different design models required? Is
> > there a single design model that can span the entire library?
>
> IMHO, the most important considerations are 1) how easy the library will
> be to navigate and use 2) how maintainable it will be and 3) how well it
> will perform (including resource requirements). For all of these, I
> think that it is best to look at practical application use cases:
> e.g., if someone wants to solve a linear system or estimate a bivariate
> regression model, how easy will it be for them to do that using
> commons-math? How well will the solution perform and scale? How easy
> will it be for us to extend it? Since the library does different kinds
> of things to satisfy some fundamentally different use cases (e.g.
> generating a random sample vs modelling algebraic operations on a real
> matrix) it is natural that multiple design patterns and implementation
> strategies are used. Trying to force all commons-math components into a
> single abstract model is not necessary, IMHO, and would likely make the
> library harder to use and maintain.
I agree with Phil. I started to reply to Mark's original message when he
posted it, but I decided to wait, as I felt I was largely basing my opinion on
a possibly inappropriate context (object oriented numerics in C++ discussions
I've followed over the years that largely center on expression templates). I
do feel pretty strongly that unless consolidating the design of the library
makes it easier for end users to use, I wouldn't make it a priority. I do
support examining the architecture and design so that we don't release nonsense
or paint ourselves into a corner out of stupidity or laziness, of course.
> > b.) Which design strategy is of more interest to the project? Small
> > compact classes used to complete one specific task, where the
> > programming is primarily OOD in nature? or Feature rich interfaces that
> > encapsulate full ranges of functionality, where programming is primarily
> > "procedural" in nature?
>
> Here again, my opinion is that this should be determined by the
> practical problem being addressed.
Does the intent of commons-math not favor the "small classes" approach
somewhat?
> > d.) Should static method utilities be avoided at all costs in both
> > cases? OR are there specific situations where static functions do not
> > create garbage collection considerations and issues (like, when only
> > returning primitives).
>
> I am starting to think that we should avoid static methods, and in fact
> change StatUtils to require instantiation, but this has nothing to do
> with garbage collection, since with a stateless collection of static
> methods, there is no "garbage" to collect, just a class loaded once
> per JVM. As long as there is no state associated with the class, I
> don't see how classloader problems could really come into play (unless
> users were relying on classloader priority to load different versions,
> which is IMHO a bad idea and could apply to any of our classes). The
> real issue here is extensibility. As I think more about the use cases
> for StatUtils, I am less convinced than I was before that the
> "convenience" and "efficiency" of not having to create an instance is
> worth the anxiety about support for extensibility. Therefore, I would
> support changing the methods in StatUtils to be nonstatic.
Does staticness preclude extensibility? I assume finality would, but we
didn't declare any methods final, AFAIK.
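To make the question concrete, here is a minimal sketch of how Java treats
the two cases. The class names are hypothetical and this is not the actual
StatUtils code; it only illustrates that a subclass can redeclare a static
method, but doing so hides the original rather than overriding it, so
existing callers never see the new version.

```java
// Hypothetical classes for illustration only; not the commons-math API.
class StaticVsInstance {

    static class StaticStats {
        static double mean(double[] values) {
            double sum = 0.0;
            for (double v : values) {
                sum += v;
            }
            return sum / values.length;
        }

        // Resolved at compile time: this always calls StaticStats.mean.
        static double describe(double[] values) {
            return mean(values);
        }
    }

    static class CustomStaticStats extends StaticStats {
        // Hides StaticStats.mean; it does not override it.
        static double mean(double[] values) {
            return Double.NaN;
        }
    }

    static class InstanceStats {
        double mean(double[] values) {
            return StaticStats.mean(values);
        }

        // Dispatched at run time on the actual object.
        double describe(double[] values) {
            return mean(values);
        }
    }

    static class CustomInstanceStats extends InstanceStats {
        @Override
        double mean(double[] values) {
            return Double.NaN;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0};
        // The hidden static method is ignored by describe().
        System.out.println(CustomStaticStats.describe(data));
        // The overridden instance method is picked up by describe().
        System.out.println(new CustomInstanceStats().describe(data));
    }
}
```

So staticness does not forbid subclassing, but it does rule out the
polymorphic kind of extension that instance methods allow.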
> > (3.) A couple proposals:
> >
> > (i.) Brent and Pietschmann can you make suggestions/recommendations as
> > to how your "function object" model could be applied to StaticUtil
> > situations? Are you familiar with the Functors project and is there a
> > possibility that they should be considered as the basic design strategy
> > or base of implementation for your "Function Object" design? if they are
> > a Commons project as well is there a possible dependency here we could
> > take advantage of?
>
> My opinion here is that a univariate real function is an object in its
> own right. I suppose that it could extend Functor, but I do not see the
> point in this and I would personally not like having to lose the typing
> and force use and casting to Double to implement
> Object evaluate(Object obj);
After briefly looking at Functor's Javadocs, I feel that its interface goes in
a somewhat different direction from that which a mathematical API should.
While there are commonalities, it seems to me that some of what Functor does is
more easily and naturally expressed using existing mathematical operators
(reminiscent of our discussion of whether, for instance, an isPositive method
would be worth providing in commons-math). If we were writing a library to do
abstract algebra, we might need interfaces more like Functor's.
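A rough sketch of the typing concern Phil raises, for comparison. Both
interfaces below are assumptions written for illustration: the untyped one
copies only the evaluate signature quoted above, and the typed one is a
guess at what a univariate real function interface could look like.

```java
// Hypothetical interfaces for illustration; not the Functor or commons-math API.
class TypedFunctionSketch {

    // Typed: the compiler checks arguments and results.
    interface RealFunction {
        double value(double x);
    }

    // Untyped: mirrors the "Object evaluate(Object obj)" signature.
    interface ObjectFunction {
        Object evaluate(Object obj);
    }

    static final RealFunction SQUARE = x -> x * x;

    static final ObjectFunction SQUARE_OBJ = obj -> {
        // The cast can fail at run time if the caller passes the wrong type.
        Double d = (Double) obj;
        return Double.valueOf(d.doubleValue() * d.doubleValue());
    };

    public static void main(String[] args) {
        double a = SQUARE.value(3.0);
        // The caller must box the argument and cast the result back.
        double b = ((Double) SQUARE_OBJ.evaluate(Double.valueOf(3.0))).doubleValue();
        System.out.println(a == b);
    }
}
```

With the typed version a mistake such as passing a String is a compile
error; with the untyped version it only surfaces as a ClassCastException
when the code runs.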
> It should also be noted that this is a relatively trivial part of what
> is really going on in the analysis package (i.e. rootfinding and spline
> fitting).
Good point. A tangent on terminology: I would be more comfortable if we made
sure to refer to what we currently have in the library as interpolation. Curve
fitting is a somewhat different problem from interpolation, most grossly seen
in the fact that a fitted curve need not (and usually will not) pass through
all the input data points, whereas by definition an interpolating curve must.
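A small sketch of the distinction, using piecewise linear interpolation and
an ordinary least-squares line as stand-ins. Neither is meant to represent
the spline code currently in the library, and the method names are made up
for the example.

```java
// Illustrative only; method names are hypothetical.
class InterpolationVsFit {

    // Piecewise linear interpolation. Assumes xs is strictly increasing
    // and x lies within [xs[0], xs[xs.length - 1]].
    static double interpolate(double[] xs, double[] ys, double x) {
        for (int i = 0; i < xs.length - 1; i++) {
            if (x >= xs[i] && x <= xs[i + 1]) {
                double t = (x - xs[i]) / (xs[i + 1] - xs[i]);
                return ys[i] + t * (ys[i + 1] - ys[i]);
            }
        }
        throw new IllegalArgumentException("x outside data range");
    }

    // Least-squares straight line; returns {intercept, slope}.
    static double[] fitLine(double[] xs, double[] ys) {
        int n = xs.length;
        double xMean = 0.0;
        double yMean = 0.0;
        for (int i = 0; i < n; i++) {
            xMean += xs[i] / n;
            yMean += ys[i] / n;
        }
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (xs[i] - xMean) * (ys[i] - yMean);
            sxx += (xs[i] - xMean) * (xs[i] - xMean);
        }
        double slope = sxy / sxx;
        return new double[] {yMean - slope * xMean, slope};
    }
}
```

For the points (0,0), (1,1), (2,4), the interpolant returns 1 at x = 1,
while the fitted line is y = 2x - 1/3 and gives 5/3 there: it misses every
one of the three data points.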
> Another point that we need to keep in mind is that we have a naturally
> layered structure, which will become even more so over time. We should
> be liberal in exposing technical functionality that "most users" will
> not use and the mathematical orientation of the interfaces will
> naturally increase as you go deeper into the infrastructure. For
> example, Brent did the hard work to derive and implement some special
> functions that reside in the special package. These were the key to
> providing the statistical testing/confidence intervals that "more users"
> may use. "Most users" will not use the special functions directly,
> but it is very nice to have them exposed for the mathematical
> programmers who want to exploit their many uses beyond what we have used
> them internally for. Moving up the layers, "most users" will not use the
> ChiSquare distribution directly (which builds on special); but that is
> also very nice to have. Continuing up the call stack leading to the
> stats tests, we come to rootfinding, which more users will use directly
> and finally to the statistical tests and confidence intervals, which
> will likely be used directly quite a bit by people with no understanding
> or interest in either rootfinding or special functions. At each of the
> layers, a different level of mathematical sophistication is expected and
> different kinds of interfaces are "natural".
Well put. I feel the same way about the level of sophistication assumed as one
goes deeper into the core of the library. A pretty unsophisticated user should
be able to get value out of at least some sizable portion of the library, but I
think necessarily we will in the process of building that portion provide
facilities useful to those with more mathematical knowledge (and as you say, we
already have).
> Finally, I think that it is appropriate to in some cases expose what
> amounts to the "same functionality" via multiple different kinds of
> interfaces. For example, to get the mean of a collection of doubles,
> you can now a) use StatUtils if what you have is an array of doubles and
> all you want is the mean b) instantiate a storageless Univariate and
> feed the values in (good for long lists of values that you don't want to
> store in memory) or c) if the numbers whose mean you want happen to be
> exposed as properties of a collection of beans, instantiate a
> BeanListUnivariate and use it to get the mean. I see absolutely nothing
> wrong with this and in fact a lot that is "right" with this: practical
> use cases drive design and the result is flexibility, convenience and
> ease of use.
Yes, I agree what we have in this example is a good case of providing a very
short distance between "what I have" (e.g., a double[]) and "what I want" (the
mean of the values therein), while making a good effort to avoid unnecessary
duplication in the underlying implementation (we seem to talk about that a lot,
anyway <g>).
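As a sketch of the first two routes Phil lists (the bean-based one is left
out), something like the following shows the same mean reached from an
array and from a stream of values. The names are made up for the example
and are not the real StatUtils or Univariate signatures.

```java
// Hypothetical names for illustration; not the commons-math API.
class MeanSketch {

    // "What I have" is a double[]: one static call.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // "What I have" is a long stream of values: nothing is stored
    // beyond a count and the running mean.
    static class RunningMean {
        private long n = 0;
        private double mean = 0.0;

        void addValue(double v) {
            n++;
            mean += (v - mean) / n;
        }

        double getMean() {
            return n == 0 ? Double.NaN : mean;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};
        RunningMean running = new RunningMean();
        for (double v : data) {
            running.addValue(v);
        }
        System.out.println(mean(data));
        System.out.println(running.getMean());
    }
}
```

Both routes give the same answer for the same data; they differ only in
what the caller has to hold in memory.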
Al
=====
Albert Davidson Chou
Get answers to Mac questions at http://www.MacMgrs.org/ .