commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [Math] Moving on or not?
Date Wed, 06 Feb 2013 15:19:47 GMT
On 2/5/13 6:08 AM, Gilles wrote:
> Hi.
>
> In the thread about "static import", Stephen noted that decisions on a
> component's evolution are dependent on whether the future of the Java
> language is taken into account, or not.
> A question on the same theme also arose after the presentation of Commons
> Math in FOSDEM 2013.
>
> If we assume that efficiency is among the important qualities for Commons
> Math, the future is to allow usage of the tools provided by the standard
> Java library in order to ease the development of multi-threaded algorithms.
>
> Maintaining Java 1.5 source compatibility for the reason that we may need
> to support legacy applications will turn out to be self-defeating:
> 1. New users will not consider Commons Math's features that are notably
>    apt to parallel processing.
> 2. Current users might at some point simply switch to another library if
>    it proves more efficient (because it actually uses multi-threading).
> 3. New Java developers will be turned away because they will want to use
>    the more convenient features of the language in order to provide
>    potential contributions.
>
> If maintaining 1.5 source compatibility is kept as a requirement, the
> consequence is that Commons Math will _become_ a legacy library.
> In that perspective, implementing/improving algorithms for which a
> parallel version is known to be more efficient is plainly a waste of
> development and maintenance time.
>
> In order to mitigate the risks (both of upgrading and of not upgrading
> the source compatibility requirement), I would propose to create a new
> project (say, "Commons Math MT") where we could implement new features[1]
> without being encumbered with the 1.5 requirement.[2]
> The "Commons Math MT" would depend on "Commons Math" where we would
> continue developing single-thread (and thread-safe) "tasks", i.e.
> independent units of processing that could be used in algorithms
> located in "Commons Math MT".
>
> In summary:
> - Commons Math (as usual):
>   * single-thread (sequential) algorithms,
>   * (pure) Java 5,
>   * no dependencies.
> - Commons Math MT:
>   * multi-thread (parallel) algorithms,
>   * Java 7 and beyond,
>   * JNI allowed,
>   * dependencies allowed (jCuda).
>
> What do you think?

There are several other possibilities to consider:

0) Implement multithreading using JDK 1.5 primitives
1) Set things up within [math] to support parallel execution in JDK
1.7, Hadoop or other frameworks
2) Instead of a new project, start a 4.x branch targeting JDK 1.7
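
To make option 0) concrete, here is a minimal sketch of what "JDK 1.5 primitives" could look like for a trivially parallel job (summing an array in chunks). The class name ChunkedSum and its sum() method are made up for illustration and are not part of [math]; the code deliberately sticks to Java 5 syntax (anonymous Callable, no diamond operator).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example class; not part of Commons Math. */
class ChunkedSum {
    /** Sums data by splitting it into at most nChunks tasks run on the given pool. */
    static double sum(final double[] data, int nChunks, ExecutorService pool) {
        List<Future<Double>> parts = new ArrayList<Future<Double>>();
        int chunk = (data.length + nChunks - 1) / nChunks; // ceiling division
        for (int start = 0; start < data.length; start += chunk) {
            final int lo = start;
            final int hi = Math.min(data.length, start + chunk);
            parts.add(pool.submit(new Callable<Double>() {
                public Double call() {
                    double s = 0;
                    for (int i = lo; i < hi; i++) {
                        s += data[i];
                    }
                    return s;
                }
            }));
        }
        double total = 0;
        try {
            for (Future<Double> f : parts) {
                total += f.get(); // blocks until that chunk is done
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(sum(data, 4, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that the pool is passed in by the caller here, so the method itself does not decide how many threads get created.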

I think we should maintain a version that has no dependencies and no
JNI in any case.

Starting a branch and getting concrete about how to parallelize some
algorithms would be a good way to start.  One thing I have not
really investigated and would be interested in details on is what
you actually get in efficiency gain (or loss?) using fork / join vs
just using 1.5+ concurrency for the kinds of problems we would end
up using this stuff for.
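
For comparison, a fork/join version of the same kind of job might look roughly like the sketch below. It needs Java 7 (java.util.concurrent.ForkJoinPool and RecursiveTask). The class name ForkJoinSum and the THRESHOLD value are assumptions picked for illustration, not tuned numbers, and the sketch says nothing about which approach is faster; that is exactly what would need measuring.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical example class; not part of Commons Math. Requires Java 7+. */
class ForkJoinSum extends RecursiveTask<Double> {
    private static final long serialVersionUID = 1L;
    /** Assumed cutoff below which the range is summed sequentially. */
    private static final int THRESHOLD = 10000;

    private final double[] data;
    private final int lo;
    private final int hi;

    ForkJoinSum(double[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Double compute() {
        if (hi - lo <= THRESHOLD) {
            double s = 0;
            for (int i = lo; i < hi; i++) {
                s += data[i];
            }
            return s;
        }
        int mid = (lo + hi) >>> 1;
        ForkJoinSum left = new ForkJoinSum(data, lo, mid);
        ForkJoinSum right = new ForkJoinSum(data, mid, hi);
        left.fork();                    // schedule the left half asynchronously
        double r = right.compute();     // work on the right half in this thread
        return left.join() + r;         // wait for the left half and combine
    }

    static double sum(double[] data, ForkJoinPool pool) {
        return pool.invoke(new ForkJoinSum(data, 0, data.length));
    }

    public static void main(String[] args) {
        double[] data = new double[100000];
        java.util.Arrays.fill(data, 1.0);
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            System.out.println(sum(data, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```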

Thinking about specific parallelization problem instances would also
help decide whether 1) makes sense (i.e., whether it makes sense, as
you mention above, to maintain a single-threaded library that
provides task execution for a multithreaded version or multithreaded
frameworks).
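
One way to picture 1), purely as a sketch under my own assumptions: the library side only builds independent tasks and never creates a thread, and whoever calls it decides where the tasks run. SumTasks, split(), runInCallerThread() and runInPool() are hypothetical names, not existing [math] API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example class; not part of Commons Math. */
class SumTasks {
    /** "Library" side: builds independent units of work, spawns no threads. */
    static List<Callable<Double>> split(final double[] data, int nTasks) {
        List<Callable<Double>> tasks = new ArrayList<Callable<Double>>();
        int chunk = (data.length + nTasks - 1) / nTasks; // ceiling division
        for (int start = 0; start < data.length; start += chunk) {
            final int lo = start;
            final int hi = Math.min(data.length, start + chunk);
            tasks.add(new Callable<Double>() {
                public Double call() {
                    double s = 0;
                    for (int i = lo; i < hi; i++) {
                        s += data[i];
                    }
                    return s;
                }
            });
        }
        return tasks;
    }

    /** Caller side, single-threaded: every task runs in the calling thread. */
    static double runInCallerThread(List<Callable<Double>> tasks) {
        double total = 0;
        try {
            for (Callable<Double> t : tasks) {
                total += t.call();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return total;
    }

    /** Caller side, multithreaded: the same tasks run on a caller-owned pool. */
    static double runInPool(List<Callable<Double>> tasks, ExecutorService pool) {
        double total = 0;
        try {
            for (Future<Double> f : pool.invokeAll(tasks)) {
                total += f.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        double[] data = new double[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        List<Callable<Double>> tasks = split(data, 8);
        System.out.println(runInCallerThread(tasks));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(runInPool(tasks, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether real algorithms decompose this cleanly is the open question; a chunked sum is about the easiest case there is.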

One more thing to consider is that for at least some users of
[math], having the library internally spawn threads and/or peg
multiple processors may not be desirable.  It is a little misleading
to say that multithreading is the way to get "efficiency."  It is
really the way to *use* more compute resources and unless there are
real algorithmic improvements, the overall efficiency may actually
be less, due to task coordination overhead.  What you get is faster
execution due to more greedy utilization of available cores.  Actual
efficiency (how much overall compute resource it takes to complete a
job) partly depends on how efficiently the coordination itself is
done (which JDK 1.7 claims to do very well - I have just not seen
substantiation or any benchmarks demonstrating this) and how the
parallelization affects overall compute requirements.  In any case,
for environments where library thread-spawning is not desirable, I
think we should maintain a single-threaded version.
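
To put the distinction in symbols, using the usual textbook definitions (the symbols are my own notation, not anything from the JDK or [math]): let T_1 be the single-threaded run time, T_p the wall-clock time on p cores, and assume as an idealization that all p cores are busy for the whole run.

```latex
S_p = \frac{T_1}{T_p}, \qquad
E_p = \frac{S_p}{p} = \frac{T_1}{p\,T_p}, \qquad
W_p = p\,T_p = T_1 + T_{\mathrm{overhead}}
```

With made-up numbers for illustration only: if T_1 = 8 s, p = 4 and T_p = 2.5 s, then S_p = 3.2 and E_p = 0.8, and the job costs W_p = 10 core-seconds instead of 8. It finishes sooner but uses more total compute, which is the sense in which "faster" and "more efficient" are different things.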

Phil
>
>
> Best regards,
> Gilles
>
> [1] Also, we would gradually move there the algorithms that would obviously
>     benefit from a multi-thread implementation (e.g. Fourier transform,
>     genetic algorithms, etc.)
> [2] This project would also be a place where people could experiment with
>     "jCuda" (http://www.jcuda.org).
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

