commons-dev mailing list archives

From Gilles <gil...@harfang.homelinux.org>
Subject Re: [Math] Moving on or not?
Date Wed, 06 Feb 2013 17:03:14 GMT
On Wed, 06 Feb 2013 07:19:47 -0800, Phil Steitz wrote:
> On 2/5/13 6:08 AM, Gilles wrote:
>> Hi.
>>
>> In the thread about "static import", Stephen noted that decisions on a
>> component's evolution are dependent on whether the future of the Java
>> language is taken into account, or not.
>> A question on the same theme also arose after the presentation of Commons
>> Math in FOSDEM 2013.
>>
>> If we assume that efficiency is among the important qualities for Commons
>> Math, the future is to allow usage of the tools provided by the standard
>> Java library in order to ease the development of multi-threaded algorithms.
>>
>> Maintaining Java 1.5 source compatibility for the reason that we may need
>> to support legacy applications will turn out to be self-defeating:
>> 1. New users will not consider Commons Math's features that are notably
>>    apt to parallel processing.
>> 2. Current users might at some point simply switch to another library if
>>    it proves more efficient (because it actually uses multi-threading).
>> 3. New Java developers will be turned away because they will want to use
>>    the more convenient features of the language in order to provide
>>    potential contributions.
>>
>> If maintaining 1.5 source compatibility is kept as a requirement, the
>> consequence is that Commons Math will _become_ a legacy library.
>> In that perspective, implementing/improving algorithms for which a
>> parallel version is known to be more efficient is plainly a waste of
>> development and maintenance time.
>>
>> In order to mitigate the risks (both of upgrading and of not upgrading
>> the source compatibility requirement), I would propose to create a new
>> project (say, "Commons Math MT") where we could implement new features[1]
>> without being encumbered with the 1.5 requirement.[2]
>> The "Commons Math MT" would depend on "Commons Math" where we would
>> continue developing single-thread (and thread-safe) "tasks", i.e.
>> independent units of processing that could be used in algorithms
>> located in "Commons Math MT".
>>
>> In summary:
>> - Commons Math (as usual):
>>   * single-thread (sequential) algorithms,
>>   * (pure) Java 5,
>>   * no dependencies.
>> - Commons Math MT:
>>   * multi-thread (parallel) algorithms,
>>   * Java 7 and beyond,
>>   * JNI allowed,
>>   * dependencies allowed (jCuda).
>>
>> What do you think?
>
> There are several other possibilities to consider:
>
> 0) Implement multithreading using JDK 1.5 primitives
> 1) Set things up within [math] to support parallel execution in JDK
> 1.7, Hadoop or other frameworks
> 2) Instead of a new project, start a 4.x branch targeting JDK 1.7
>
> I think we should maintain a version that has no dependencies and no
> JNI in any case.
>
> Starting a branch and getting concrete about how to parallelize some
> algorithms would be a good way to start.  One thing I have not
> really investigated and would be interested in details on is what
> you actually get in efficiency gain (or loss?) using fork / join vs
> just using 1.5+ concurrency for the kinds of problems we would end
> up using this stuff for.
>
> Thinking about specific parallelization problem instances would also
> help decide whether 1) makes sense (i.e., whether it makes sense as
> you mention above to maintain a single-threaded library that
> provides task execution for a multithreaded version or multithreaded
> frameworks).
>
> One more thing to consider is that for at least some users of
> [math], having the library internally spawn threads and/or peg
> multiple processors may not be desirable.  It is a little misleading
> to say that multithreading is the way to get "efficiency."  It is
> really the way to *use* more compute resources and unless there are
> real algorithmic improvements, the overall efficiency may actually
> be less, due to task coordination overhead.  What you get is faster
> execution due to more greedy utilization of available cores.  Actual
> efficiency (how much overall compute resource it takes to complete a
> job) partly depends on how efficiently the coordination itself is
> done (which JDK 1.7 claims to do very well - I have just not seen
> substantiation or any benchmarks demonstrating this) and how the
> parallelization affects overall compute requirements.  In any case,
> for environments where library thread-spawning is not desirable, I
> think we should maintain a single-threaded version.
>

Unless I missed the point, those reasons are exactly why I propose to
have 2 projects/components. One, "Commons-Math", does not fiddle with
resources, while the other would provide a "parallelizationLevel"
setting for the algorithms written to possibly take advantage of the
Java 5+ "task framework".
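
To make the idea concrete, here is a minimal sketch of what such a setting
could look like. Everything in it is hypothetical: "SumOfSquares",
"sumRange" and the constructor argument are illustration names, not
existing Commons Math API. It uses only Java 5 constructs
(java.util.concurrent, no lambdas), and a level of 1 runs in the calling
thread without creating any thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical algorithm with an opt-in "parallelizationLevel" setting. */
class SumOfSquares {
    private final int parallelizationLevel;

    SumOfSquares(int parallelizationLevel) {
        if (parallelizationLevel < 1) {
            throw new IllegalArgumentException("level must be >= 1");
        }
        this.parallelizationLevel = parallelizationLevel;
    }

    /** Sequential unit of work: sums i*i for i in [from, to). */
    static long sumRange(int from, int to) {
        long s = 0;
        for (int i = from; i < to; i++) {
            s += (long) i * i;
        }
        return s;
    }

    /** Sums i*i for i in [0, n), using at most parallelizationLevel threads. */
    long compute(int n) {
        if (parallelizationLevel == 1) {
            return sumRange(0, n); // caller's thread only, no pool is created
        }
        ExecutorService pool = Executors.newFixedThreadPool(parallelizationLevel);
        try {
            List<Future<Long>> parts = new ArrayList<Future<Long>>();
            int chunk = (n + parallelizationLevel - 1) / parallelizationLevel;
            for (int start = 0; start < n; start += chunk) {
                final int from = start;
                final int to = Math.min(n, start + chunk);
                parts.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        return sumRange(from, to);
                    }
                }));
            }
            long total = 0;
            for (Future<Long> p : parts) {
                total += p.get(); // blocks until that chunk is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The sequential path and the parallel path share the same unit of work
("sumRange"), so a user who never raises the level never sees a thread
being spawned by the library.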

Yes, we could still do well with only Java 5's concurrency features, but
the issue I raise is not only about concurrency; it is also about
evolution/progress/maintenance, all things that require raising interest
from new contributors (unless it's fine that Commons Math be tagged as a
"library of the past"...).

But using concurrency features in "Commons Math" would also contradict
your own point ("we should maintain a single-threaded version"). I agree
with that point, and that's why I proposed this other project...

As for efficiency (or faster execution, if you want), I don't see the
point in doubting that tasks like global search (e.g. in a genetic
algorithm) will complete in less time when run in parallel...

As I summarized previously, having a "Commons Math MT" would bring no
inconvenience, contrary to any of your points 0, 1, or 2. [Inconvenience
not to me, that is, but to people with requirements like
"Java 5 compatible" or "no multi-threading".]
As I indicated, the basic "task" could be defined in "Commons Math" and
"Commons Math MT" would provide the parallelization "glue" (e.g. to divide
the search space of the GA).
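
A rough sketch of that split, under stated assumptions: "SearchTask"
stands in for a sequential, thread-safe task that would live in "Commons
Math", and "SplitSearch" for the glue that would live in "Commons Math
MT". All names are hypothetical, a deterministic grid scan replaces the
GA for simplicity, and the glue uses the Java 7 fork/join classes.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SearchSpaceSplitter {

    /** Stand-in for a sequential, thread-safe "task" (hypothetical interface). */
    interface SearchTask {
        /** Returns the best candidate found in [lo, hi]. */
        double search(double lo, double hi);

        /** Cost of a candidate; lower is better. */
        double cost(double x);
    }

    /** Example task: scans a regular grid for the minimum of (x - 3)^2. */
    static class GridScan implements SearchTask {
        public double search(double lo, double hi) {
            double best = lo;
            for (int i = 1; i <= 100; i++) {
                double x = lo + (hi - lo) * i / 100;
                if (cost(x) < cost(best)) {
                    best = x;
                }
            }
            return best;
        }

        public double cost(double x) {
            return (x - 3.0) * (x - 3.0);
        }
    }

    /** Parallel "glue": halves the interval until it is narrow enough, then delegates. */
    static class SplitSearch extends RecursiveTask<Double> {
        private final SearchTask task;
        private final double lo;
        private final double hi;
        private final double minWidth;

        SplitSearch(SearchTask task, double lo, double hi, double minWidth) {
            this.task = task;
            this.lo = lo;
            this.hi = hi;
            this.minWidth = minWidth;
        }

        @Override
        protected Double compute() {
            if (hi - lo <= minWidth) {
                return task.search(lo, hi); // sequential unit of work
            }
            double mid = 0.5 * (lo + hi);
            SplitSearch left = new SplitSearch(task, lo, mid, minWidth);
            SplitSearch right = new SplitSearch(task, mid, hi, minWidth);
            left.fork();                 // schedule the left half asynchronously
            double r = right.compute();  // work on the right half in this thread
            double l = left.join();      // wait for the left half
            return task.cost(l) <= task.cost(r) ? l : r;
        }
    }

    /** Runs the split search on a pool of the requested size. */
    static double run(SearchTask task, double lo, double hi,
                      double minWidth, int parallelizationLevel) {
        ForkJoinPool pool = new ForkJoinPool(parallelizationLevel);
        try {
            return pool.invoke(new SplitSearch(task, lo, hi, minWidth));
        } finally {
            pool.shutdown();
        }
    }
}
```

Here the task itself has no knowledge of threads; only the glue decides
how the interval is divided and how many workers are used. Whether this
beats a plain Java 5 executor for our kinds of problems would still have
to be measured.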


Gilles



