commons-dev mailing list archives

From Ole Ersoy <>
Subject Re: [math] threading redux
Date Sat, 18 Apr 2015 03:05:57 GMT
This is a pretty good read as well:

A concern in earlier discussions focused on controlling the number of threads that the job
consumes.  There's one example of using a custom thread pool for that.  Users could pass the
number of threads as a method argument or constructor parameter (or just default to
Runtime.getRuntime().availableProcessors()).  At the end there's also an example of using a
CompletableFuture, which seems in line with the Future-like API that James is suggesting.
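
To make the thread-count point concrete, here is a minimal sketch.  It is not taken from the
linked article or from [math]; the class BoundedParallelSum and its method names are
hypothetical.  It assumes a fixed-size pool is an acceptable way to bound the threads, takes
the thread count as a method argument (defaulting to availableProcessors()), and hands each
chunk to CompletableFuture.supplyAsync:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration only; not part of Commons Math.
final class BoundedParallelSum {

    /** Sums the squares of data using at most the given number of worker threads. */
    static double sumOfSquares(double[] data, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        // The fixed-size pool is what bounds the number of threads the job consumes.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Ceiling division, so there are at most 'threads' chunks.
            int chunk = Math.max(1, (data.length + threads - 1) / threads);
            List<CompletableFuture<Double>> parts = new ArrayList<>();
            for (int start = 0; start < data.length; start += chunk) {
                final int from = start;
                final int to = Math.min(data.length, start + chunk);
                // Each chunk only reads the shared array and writes to a local variable.
                parts.add(CompletableFuture.supplyAsync(() -> {
                    double s = 0.0;
                    for (int i = from; i < to; i++) {
                        s += data[i] * data[i];
                    }
                    return s;
                }, pool));
            }
            // Combine the partial results on the calling thread.
            double total = 0.0;
            for (CompletableFuture<Double> part : parts) {
                total += part.join();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    /** Same computation, defaulting the thread count to the number of available processors. */
    static double sumOfSquares(double[] data) {
        return sumOfSquares(data, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};
        System.out.println(sumOfSquares(data, 2)); // prints 30.0
    }
}
```

Creating and shutting down a pool per call keeps the sketch short; a library would more
likely let the caller supply the ExecutorService so that it can be shared and reused.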

- Ole

BTW - Personally I find working with concurrency in Java 8 simple and refreshing.  You do
have to make sure that you use thread-safe classes inside loops that are parallel, but the
rest is straightforward (knock on wood :) ).
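
As a small illustration of the thread-safety caveat, here is a sketch that assumes a parallel
stream is the "parallel loop" in question.  ParallelHistogram and countByRemainder are
hypothetical names.  The shared state is a ConcurrentHashMap of LongAdder counters; a plain
HashMap with Integer values in the same loop could lose updates or corrupt the map:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

// Hypothetical example for illustration only; not part of Commons Math.
final class ParallelHistogram {

    /** Counts how many integers in [0, n) fall into each remainder class modulo buckets. */
    static ConcurrentHashMap<Integer, LongAdder> countByRemainder(int n, int buckets) {
        ConcurrentHashMap<Integer, LongAdder> counts = new ConcurrentHashMap<>();
        IntStream.range(0, n).parallel().forEach(i ->
            // computeIfAbsent is atomic on ConcurrentHashMap, and LongAdder.increment()
            // is safe to call from many threads at once.
            counts.computeIfAbsent(i % buckets, k -> new LongAdder()).increment());
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countByRemainder(1000, 4).get(0).sum()); // prints 250
    }
}
```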

On 04/17/2015 07:20 PM, Gary Gregory wrote:
> I thought I'd share this read with you guys:
> I'm not sure how closely these problems relate to what [math] is trying
> to do, but it's an interesting read.
> Gary
> On Fri, Apr 17, 2015 at 9:01 AM, Gilles <>
> wrote:
>> On Fri, 17 Apr 2015 08:35:42 -0700, Phil Steitz wrote:
>>> On 4/17/15 3:14 AM, Gilles wrote:
>>>> Hello.
>>>> On Thu, 16 Apr 2015 17:06:21 -0500, James Carman wrote:
>>>>> Consider me poked!
>>>>> So, the Java answer to "how do I run things in multiple threads"
>>>>> is to
>>>>> use an Executor (java.util.concurrent).  This doesn't necessarily mean that you
>>>>> *have* to use a separate thread (the implementation could execute
>>>>> inline).  However, in order to accommodate the separate thread case,
>>>>> you would need to code to a Future-like API.  Now, I'm not saying to
>>>>> use Executors directly, but I'd provide some abstraction layer above
>>>>> them or in lieu of them, something like:
>>>>> public interface ExecutorThingy {
>>>>>    // T is declared per call; Callable<T> stands in for the task type
>>>>>    <T> Future<T> execute(Callable<T> fn);
>>>>> }
>>>>> One could imagine implementing different ExecutorThingy
>>>>> implementations which allow you to parallelize things in different
>>>>> ways (simple threads, JMS, Akka, etc, etc.)
>>>> I did not understand what is being suggested: parallelization of a
>>>> single algorithm or concurrent calls to multiple instances of an
>>>> algorithm?
>>> Really both.  It's probably best to look at some concrete examples.
>> Certainly...
>>> The two I mentioned in my ApacheCon talk are:
>>> 1.  Threads managed by some external process / application gathering
>>> statistics to be aggregated.
>>> 2.  Allowing multiple threads to concurrently execute GA
>>> transformations within the GeneticAlgorithm "evolve" method.
>> I could not view the presentation from the link previously mentioned
>> (it did not work with my browser...).
>> Can I download the PDF file from somewhere?
>>> It would be instructive to think about how to handle both of these
>>> use cases using something like what James is suggesting.  What is
>>> nice about his idea is that it could give us a way to let users /
>>> systems decide whether they want to have [math] algorithms spawn
>>> threads to execute concurrently or to allow an external execution
>>> framework to handle task distribution across threads.
>> Some (all?) cases of "external" parallelism are trivial for the CM
>> developers: the user must chop his data, pass the chunks as arguments
>> to the CM methods, then collect and reassemble the results, all by
>> himself.
>> IIUC the scenario, this cannot be deemed a "feature".
>>> Since 2. above is a good example of "internal" parallelism and it
>>> also has data sharing / transfer challenges, maybe it's best to start
>>> with that one.
>> That's the scenario where usage is simple and performance can match
>> the user's machine capability when running CM algorithms that are
>> inherently parallel.
>> There is an example in CM: see
>>    testTravellerSalesmanSquareTourParallelSolver()
>> in
>>> I have just started thinking about this and would
>>> love to get better ideas than my own hacking about how to do it
>>> a) Using Spark with RDD's to maintain population state data
>>> b) Hadoop with HDFS (or something else?)
>> I have zero experience with this but I'm interested to know more. :-)
>> Regards,
>> Gilles
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
