hadoop-common-dev mailing list archives

From Brice Arnould <br...@vleu.net>
Subject Re: Parallelism of sorts
Date Wed, 07 May 2008 07:37:27 GMT


On Mon, 05 May 2008 11:29:00 -0700, Doug Cutting <cutting@apache.org>
wrote:
> Brice Arnould wrote:
>> I was asking myself if it could be a good idea to parallelize some of
>> the algorithms of Hadoop, such as MergeSorter, for the case of a single
>> job run on a multicore system.
> 
> One can already exploit parallelism on a multicore system by using
> "pseudo-distributed" mode and increasing
> mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum.
>
> LocalRunner should also someday be enhanced to run multiple maps and
> reduces in separate threads, which would be more efficient, since
> intermediate data would not need to travel through the loopback network
> interface.  But I don't see an urgent case for making the sort code
> itself multi-threaded, since MapReduce itself performs parallel sorting.

Sorry, I had really misunderstood how it works. Thanks for your
explanation; I'm going to look at LocalJobRunner.
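
To check my understanding of why the sort itself need not be
multi-threaded, here is a toy sketch in plain Java. It is not Hadoop code:
the class PartitionedSortSketch, its sort method, and the range
partitioning are all made up for illustration. Keys are split into
partitions, each partition is sorted by an ordinary single-threaded sort in
its own task, and the parallelism comes only from running several such
tasks at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from Hadoop.
class PartitionedSortSketch {

    // Sorts keys that satisfy 0 <= key < maxKey (maxKey > 0) by splitting
    // them into key ranges and sorting each range in a separate task.
    static List<Integer> sort(List<Integer> input, int partitions, int maxKey) {
        // Partition step: bucket i holds keys in [i * width, (i + 1) * width).
        int width = (maxKey + partitions - 1) / partitions;
        List<List<Integer>> buckets = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int key : input) {
            buckets.get(key / width).add(key);
        }

        // Sort step: one plain, single-threaded sort per bucket, run
        // concurrently on a fixed pool with one thread per partition.
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<Integer> bucket : buckets) {
                pending.add(pool.submit(() -> Collections.sort(bucket)));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for this bucket to finish sorting
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }

        // Buckets cover increasing key ranges, so concatenating them in
        // order gives a fully sorted list.
        List<Integer> result = new ArrayList<>();
        for (List<Integer> bucket : buckets) {
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(42, 7, 99, 0, 63, 7, 18);
        System.out.println(sort(keys, 4, 100));
    }
}
```

The range partitioning is my own simplification so that the buckets can
simply be concatenated; I am not claiming this is how Hadoop partitions
or merges its data.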

Brice


