hadoop-common-user mailing list archives

From Arun C Murthy <ar...@yahoo-inc.com>
Subject Re: Reduce Performance
Date Mon, 20 Aug 2007 04:17:33 GMT
On Sun, Aug 19, 2007 at 11:33:35PM +0200, Thorsten Schuett wrote:
>I have been looking into the LocalJobRunner today. Is there a chance for
>official support for parallel map execution/>1 reduce tasks or should I look
>into adding it to my local copy of the code?
>

Please file a request (JIRA), and a patch if you are so inclined! There is nothing *official*
about anything here... make yourself at home.

Usually there isn't much bang for the buck in trying to optimize the single-node performance
of Hadoop's map-reduce, but any contribution is always welcome.

Arun

>Thorsten
>
>On 8/19/07, Thorsten Schuett <schuett@gmail.com> wrote:
>>
>> In my case, it looks as if the loopback device is the bottleneck. So
>> increasing the number of tasks won't help.
>>
>> Thorsten
>>
>> On 8/18/07, Ted Dunning <tdunning@veoh.com> wrote:
>> >
>> >
>> >
>> > You might try increasing the number of map and reduce tasks so that you
>> > can overlap CPU and I/O.  It is common in parallel applications that you
>> > need to do something like this.
>> >
>> >
>> > On 8/18/07 8:36 AM, "Thorsten Schuett" <schuett@gmail.com> wrote:
>> > >>> If my assumptions are correct, would it be possible to
>> > >>> read/access the files directly in the "one-node mode"?
>> > >>
>> > >> Please take a look at LocalJobRunner in src/org/apache/hadoop/mapred ...
>> > >> set the jobtracker in your config to 'local' and this happens automatically.
>> > >> (http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms)
>> > >
>> > >
>> > > When I use "local", I lose the web interface and the multi-threading.
>> > > I can live with the former, but losing the latter is not an option.
>> > >
>> > > Thorsten
>> >
>> >
>>
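
For reference, the "set the jobtracker in your config to 'local'" advice quoted
above would look roughly like the fragment below. This is a minimal sketch, not
taken from the thread: it assumes the hadoop-site.xml override file and the
mapred.job.tracker property name used by Hadoop releases of this period, so
check the hadoop-default.xml shipped with your release before relying on it.

```xml
<!-- conf/hadoop-site.xml (assumed location of the site-specific overrides) -->
<configuration>
  <property>
    <!-- Assumed property name; "local" selects the in-process LocalJobRunner -->
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
</configuration>
```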

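To make the "parallel map execution" request at the top of this message
concrete, here is a rough sketch of the general idea: run one map task per
input split on a fixed-size thread pool, then combine the results. This is NOT
Hadoop's LocalJobRunner and uses no Hadoop classes; the class name
ParallelLocalMapSketch, the methods countWords and runMapsInParallel, and the
word-count workload are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only: this is not Hadoop's LocalJobRunner.
public class ParallelLocalMapSketch {

    // Stand-in for a map task: counts the words in one input split.
    static int countWords(String split) {
        String trimmed = split.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Runs one "map task" per split on a fixed-size thread pool and sums
    // the per-split results, standing in for a single reduce step.
    static int runMapsInParallel(List<String> splits, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String split : splits) {
                // The lambda returns a value, so it is submitted as a Callable.
                futures.add(pool.submit(() -> countWords(split)));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } finally {
            pool.shutdown(); // stop accepting work so the JVM can exit
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> splits = Arrays.asList("a b c", "d e", "f");
        System.out.println(runMapsInParallel(splits, 2)); // prints 6
    }
}
```

Whether more threads help depends on where the time goes: as noted in the
quoted discussion, extra tasks can overlap CPU and I/O, but they will not help
if a single shared resource such as the loopback device is the bottleneck.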