hadoop-common-user mailing list archives

From Sandy <snickerdoodl...@gmail.com>
Subject Re: wordcount getting slower with more mappers and reducers?
Date Thu, 05 Mar 2009 18:22:22 GMT
Arun,

How can I check the number of slots per tasktracker? Which parameter
controls that?

Thanks,
-SM
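
For readers with the same question: in the Hadoop 0.18 line, the per-tasktracker slot counts are normally set by the properties mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, each defaulting to 2, and overridden in conf/hadoop-site.xml. The sketch below is one way to check what a site file sets. The helper name read_slot_settings and the sample XML are made up for illustration, and the defaults are assumed from hadoop-default.xml rather than read from a live cluster.

```python
import xml.etree.ElementTree as ET

# Assumed defaults, as shipped in hadoop-default.xml for the 0.18 line.
DEFAULT_SLOTS = {
    "mapred.tasktracker.map.tasks.maximum": 2,
    "mapred.tasktracker.reduce.tasks.maximum": 2,
}


def read_slot_settings(site_xml_text):
    """Return the slot settings, applying any overrides found in the site XML.

    Hypothetical helper: it only parses the text it is given and does not
    query a running tasktracker.
    """
    root = ET.fromstring(site_xml_text)
    slots = dict(DEFAULT_SLOTS)
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in slots and value:
            slots[name] = int(value)
    return slots


# Illustrative site file that raises only the map slot count.
EXAMPLE_SITE_XML = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>8</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for name, value in sorted(read_slot_settings(EXAMPLE_SITE_XML).items()):
        print(name, "=", value)
```

To run it against a real installation, pass the contents of conf/hadoop-site.xml instead of the sample string. A changed value should only take effect after the tasktracker is restarted.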

On Thu, Mar 5, 2009 at 12:14 PM, Arun C Murthy <acm@yahoo-inc.com> wrote:

> I assume you have only 2 map and 2 reduce slots per tasktracker, which
> totals 2 maps/reduces for your cluster. This means that with more
> maps/reduces, they are serialized to 2 at a time.
>
> Also, the -m is only a hint to the JobTracker; you might see fewer or more
> maps than the number you specified on the command line.
> The -r, however, is followed faithfully.
>
> Arun
>
>
> On Mar 4, 2009, at 2:46 PM, Sandy wrote:
>
>  Hello all,
>>
>> For the sake of benchmarking, I ran the standard hadoop wordcount example on
>> an input file using 2, 4, and 8 mappers and reducers for my job.
>> In other words, I do:
>>
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 2 -r 2 sample.txt output
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 4 -r 4 sample.txt output2
>> time -p bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -m 8 -r 8 sample.txt output3
>>
>> Strangely enough, this increase in mappers and reducers results in
>> slower running times!
>> - On 2 mappers and reducers it ran for 40 seconds
>> - On 4 mappers and reducers it ran for 60 seconds
>> - On 8 mappers and reducers it ran for 90 seconds!
>>
>> Please note that the "sample.txt" file is identical in each of these runs.
>>
>> I have the following questions:
>> - Shouldn't wordcount get -faster- with additional mappers and reducers,
>> instead of slower?
>> - If it does get faster for other people, why does it become slower for me?
>>  I am running hadoop in pseudo-distributed mode on a single 64-bit Mac Pro
>> with 2 quad-core processors, 16 GB of RAM and 4 1TB HDs.
>>
>> I would greatly appreciate it if someone could explain this behavior to me,
>> and tell me if I'm running this wrong. How can I change my settings (if at
>> all) to get wordcount running faster when I increase the number of maps
>> and reduces?
>>
>> Thanks,
>> -SM
>>
>
>
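
Arun's point about tasks being serialized 2 at a time can be put into numbers with a small sketch. It assumes his guess of 2 slots per task type on a single tasktracker, and it counts only scheduling "waves". It does not predict the 40/60/90 second timings, which also depend on per-task startup cost and other overheads not measured in this thread. The function name task_waves is invented for illustration.

```python
import math


def task_waves(num_tasks, slots):
    """Number of sequential waves needed to run num_tasks with `slots` running at once."""
    return math.ceil(num_tasks / slots)


if __name__ == "__main__":
    SLOTS = 2  # assumed: 2 map slots and 2 reduce slots, per Arun's guess
    for requested in (2, 4, 8):
        waves = task_waves(requested, SLOTS)
        print(f"{requested} tasks on {SLOTS} slots -> {waves} wave(s)")
```

Under that assumption, 2 tasks fit in one wave, 4 tasks need two, and 8 tasks need four. Since the input file is the same in every run, asking for more tasks splits the same work into more pieces that still run only two at a time.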
