hadoop-hdfs-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: Reduce phase of wordcount
Date Sun, 05 Oct 2014 21:53:02 GMT
Hi

You indicate that you have just one reducer, which is the default in 
Hadoop 1 but quite insufficient for a cluster with 7 slave nodes.
You should increase mapred.reduce.tasks, use combiners, and maybe tune 
mapred.tasktracker.reduce.tasks.maximum
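For reference, a sketch of what the corresponding mapred-site.xml entries could look like on a Hadoop 1.x cluster. The values below are only illustrative, not a recommendation for your workload (the Hadoop docs suggest roughly 0.95 or 1.75 × (nodes × reduce slots per node) reduce tasks):

```xml
<!-- mapred-site.xml (Hadoop 1.x); values are illustrative only -->
<property>
  <name>mapred.reduce.tasks</name>
  <!-- e.g. ~2 reduce tasks per slave for a 7-slave cluster -->
  <value>14</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <!-- reduce slots per TaskTracker -->
  <value>2</value>
</property>
```

mapred.reduce.tasks can also be set per job on the command line with -D, so you can experiment without editing the cluster config.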

Hope that helps
Ulul

On 05/10/2014 16:53, Renato Moutinho wrote:
> Hi there,
>
>      thanks a lot for taking the time to answer me! Actually, this 
> "issue" happens after all the map tasks have completed (I'm looking at 
> the web interface). I'll try to diagnose whether it's an issue with the 
> number of threads. I suppose I'll have to change the logging 
> configuration to find out what's going on.
>
> The only thing that's getting to me is the fact that the lines are 
> repeated in the log.
>
> Regards,
>
> Renato Moutinho
>
>
>
> On 05/10/2014, at 10:52, java8964 <java8964@hotmail.com> wrote:
>
>> Don't be confused by 6.03 MB/s.
>>
>> The relationship between mappers and reducers is many-to-many: each 
>> mapper can send its data to every reducer, and each reducer can 
>> receive its input from every mapper.
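This routing can be sketched in plain Java. Hadoop's default HashPartitioner assigns each map output key to the reducer numbered `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`; the class and method names below are made up for illustration:

```java
// Plain-Java sketch of Hadoop's default HashPartitioner logic:
// every (key, value) pair a mapper emits is routed to the reducer
// numbered (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks.
public class PartitionDemo {

    // Hypothetical helper mirroring HashPartitioner.getPartition().
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] words = {"the", "quick", "brown", "fox"};
        // With 4 reducers, keys spread across partitions 0..3.
        for (String w : words) {
            System.out.println(w + " -> reducer " + partitionFor(w, 4));
        }
        // With a single reducer (the Hadoop 1 default), every key maps
        // to partition 0, so that one reducer copies from every mapper.
        System.out.println("'fox' with 1 reducer -> " + partitionFor("fox", 1));
    }
}
```

Because the partition depends only on the key, every occurrence of the same word lands on the same reducer, which is what lets the reducer sum the counts.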
>>
>> There could be a lot of reasons why the reduce copy phase looks too 
>> slow. The mappers could still be running, so there is no data for 
>> the reducer to copy yet; or there are not enough threads on either 
>> the mapper or the reducer side to utilize the remaining 
>> cpu/memory/network bandwidth. You can google the relevant hadoop 
>> configuration properties to adjust them.
>>
>> But getting 60 MB/s with scp and then complaining about 6 MB/s in 
>> the log is not fair to hadoop. Your one reducer needs to copy data 
>> from all the mappers concurrently, which makes it impossible to 
>> reach the speed of a one-to-one point network transfer.
>>
>> The reduce stage is normally longer than the map stage, as the data 
>> HAS to be transferred through the network.
>>
>> But in the word count example, the data that needs to be transferred 
>> should be very small. You can ask yourself the following questions:
>>
>> 1) Should I use a combiner in this case? (Yes; for word count, it 
>> reduces the data that needs to be transferred.)
>> 2) Am I using all the reducers I can, if my cluster is under 
>> utilized and I want my job to finish fast?
>> 3) Can I add more threads in the task tracker to help? You need to 
>> dig into your logs to find out whether your mappers or reducers are 
>> waiting for a thread from the thread pool.
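To make question 1 concrete, here is a self-contained sketch (plain Java, not the Hadoop API) of why a combiner shrinks shuffle traffic in word count: it pre-sums counts on the map side, so one (word, partialCount) pair per distinct word is shuffled instead of one (word, 1) pair per occurrence. In a Hadoop 1.x wordcount driver the combiner is typically enabled with a call like conf.setCombinerClass(Reduce.class), reusing the reducer as the combiner.

```java
import java.util.*;

// Plain-Java illustration (not the Hadoop API) of the shuffle savings
// a combiner buys in word count.
public class CombinerDemo {

    // Map-side pre-aggregation: sum counts per distinct word --
    // the same operation a word count combiner performs.
    static Map<String, Integer> combine(List<String> mapOutputKeys) {
        Map<String, Integer> partial = new HashMap<>();
        for (String word : mapOutputKeys) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        List<String> mapOutput =
            Arrays.asList("the", "cat", "the", "dog", "the", "cat");

        // Without a combiner, every occurrence is shuffled as its own
        // (word, 1) pair: 6 pairs here.
        System.out.println("pairs shuffled without combiner: " + mapOutput.size());

        // With a combiner, only one (word, partialCount) pair per
        // distinct word leaves the map task: 3 pairs here.
        Map<String, Integer> combined = combine(mapOutput);
        System.out.println("pairs shuffled with combiner: " + combined.size());
    }
}
```

The savings grow with input size: real text repeats common words millions of times, so the combiner can cut the copy volume by orders of magnitude.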
>>
>> Yong
>>
>> ------------------------------------------------------------------------
>> Date: Fri, 3 Oct 2014 18:40:16 -0300
>> Subject: Reduce phase of wordcount
>> From: renato.moutinho@gmail.com <mailto:renato.moutinho@gmail.com>
>> To: user@hadoop.apache.org <mailto:user@hadoop.apache.org>
>>
>> Hi people,
>>
>>     I'm doing some experiments with hadoop 1.2.1, running the 
>> wordcount sample on an 8-node cluster (master + 7 slaves). Tuning 
>> the task configuration, I've been able to make the map phase run in 
>> 22 minutes. However, the reduce phase (which consists of a single 
>> task) gets stuck at some points, making the whole job take more 
>> than 40 minutes. Looking at the logs, I've seen several lines stuck 
>> at copy at different moments, like this:
>>
>> 2014-10-03 18:26:34,717 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 
>> 980 at 6.03 MB/s) >
>> 2014-10-03 18:26:37,736 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 
>> 980 at 6.03 MB/s) >
>> 2014-10-03 18:26:40,754 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 
>> 980 at 6.03 MB/s) >
>> 2014-10-03 18:26:43,772 INFO org.apache.hadoop.mapred.TaskTracker: 
>> attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 
>> 980 at 6.03 MB/s) >
>>
>> Eventually the job ends, but this repeated information makes me 
>> think it's having difficulty transferring the map outputs from the 
>> map nodes. Is my interpretation correct? The transfer rate is way 
>> too slow compared to scp file transfer between the hosts (10 times 
>> slower). Any takes on why?
>>
>> Regards,
>>
>> Renato Moutinho

