hadoop-common-user mailing list archives

From Renato Moutinho <renato.mouti...@gmail.com>
Subject Re: Reduce phase of wordcount
Date Sun, 05 Oct 2014 14:53:12 GMT
Hi there,

Thanks a lot for taking the time to answer me! Actually, this "issue"
happens after all the map tasks have completed (I'm looking at the web
interface). I'll try to diagnose whether it's an issue with the number of
threads. I suppose I'll have to change the logging configuration to find
out what's going on.

The only thing that's bothering me is the fact that the lines are repeated
in the log.

Regards,

Renato Moutinho



On 05/10/2014, at 10:52, java8964 <java8964@hotmail.com> wrote:

Don't be misled by the 6.03 MB/s figure.

The relationship between mappers and reducers is M-to-N, which means each
mapper can send its data to every reducer, and each reducer can receive
its input from every mapper.
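
To make the M-to-N relationship concrete, here is a minimal sketch of how
one mapper's keys spread across reducers. It uses the same arithmetic as
Hadoop's default HashPartitioner (hash code masked to non-negative, modulo
the number of reducers), but applies it to String.hashCode() as a stand-in
for the real key type. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class ShuffleFanOut {
    // Same arithmetic as Hadoop's default HashPartitioner, applied here to
    // String.hashCode() as an illustrative stand-in for the job's key type.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Returns the sorted set of reducer indexes that one mapper's keys map to.
    static Set<Integer> reducersHit(List<String> keys, int numReducers) {
        Set<Integer> hit = new TreeSet<>();
        for (String key : keys) {
            hit.add(partitionFor(key, numReducers));
        }
        return hit;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c", "d");
        // Four reducers: this single mapper feeds all of them.
        System.out.println(reducersHit(words, 4)); // prints [0, 1, 2, 3]
        // One reducer: every mapper's whole output goes to reducer 0.
        System.out.println(reducersHit(words, 1)); // prints [0]
    }
}
```

With a single reducer, as in the job discussed here, that one task has to
fetch a partition from every map output.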

There could be many reasons why the reduce copy phase looks too slow. The
mappers may still be running, so there is no data generated yet for the
reducer to copy; or there may not be enough threads in either the mapper
or the reducer to use the remaining CPU/memory/network bandwidth. You can
look up the relevant Hadoop configuration settings and adjust them.

But it is not fair to Hadoop to complain about getting only 6 MB/s in the
log just because you can get 60 MB/s with scp. Your one reducer needs to
copy data from all the mappers concurrently, which makes it impossible to
reach the same speed as a one-to-one network transfer.

The reduce stage is normally longer than the map stage, as the data HAS to
be transferred over the network.

But in the word count example, the amount of data that needs to be
transferred should be very small. You can ask yourself the following
questions:

1) Should I use a combiner in this case? (Yes; for word count, it reduces
the amount of data that needs to be transferred.)
2) Am I using all the reducers I can, if my cluster is underutilized and I
want my job to finish fast?
3) Can I add more threads in the task tracker to help? You need to dig into
your logs to find out whether your mappers or reducers are waiting for a
thread from the thread pool.
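
As a rough illustration of question 1, the sketch below compares how many
records a word count mapper would hand to the shuffle with and without
map-side pre-summing. It is a plain-Java simulation, not Hadoop code: the
class and method names are hypothetical, and it assumes the idealized case
where the combiner sees all of one mapper's output at once (in a real job
it may run on smaller chunks, so the saving can be smaller).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CombinerEffect {
    // Without a combiner, the mapper emits one (word, 1) record per token.
    static int recordsWithoutCombiner(List<String> tokens) {
        return tokens.size();
    }

    // With a combiner, counts are summed on the map side, leaving at most
    // one (word, count) record per distinct word in this idealized case.
    static Map<String, Integer> combine(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("to be or not to be".split(" "));
        System.out.println("without combiner: " + recordsWithoutCombiner(tokens)); // 6
        System.out.println("with combiner:    " + combine(tokens).size());         // 4
    }
}
```

In an actual job the combiner is registered on the job object, for example
through Job.setCombinerClass in the org.apache.hadoop.mapreduce API.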

Yong

------------------------------
Date: Fri, 3 Oct 2014 18:40:16 -0300
Subject: Reduce phase of wordcount
From: renato.moutinho@gmail.com
To: user@hadoop.apache.org

Hi people,

    I'm doing some experiments with Hadoop 1.2.1, running the wordcount
sample on an 8-node cluster (master + 7 slaves). By tuning the task
configuration I've been able to make the map phase run in 22 minutes.
However, the reduce phase (which consists of a single reduce task) gets
stuck at some points, making the whole job take more than 40 minutes.
Looking at the logs, I've seen several lines stuck at copy at different
moments, like this:

2014-10-03 18:26:34,717 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 980
at 6.03 MB/s) >
2014-10-03 18:26:37,736 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 980
at 6.03 MB/s) >
2014-10-03 18:26:40,754 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 980
at 6.03 MB/s) >
2014-10-03 18:26:43,772 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201408281149_0019_r_000000_0 0.3302721% reduce > copy (971 of 980
at 6.03 MB/s) >

Eventually the job ends, but this information, being repeated, makes me
think it's having difficulty transferring the parts from the map nodes. Is
my interpretation correct? The transfer rate is way too slow compared to an
scp file transfer between the hosts (10 times slower). Any ideas why?
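
For reading status lines like the ones quoted above, here is a small
sketch that pulls out the "copied of total" counters. The class name, the
method names and the regular expression are hypothetical helpers, not part
of Hadoop. The progress calculation assumes the copy phase counts for one
third of the reduce task's progress, which is consistent with the logged
value: (971 / 980) / 3 is about 0.3302721.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CopyStatus {
    // Matches the "reduce > copy (971 of 980 at 6.03 MB/s)" part of a line.
    private static final Pattern COPY =
        Pattern.compile("reduce > copy \\((\\d+) of (\\d+) at ([0-9.]+) MB/s\\)");

    // Returns {copied, total}, or null if the line has no copy status.
    static int[] parse(String line) {
        Matcher m = COPY.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    // Assumption: the copy phase is the first third of reduce progress.
    static double reduceProgress(int copied, int total) {
        return (copied / (double) total) / 3.0;
    }

    public static void main(String[] args) {
        String line = "attempt_201408281149_0019_r_000000_0 0.3302721% "
            + "reduce > copy (971 of 980 at 6.03 MB/s) >";
        int[] status = parse(line);
        System.out.println("map outputs still to fetch: " + (status[1] - status[0]));
        System.out.println("reduce progress: " + reduceProgress(status[0], status[1]));
    }
}
```

Read this way, four identical lines about three seconds apart would mean
that no additional map output was fetched during that interval, with 9 of
the 980 still outstanding.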

Regards,

Renato Moutinho
