hadoop-mapreduce-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arun C Murthy <...@hortonworks.com>
Subject Re: Performance of mappers
Date Fri, 05 Aug 2011 20:51:20 GMT

On Aug 5, 2011, at 12:31 PM, Iman E wrote:
> 
> 
> The task tracker logs does not show any problem. These are the log entries for a task
attempt that is too slow
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask):
attempt_201108041814_0035_m_000000_0 task's state:UNASSIGNED
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch :
attempt_201108041814_0035_m_000000_0
> 2011-08-05 14:28:01,644 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current
free slots : 2 and trying to launch attempt_201108041814_0035_m_000000_0
> 2011-08-05 14:28:03,097 INFO org.apache.hadoop.mapred.TaskTracker: JVM with ID: jvm_201108041814_0035_m_1371719584
given task: attempt_201108041814_0035_m_000000_0

Are these all the logs in the TT for that duration or just those relevant to the task.

IAC, you should first try with -Djava.net.preferIPv4Stack=true. I've seen cases where this
causes the child map (or reduce) task to spend a long time trying to establish connection
to the TaskTracker on 127.0.0.1.

If that doesn't help, you should move to 0.20.203 which has another set of fixes for task
launches. Moving to the current stable release is a good idea nevertheless! :)

Arun

> 2011-08-05 14:32:52,341 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
0.07052739% 
> 2011-08-05 14:32:55,398 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
0.123025686% 
> 2011-08-05 14:32:58,402 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
0.16794641% 
> 2011-08-05 14:33:01,419 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
0.41990894% 
> 2011-08-05 14:33:04,804 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
0.8607056% 
> 2011-08-05 14:33:06,617 INFO org.apache.hadoop.mapred.TaskTracker: attempt_201108041814_0035_m_000000_0
1.0% 
> 2011-08-05 14:33:06,625 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_201108041814_0035_m_000000_0
is done.
> 
> Thanks
> Iman
> 
>  
> 
> From: Arun C Murthy <acm@hortonworks.com>
> To: mapreduce-user@hadoop.apache.org; Iman E <hadoop_ami@yahoo.com>
> Sent: Friday, August 5, 2011 2:05 PM
> Subject: Re: Performance of mappers
> 
> Which release of Hadoop are you running?
> 
> What do the logs on the TaskTracker tell you during the time the slow tasks are getting
launched?
> 
> hadoop-0.20.203 has a ton of bug fixes since hadoop-0.20.2 which help fix issues with
slow launches - you might want to upgrade.
> 
> Arun
> 
> On Aug 5, 2011, at 11:02 AM, Iman E wrote:
> 
>> Hello all,
>> I have a question regarding the mappers. I can see from the logs that the start time
of the mapper is different from start time of logging. I am having a problem because that
time difference sometimes is few seconds, but other times it is
>>  
>> For example, one mapper that is supposed to read 65 MB. Its start time is 12:30:53
whereis the logging start time is 12:33:01 and the end time is 12:33:20. All the laoded data
are local to the same rack.
>> In a perfect run, these numbers are as follows: the start time is 18:15:45, logging
start time: 18:15:48, and end time: 18:16:02.
>>  
>>  
>> I am running a job of more than 2400 mapper. Because of the above problem, instead
of the job taking 15-20 mins  on 14 machine ( it happened in few runs), other times it is
taking more than 70 mins. Any suggestions how to fix this problem or what could possibly be
causing it.
>>  
>> Thanks,
>> Iman
> 
> 
> 


Mime
View raw message