hadoop-common-dev mailing list archives

From achilles852 <faheemk...@gmail.com>
Subject RE: last map task taking too long
Date Wed, 30 Sep 2009 06:03:12 GMT

Hi,
The input is a plain text file. I use the parameters specified in the input
file to launch a process on each machine and then collect the results.
I am not using cached files; everything needed is contained in the job jar
file. Each map task is supposed to finish within one minute.

Here is the output from the reduce phase, where things get stuck. Hadoop is
running in pseudo-distributed mode.

[code]
2009-09-30 06:27:38,601 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 2 map output(s) where 0 is
already in progress
2009-09-30 06:27:38,603 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:27:38,603 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:28:33,623 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
2009-09-30 06:28:33,624 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 1 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:28:33,628 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling
2 bytes (6 raw bytes) into RAM from attempt_200909292242_0017_m_000007_0
2009-09-30 06:28:33,628 INFO org.apache.hadoop.mapred.ReduceTask: Read 2
bytes from map-output for attempt_200909292242_0017_m_000007_0
2009-09-30 06:28:33,629 INFO org.apache.hadoop.mapred.ReduceTask: Rec #1
from attempt_200909292242_0017_m_000007_0 -> (-1, -1) from pc01
2009-09-30 06:28:40,624 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:28:40,625 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:28:40,626 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:29:40,639 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:29:40,640 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:29:40,641 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:30:40,655 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:30:40,657 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:30:40,657 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:31:40,677 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:31:40,679 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:31:40,679 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:32:40,692 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:32:40,693 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:32:40,694 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:33:40,708 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:33:40,710 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:33:40,710 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:34:40,731 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:34:40,733 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:34:40,733 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Scheduled 0 outputs (0 slow hosts and0
dup hosts)
2009-09-30 06:35:40,753 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0 Need another 1 map output(s) where 0 is
already in progress
2009-09-30 06:35:40,755 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
[/code]
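
In the log above, the reducer polls for new map outputs roughly once a
minute, and the last map output arrives about seven minutes after the
previous one. As a minimal sketch, assuming the log is saved as plain text
with the same mail-client line wrapping, the gap between arrivals could be
measured like this. The function names (join_wrapped, arrival_times) are
hypothetical helpers for illustration, not part of Hadoop.

```python
import re
from datetime import datetime

# A record starts with a timestamp such as "2009-09-30 06:28:33,623 ".
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")
GOT = re.compile(r"Got (\d+) new map-outputs")

def join_wrapped(lines):
    """Re-join log records that were wrapped across several lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TS.match(line) or not records:
            records.append(line)          # new record
        else:
            records[-1] += " " + line     # continuation of the previous one
    return records

def arrival_times(records):
    """Timestamps of polls that fetched at least one new map output."""
    times = []
    for rec in records:
        ts = TS.match(rec)
        got = GOT.search(rec)
        if ts and got and int(got.group(1)) > 0:
            times.append(datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S"))
    return times

# Three records copied from the log above (wrapping preserved).
sample = """\
2009-09-30 06:28:33,623 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
2009-09-30 06:28:40,625 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 0 new map-outputs
2009-09-30 06:35:40,755 INFO org.apache.hadoop.mapred.ReduceTask:
attempt_200909292242_0017_r_000000_0: Got 1 new map-outputs
""".splitlines()

times = arrival_times(join_wrapped(sample))
gap_seconds = (times[1] - times[0]).total_seconds()
print(gap_seconds)  # seconds between the two successful fetches
```

This only measures how long the reducer waited. It says nothing about why
the last map output was late.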




Amogh Vasekar-2 wrote:
> 
> Hi,
> Can you provide info on the input, such as compression? Also, are you
> using cached files in your map tasks? It might help if you paste the
> logs here after blanking out your system-specific info, so that one can
> see how far the reduce got, or whether the copy phase started at all.
> 
> Thanks,
> Amogh
> 
> -----Original Message-----
> From: achilles852 [mailto:faheemkhan@gmail.com] 
> Sent: Wednesday, September 30, 2009 6:38 AM
> To: core-dev@hadoop.apache.org
> Subject: Re: last map task taking too long
> 
> 
> Basically, it finishes what it is supposed to do (I view the logs to find
> out), but does not move on to the reduce stage.
> 
> 
> Ted Dunning wrote:
>> 
>> Is that last map task actually running, or is it pending?
>> 
>> On Tue, Sep 29, 2009 at 5:57 PM, achilles852 <faheemkhan@gmail.com>
>> wrote:
>> 
>>>
>>> Hey, I am trying to write a small MapReduce program. I launch a few map
>>> tasks, each of which should complete within a certain time (say 5
>>> minutes). All the tasks complete within 5 minutes except the last one,
>>> which takes around 10 times as long as the other map tasks. Any idea
>>> why this is happening?
>>>
>>> I am using Hadoop version 0.19.2, tried running it locally as well as on
>>> EC2.
>>> --
>>> View this message in context:
>>> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673359.html
>>> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
>>>
>>>
>> 
>> 
>> -- 
>> Ted Dunning, CTO
>> DeepDyve
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25673431.html
> Sent from the Hadoop core-dev mailing list archive at Nabble.com.
> 
> 
> 
-- 
View this message in context: http://www.nabble.com/last-map-task-taking-too-long-tp25673359p25675439.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.

