hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Reducers fail without messages on 20.205.0
Date Mon, 26 Dec 2011 20:14:58 GMT
Exit code 137 would mean a SIGKILL (128 + signal number 9). Are you positive it's reported
as a failed task and not as a killed task? If it is actually a killed task, it may just be
a result of speculative reducer execution at work and is nothing to worry about.
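
For reference, here is a rough sketch of how an exit status like 137 can be decoded. It
assumes the common POSIX shell convention that a status above 128 means "killed by signal
(status - 128)". The function name describe_exit_status is made up for illustration and is
not part of Hadoop.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper, not a Hadoop API)."""
    if status == 0:
        return "exited normally"
    if status > 128:
        # Convention: 128 + N means the process was terminated by signal N.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with error code {status}"


if __name__ == "__main__":
    # 137 - 128 = 9, which is SIGKILL on POSIX systems.
    print(describe_exit_status(137))
```

A SIGKILL cannot be caught by the child JVM, which would fit with the task log ending
without any error message of its own.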

On 27-Dec-2011, at 1:14 AM, Markus Jelsma wrote:

> Hi,
> 
> We sometimes see reducers fail just when all mappers are finishing. All 
> mappers finish roughly at the same time. The reducers only dump the following 
> exception:
> 
> java.lang.Throwable: Child Error
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 137.
> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
> 
> The reducers' own log output also shows nothing that gives a clue. This is the 
> last part of the log:
> 
> 2011-12-26 19:35:19,116 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-12-26 19:35:19,117 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-12-26 19:35:19,117 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2011-12-26 19:35:19,120 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Thread started: Thread for merging on-disk files
> 2011-12-26 19:35:19,120 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Thread waiting: Thread for merging on-disk files
> 2011-12-26 19:35:19,121 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Thread started: Thread for merging in memory files
> 2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Need another 50 map output(s) where 0 is already in progress
> 2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Thread started: Thread for polling Map Completion Events
> 2011-12-26 19:35:19,122 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> 2011-12-26 19:35:24,124 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Scheduled 2 outputs (0 slow hosts and0 dup hosts)
> 2011-12-26 19:35:25,805 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
> 2011-12-26 19:36:21,578 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Need another 47 map output(s) where 0 is already in progress
> 2011-12-26 19:36:21,593 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
> 2011-12-26 19:36:42,412 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201112261420_0006_r_000009_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
> 
> Is there any advice?
> Thanks

