hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4646) Reduce stuck in pending state for ever even though the job tracker shows a lot of free slots
Date Thu, 13 Nov 2008 05:03:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647196#action_12647196 ]

Runping Qi commented on HADOOP-4646:

Below is the relevant part of the job tracker log:

2008-11-09 05:09:16,215 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200811070042_0002_r_000009_0:
java.io.IOException: subprocess exited successfully
R/W/S=115505/3231/0 in:0=115505/188655 [rec/s] out:0=3231/188655 [rec/s]
minRecWrittenToEnableSkip_=9223372036854775807 LOGNAME=null
last Hadoop input: |null|
last tool output: |[B@1d1c428|
Date: Sun Nov 09 05:09:11 UTC 2008
MROutput/MRErrThread failed:java.io.IOException: All datanodes xxxxxxxxx are bad. Aborting...
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2096)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1300(DFSClient.java:1702)
	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1818)

	at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:104)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2122)

2008-11-09 05:09:55,464 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200811070042_0002_r_000009_0'
from 'tracker_xxxxxxxx
Note that the last line above says that task_200811070042_0002_r_000009_0 was completed,
but it had actually failed.
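
One way to spot this kind of mismatch in a job tracker log is to cross-check the two message types. The following is a minimal sketch that assumes the log line formats quoted above; the function name and the regular expressions are illustrative and are not part of Hadoop.

```python
import re

# Patterns assume the two JobTracker log line formats quoted above.
ERROR_RE = re.compile(r"TaskInProgress: Error from (task_\w+):")
REMOVED_RE = re.compile(r"JobTracker: Removed completed task '(task_\w+)'")


def failed_but_removed_as_completed(lines):
    """Return task attempt IDs that logged an error and were later
    reported as 'Removed completed task'."""
    errored = set()
    suspicious = []
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            # Remember every attempt that reported an error.
            errored.add(m.group(1))
            continue
        m = REMOVED_RE.search(line)
        if m and m.group(1) in errored:
            # Same attempt is now described as completed.
            suspicious.append(m.group(1))
    return suspicious


sample = [
    "2008-11-09 05:09:16,215 INFO org.apache.hadoop.mapred.TaskInProgress: "
    "Error from task_200811070042_0002_r_000009_0:",
    "java.io.IOException: subprocess exited successfully",
    "2008-11-09 05:09:55,464 INFO org.apache.hadoop.mapred.JobTracker: "
    "Removed completed task 'task_200811070042_0002_r_000009_0'",
]
print(failed_but_removed_as_completed(sample))
```

The scan is order-sensitive on purpose: an attempt is flagged only if its error line appears before the "Removed completed task" line.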

> Reduce stuck in pending state for ever even though the job tracker shows a lot of free slots
> --------------------------------------------------------------------------------------------
>                 Key: HADOOP-4646
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4646
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.2
>            Reporter: Runping Qi
> A job with 38 mappers and 38 reducers was running on a cluster with 36 slots.
> All mapper tasks completed. 17 reducer tasks completed. 11 reducers are still in the
> running state, and one is in the pending state and stays there forever.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
