hadoop-common-dev mailing list archives

From "ZhuGuanyin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5407) Sometimes, Reduce tasks hang, State is unassigned
Date Thu, 05 Mar 2009 06:37:56 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679091#action_12679091 ]

ZhuGuanyin commented on HADOOP-5407:
------------------------------------

Today a map task attempt hung with its state stuck at UNASSIGNED. The excerpts below show the
TaskTracker log for a normal case and for the abnormal case.

Normal Log:
-------------------
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,527 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0534_m_000785_0
2009-03-05 11:48:40,528 INFO  mapred.TaskTracker (TaskTracker.java:addFreeSlot(1625)) - addFreeSlot : current free slots : 1
2009-03-05 11:48:40,537 WARN  mapred.TaskTracker (TaskTracker.java:reportTaskFinished(2583)) - Unknown child task finshed: attempt_200903032231_0534_m_000785_0. Ignored.


Exception Log
--------------------------
2009-03-05 11:55:51,600 INFO  mapred.TaskTracker (TaskTracker.java:run(314)) - Received KillTaskAction for task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:purgeTask(1392)) - About to purge task: attempt_200903032231_0541_m_000046_1
2009-03-05 11:55:51,603 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2022)) - Task attempt_200903032231_0541_m_000046_1 is done.
2009-03-05 11:55:51,604 INFO  mapred.TaskTracker (TaskTracker.java:reportDone(2023)) - reported output size for attempt_200903032231_0541_m_000046_1  was 0
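
In the two excerpts above, the normal sequence logs addFreeSlot right after purgeTask, while the
abnormal one logs reportDone and no addFreeSlot appears. Whether that missing line is related to
the hang is only a guess, but a small script can look for the same pattern in a full TaskTracker
log. The following is a rough sketch, not part of Hadoop: the function name and the regular
expressions are made up for illustration, it assumes each log entry sits on a single line (the
archive wraps them), and it can mislabel attempts when entries from several tasks interleave.

```python
import re

# Task attempt ID, e.g. attempt_200903032231_0534_m_000785_0.
ATTEMPT_RE = re.compile(r"attempt_\d+_\d+_[mr]_\d+_\d+")
# Method name inside "(TaskTracker.java:purgeTask(1392))".
METHOD_RE = re.compile(r"\(TaskTracker\.java:(\w+)\(\d+\)\)")


def purges_without_free_slot(lines):
    """Return attempt IDs whose purgeTask entry is not followed by an
    addFreeSlot entry before the next purgeTask entry or the end of input.

    Hypothetical helper: it only mirrors the pattern visible in the two
    excerpts and does not prove that a slot was actually leaked.
    """
    suspects = []
    pending = None  # most recently purged attempt still waiting for addFreeSlot
    for line in lines:
        method_match = METHOD_RE.search(line)
        if not method_match:
            continue
        method = method_match.group(1)
        if method == "purgeTask":
            if pending is not None:
                suspects.append(pending)
            attempt_match = ATTEMPT_RE.search(line)
            pending = attempt_match.group(0) if attempt_match else None
        elif method == "addFreeSlot":
            pending = None
    if pending is not None:
        suspects.append(pending)
    return suspects
```

Calling purges_without_free_slot(open(path)) on a log file (path being wherever the TaskTracker
log lives on that node) returns the list of attempts worth a closer look.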

> Sometimes, Reduce tasks hang, State is unassigned
> -------------------------------------------------
>
>                 Key: HADOOP-5407
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5407
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: ZhuGuanyin
>
> Hi, all
> When our cluster has been running for a long time, some reduce tasks on some tasktrackers
> hang. Their state is UNASSIGNED. After that, all reduce tasks on those tasktrackers hang.
> If we kill a hung reduce task, the attempt is rescheduled to the same tasktracker and
> continues to hang. If we fail it, it goes to another tasktracker and executes successfully.
> A tasktracker that has a hung reduce task will still receive new reduce tasks, but those
> tasks hang forever.
> When we reboot the tasktracker machine, reduce tasks no longer hang.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

