hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5330) Zombie tasks remain after jobs finish/fail/get killed
Date Fri, 27 Mar 2009 07:08:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689834#action_12689834 ]

Devaraj Das commented on HADOOP-5330:
-------------------------------------

Do you know what happened to the jobs to which these attempts belonged? Also, if the task
logs (syslog, stderr, stdout) still exist, what do they say? One suspicion is that the tasks
got OutOfMemory errors and hung somewhere in the process of exiting (for example, in the
task's finally block). The TaskTrackers then marked those tasks as FAILED due to the lack of
ping messages from the tasks.
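
To illustrate the suspected failure mode, here is a small standalone sketch (not Hadoop source code; the class and method names ZombieTaskSketch, runTask, and blockForever are made up for illustration). If the cleanup a task runs in its finally block can block forever after an OutOfMemoryError, the child JVM stops sending progress pings, the tracker side eventually marks the attempt FAILED, but the process itself never exits and keeps holding its slot.

// Hypothetical illustration only: a worker JVM hits an OutOfMemoryError and
// then blocks inside its finally-block cleanup, so it never exits.
public class ZombieTaskSketch {

    public static void main(String[] args) throws Exception {
        try {
            runTask();
        } finally {
            // Cleanup that never returns, e.g. waiting on a lock or a flush
            // that can no longer complete after the OOM. The JVM lingers as
            // a "zombie" even though the tracker has already given up on it.
            blockForever();
        }
    }

    static void runTask() throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            // Stand-in for the periodic ping/progress report a task would
            // normally send to its TaskTracker while doing work.
            System.out.println("ping " + i);
            Thread.sleep(1000);
            if (i == 3) {
                // Simulate the task running out of heap mid-execution.
                throw new OutOfMemoryError("simulated heap exhaustion");
            }
        }
    }

    static void blockForever() throws InterruptedException {
        Object neverNotified = new Object();
        synchronized (neverNotified) {
            neverNotified.wait(); // nothing ever notifies: the exit path hangs
        }
    }
}

A task stuck this way would typically show the OutOfMemoryError near the end of its syslog or stderr, which is why checking whether the task logs still exist is the first step suggested above.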

> Zombie tasks remain after jobs finish/fail/get killed
> -----------------------------------------------------
>
>                 Key: HADOOP-5330
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5330
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.19.1
>            Reporter: Nathan Marz
>
> I'm seeing a lot of "task attempts" around our hadoop cluster for jobs that are no longer
> around. The attempts seem to be "hung", as they sit there forever. Additionally, they seem
> to take up map and reduce slots in the cluster unless MapReduce is restarted. This causes
> real jobs to be unable to utilize the whole cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

