hadoop-yarn-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-167) AM stuck in KILL_WAIT for days
Date Tue, 23 Oct 2012 21:34:12 GMT

    [ https://issues.apache.org/jira/browse/YARN-167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482716#comment-13482716 ]

Robert Joseph Evans commented on YARN-167:

I am rather nervous about backporting MAPREDUCE-3353.  It is a major feature with a significant
footprint, and it was not all that stable when it first went in.  I know that it has since stabilized,
but I am still nervous about such a large change.  It seems like it would be simpler to handle
the KILL events in the states that currently miss them.
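
To illustrate the narrower fix, here is a minimal sketch of a table-driven state machine in which some states have no KILL transition. It is not the actual MapReduce AM code: the state and event names below are hypothetical, and the choice to silently ignore an unhandled event is an assumption made only to reproduce the "stuck" symptom.

```python
from enum import Enum, auto


class State(Enum):
    # Illustrative states only; not the actual Hadoop state names.
    ASSIGNED = auto()
    RUNNING = auto()
    CLEANING_UP = auto()
    KILLED = auto()


class Event(Enum):
    # Illustrative events only.
    LAUNCHED = auto()
    DONE = auto()
    KILL = auto()


# Transition table keyed by (current state, event).
TRANSITIONS = {
    (State.ASSIGNED, Event.LAUNCHED): State.RUNNING,
    (State.RUNNING, Event.DONE): State.CLEANING_UP,
    (State.RUNNING, Event.KILL): State.KILLED,
    # No KILL entry for ASSIGNED or CLEANING_UP: a kill arriving there is lost.
}


def step(table, state, event):
    """Return the next state, or the unchanged state if the event is unhandled.

    Ignoring unhandled events is an assumption of this sketch.
    """
    return table.get((state, event), state)


def states_missing(table, event, terminal=frozenset({State.KILLED})):
    """List non-terminal states that have no transition for the given event."""
    return [s for s in State if s not in terminal and (s, event) not in table]


# The targeted fix: add a KILL transition for every state that lacks one.
PATCHED = dict(TRANSITIONS)
for s in states_missing(TRANSITIONS, Event.KILL):
    PATCHED[(s, Event.KILL)] = State.KILLED
```

With the original table, `step(TRANSITIONS, State.ASSIGNED, Event.KILL)` leaves the state at `ASSIGNED`, so anything waiting on the kill to complete would wait indefinitely; with `PATCHED`, the same call moves to `KILLED`. The change touches only the transition table, which is what makes it smaller than backporting a whole feature.
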
> AM stuck in KILL_WAIT for days
> ------------------------------
>                 Key: YARN-167
>                 URL: https://issues.apache.org/jira/browse/YARN-167
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 0.23.3
>            Reporter: Ravi Prakash
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: TaskAttemptStateGraph.jpg
> We found some jobs were stuck in KILL_WAIT for days on end. The RM shows them as RUNNING.
> When you go to the AM, it shows it in the KILL_WAIT state, and a few maps running. All these
> maps were scheduled on nodes which are now in the RM's Lost nodes list. The running maps are

