hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
Date Fri, 22 Feb 2013 20:38:13 GMT

     [ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-376:
----------------------------

    Priority: Blocker  (was: Major)

Increasing to Blocker because this race can lead to lost logs: the NM will not aggregate an
application's logs until it thinks the application has completed.  In addition, each "leaked"
application has a corresponding log aggregation thread in the NM, so the NM will eventually be
unable to create new threads.
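
To make the leak concrete, here is a toy model of the behavior described above. It is only a sketch and not the actual NodeManager code: the class ToyNodeManager and its methods are hypothetical names invented for illustration, and the model assumes one aggregation thread per application that blocks until a finish notification is delivered.

```python
import threading


class ToyNodeManager:
    """Toy model only (hypothetical); not the real NodeManager implementation."""

    def __init__(self):
        self._finished = {}   # app id -> Event set when completion is reported
        self._threads = {}    # app id -> that app's log aggregation thread
        self.aggregated = []  # app ids whose logs were aggregated
        self._lock = threading.Lock()

    def start_app(self, app_id):
        # One aggregation thread is created per application.
        done = threading.Event()
        worker = threading.Thread(
            target=self._aggregate, args=(app_id, done), daemon=True
        )
        self._finished[app_id] = done
        self._threads[app_id] = worker
        worker.start()

    def _aggregate(self, app_id, done):
        # Logs are aggregated only after the finish notification arrives.
        done.wait()
        with self._lock:
            self.aggregated.append(app_id)

    def app_finished(self, app_id):
        # Models the finish notification reaching the node.
        self._finished[app_id].set()
        self._threads[app_id].join()

    def live_aggregators(self):
        # Apps whose aggregation thread is still parked, i.e. "leaked" apps.
        return sorted(a for a, t in self._threads.items() if t.is_alive())


nm = ToyNodeManager()
for app in ("app_1", "app_2", "app_3"):
    nm.start_app(app)

# Only app_1's notification is delivered; the other two are "lost".
nm.app_finished("app_1")
print(nm.live_aggregators())
print(nm.aggregated)
```

In this model the threads for app_2 and app_3 stay parked indefinitely and their logs are never aggregated, which mirrors the two symptoms above: lost logs and a thread count that only grows.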
                
> Apps that have completed can appear as RUNNING on the NM UI
> -----------------------------------------------------------
>
>                 Key: YARN-376
>                 URL: https://issues.apache.org/jira/browse/YARN-376
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.0.3-alpha, 0.23.6
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: YARN-376.patch
>
>
> On a busy cluster we've noticed a growing number of applications appear as RUNNING on
the nodemanager web pages even though the applications have long since finished.  Looking at
the NM logs, it appears the RM never told the nodemanager that the application had finished.
This is also reflected in a jstack of the NM process, since many more log aggregation threads
are running than one would expect from the number of actively running applications.
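
One way to check for this symptom is to count the matching thread headers in a saved jstack dump and compare the result with the number of applications actually running on the node. The sketch below is only illustrative: the thread-name substring "LogAggregationService" is an assumption and should be replaced with whatever name appears in your own dump, and the sample text is made up for the example rather than taken from a real NM.

```python
import re

# jstack thread headers start with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]*)"')


def count_threads(jstack_text, name_substring):
    """Count thread headers whose name contains name_substring."""
    count = 0
    for line in jstack_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match and name_substring in match.group("name"):
            count += 1
    return count


# Made-up sample input; the thread names here are assumptions.
sample = '''\
"LogAggregationService #0" prio=10 tid=0x01 nid=0x10 waiting on condition
   java.lang.Thread.State: WAITING (parking)
"LogAggregationService #1" prio=10 tid=0x02 nid=0x11 waiting on condition
"main" prio=10 tid=0x03 nid=0x12 runnable
'''

print(count_threads(sample, "LogAggregationService"))
```

If the count is far above the number of running applications, the node is probably holding aggregation threads for applications it was never told had finished.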

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
