hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4408) NodeManager still reports negative running containers
Date Wed, 02 Dec 2015 19:55:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036485#comment-15036485 ]

Junping Du commented on YARN-4408:
----------------------------------

Thanks, Robert, for updating the patch. Can we log these messages at WARN level, given that
this is an unusual case and our logging is enabled only for INFO and above by default?

> NodeManager still reports negative running containers
> -----------------------------------------------------
>
>                 Key: YARN-4408
>                 URL: https://issues.apache.org/jira/browse/YARN-4408
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-4408.001.patch, YARN-4408.002.patch
>
>
> YARN-1697 fixed a problem where the NodeManager metrics could report a negative number
> of running containers.  However, it missed a rare case where this can still happen.
> YARN-1697 added a flag to indicate if the container was actually launched ({{LOCALIZED}}
> to {{RUNNING}}) or not ({{LOCALIZED}} to {{KILLING}}), which is then checked when transitioning
> from {{CONTAINER_CLEANEDUP_AFTER_KILL}} to {{DONE}} and {{EXITED_WITH_FAILURE}} to {{DONE}}
> to only decrement the gauge if we actually ran the container and incremented the gauge.
> However, this flag is not checked while transitioning from {{EXITED_WITH_SUCCESS}} to {{DONE}}.
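
The gauge behavior described above can be illustrated with a minimal, self-contained
sketch. This is not the NodeManager code or the attached patch: the class
ContainerGaugeSketch, its methods, and the wasLaunched field are hypothetical names
chosen for illustration. It only shows why decrementing without checking the flag can
push the gauge below zero; how a container that never incremented the gauge reaches
such a transition is not modeled here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; names do not come from the Hadoop source.
public class ContainerGaugeSketch {
    // Stand-in for the "running containers" gauge.
    private final AtomicInteger runningContainers = new AtomicInteger();
    // Set only when the container is actually launched (LOCALIZED to RUNNING).
    private boolean wasLaunched = false;

    // Launching sets the flag and increments the gauge.
    void launch() {
        wasLaunched = true;
        runningContainers.incrementAndGet();
    }

    // Transition to DONE that decrements unconditionally (flag not checked).
    void finishUnchecked() {
        runningContainers.decrementAndGet();
    }

    // Transition to DONE that decrements only if the gauge was incremented.
    void finishChecked() {
        if (wasLaunched) {
            runningContainers.decrementAndGet();
        }
    }

    // Runs one container through the sketch and returns the final gauge value.
    public static int gaugeAfter(boolean launched, boolean checkFlag) {
        ContainerGaugeSketch c = new ContainerGaugeSketch();
        if (launched) {
            c.launch();
        }
        if (checkFlag) {
            c.finishChecked();
        } else {
            c.finishUnchecked();
        }
        return c.runningContainers.get();
    }

    public static void main(String[] args) {
        System.out.println(gaugeAfter(false, false)); // -1: never launched, flag ignored
        System.out.println(gaugeAfter(false, true));  //  0: never launched, flag checked
        System.out.println(gaugeAfter(true, true));   //  0: launched, flag checked
    }
}
```

In this sketch, checking the flag on every path that reaches DONE keeps the gauge at
zero or above, which is the property the description asks for.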



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
