hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1071) ResourceManager's decommissioned and lost node count is 0 after restart
Date Fri, 14 Feb 2014 22:53:19 GMT

    [ https://issues.apache.org/jira/browse/YARN-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902075#comment-13902075 ]

Jian He commented on YARN-1071:
-------------------------------

I found that, in the current implementation, the decommissioned nodes fall into two groups:
nodes missing from the include list (if the include list is not empty) and nodes listed in
the exclude list. Upon RM restart, we can recover the nodes decommissioned via the exclude
list just by counting the hosts in the file, but we cannot know which nodes are decommissioned
via the include list until those nodes try to connect.
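
As a rough sketch of the rule described above (this is not the actual RM code; the function
names and the host names n1..n5 are hypothetical and only illustrate the two cases):

```python
from typing import Iterable, Set


def is_decommissioned(host: str, include: Set[str], exclude: Set[str]) -> bool:
    # Excluded hosts are always decommissioned; a non-empty include list
    # also decommissions every host it does not mention.
    missing_from_include = bool(include) and host not in include
    return host in exclude or missing_from_include


def known_after_restart(include: Set[str], exclude: Set[str],
                        connected: Iterable[str]) -> Set[str]:
    # The exclude file can be read directly at startup.
    known = set(exclude)
    # Hosts absent from the include list are only discovered when they connect.
    known.update(h for h in connected if is_decommissioned(h, include, exclude))
    return known


# Hypothetical cluster: n4 is excluded, n5 is simply missing from the include list.
include = {"n1", "n2", "n3"}
exclude = {"n4"}
print(len(known_after_restart(include, exclude, [])))            # n5 has not connected yet
print(len(known_after_restart(include, exclude, ["n1", "n5"])))  # n5 tried to connect
```

In this sketch the count right after restart covers only the excluded host; the host missing
from the include list is counted only once it attempts to register.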



> ResourceManager's decommissioned and lost node count is 0 after restart
> -----------------------------------------------------------------------
>
>                 Key: YARN-1071
>                 URL: https://issues.apache.org/jira/browse/YARN-1071
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Srimanth Gunturi
>            Assignee: Jian He
>
> I had 6 nodes in a cluster with 2 NMs stopped. Then I put a host into YARN's {{yarn.resourcemanager.nodes.exclude-path}}.
> After running {{yarn rmadmin -refreshNodes}}, RM's JMX correctly showed the decommissioned node
> count:
> {noformat}
> "NumActiveNMs" : 3,
> "NumDecommissionedNMs" : 1,
> "NumLostNMs" : 2,
> "NumUnhealthyNMs" : 0,
> "NumRebootedNMs" : 0
> {noformat}
> After restarting RM, the counts were shown as below in JMX.
> {noformat}
> "NumActiveNMs" : 3,
> "NumDecommissionedNMs" : 0,
> "NumLostNMs" : 0,
> "NumUnhealthyNMs" : 0,
> "NumRebootedNMs" : 0
> {noformat}
> Notice that the lost and decommissioned NM counts are both 0.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
