hadoop-yarn-issues mailing list archives

From "Kuhu Shukla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3102) Decommisioned Nodes not listed in Web UI
Date Thu, 10 Dec 2015 15:30:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15051121#comment-15051121 ]

Kuhu Shukla commented on YARN-3102:

Following the discussion from YARN-4402:

If we treat the exclude list as the canonical record of decommissioned nodes, the
{{setDecomissionedNMsMetrics}} call during serviceInit stays as is. The inactiveNodes map,
however, is reinitialized to a new empty concurrent map in {{RMActiveServiceContext}} at
startup, so the only way to get these nodes into it is to read their hostnames/IPs from the
exclude list and add them to the map, even though we lose the port information. The ports are
unavailable because the node would ideally not have the NM process running, and we don't keep
that state across RM restarts. In practice, we would add a (NodeId, RMNode) entry whose
hostname is valid but whose ports are set to a defined invalid value such as -1. This lets us
track the nodes that were decommissioned in the previous life cycle of the RM. We can also
tweak the GUI to display N/A when the port is -1. Since the {{isValidNode}} check is based only
on hostname/IP, this does not affect the rejoining behavior of the node. Requesting comments
and ideas from [~eepayne].
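
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the real YARN classes: {{NodeKey}}, {{seedFromExcludeList}}, {{displayPort}}, the {{UNKNOWN_PORT}} constant, and the simplified {{isValidNode}} below are hypothetical stand-ins for illustration only, and the actual NodeId/RMNode types and RM code paths would differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: none of these names are real YARN APIs.
class InactiveNodesSketch {
    // Assumed sentinel for "port not known after an RM restart".
    static final int UNKNOWN_PORT = -1;

    // Hypothetical stand-in for NodeId: a host plus a port.
    static final class NodeKey {
        final String host;
        final int port;

        NodeKey(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NodeKey)) {
                return false;
            }
            NodeKey other = (NodeKey) o;
            return host.equals(other.host) && port == other.port;
        }

        @Override
        public int hashCode() {
            return Objects.hash(host, port);
        }
    }

    // Seed a fresh inactive-nodes map from the exclude list at startup.
    // The value is a plain state string standing in for an RMNode.
    static Map<NodeKey, String> seedFromExcludeList(List<String> excludedHosts) {
        Map<NodeKey, String> inactive = new ConcurrentHashMap<>();
        for (String raw : excludedHosts) {
            String host = raw.trim();
            if (!host.isEmpty()) {
                inactive.put(new NodeKey(host, UNKNOWN_PORT), "DECOMMISSIONED");
            }
        }
        return inactive;
    }

    // What the web UI could render for a port column.
    static String displayPort(int port) {
        return port == UNKNOWN_PORT ? "N/A" : Integer.toString(port);
    }

    // Simplified host-only validity check: the port plays no role,
    // so placeholder entries cannot change rejoin behavior.
    static boolean isValidNode(String host, Set<String> excludedHosts) {
        return !excludedHosts.contains(host);
    }

    public static void main(String[] args) {
        List<String> exclude = Arrays.asList("nm1.example.com");
        Map<NodeKey, String> inactive = seedFromExcludeList(exclude);
        for (Map.Entry<NodeKey, String> e : inactive.entrySet()) {
            System.out.println(e.getKey().host + ":" + displayPort(e.getKey().port)
                + " " + e.getValue());
        }
        Set<String> excludedSet = new HashSet<>(exclude);
        System.out.println("nm2 valid? " + isValidNode("nm2.example.com", excludedSet));
    }
}
```

Because the validity check in this sketch looks only at the hostname, a placeholder entry with port -1 is never consulted when a node tries to register, which mirrors the reasoning above about rejoining.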

> Decommisioned Nodes not listed in Web UI
> ----------------------------------------
>                 Key: YARN-3102
>                 URL: https://issues.apache.org/jira/browse/YARN-3102
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>         Environment: 2 Node Manager and 1 Resource Manager 
>            Reporter: Bibin A Chundatt
>            Assignee: Kuhu Shukla
>            Priority: Minor
> Configure yarn.resourcemanager.nodes.exclude-path in yarn-site.xml to point to the yarn.exclude file on the RM1 machine
> Add the NM1 host name to yarn.exclude
> Start the nodes as listed: NM1, NM2, Resource Manager
> Now check the decommissioned nodes in /cluster/nodes
> The number of decommissioned nodes is listed as 1, but the table in /cluster/nodes/decommissioned is empty (details of the decommissioned node are not shown)

