hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8056) Decommissioned dead nodes should continue to be counted as dead after NN restart
Date Thu, 19 Nov 2015 18:08:11 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-8056:
--------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
           Status: Resolved  (was: Patch Available)

Thanks [~andrew.wang]! I have committed the patch to trunk and branch-2.

> Decommissioned dead nodes should continue to be counted as dead after NN restart
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-8056
>                 URL: https://issues.apache.org/jira/browse/HDFS-8056
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>              Labels: BB2015-05-RFC
>             Fix For: 2.8.0
>
>         Attachments: HDFS-8056-2.patch, HDFS-8056.patch
>
>
> We had some offline discussion with [~andrew.wang] and [~cmccabe] about this. Bringing it up here for more input and to get the patch in place.
> Dead nodes are tracked by {{DatanodeManager}}'s {{datanodeMap}}. However, after the NN restarts, nodes that were dead before the restart won't be in {{datanodeMap}}. {{DatanodeManager}}'s {{getDatanodeListForReport}} will add those dead nodes back, but not if they are in the exclude file.
> {noformat}
>     if (listDeadNodes) {
>       for (InetSocketAddress addr : includedNodes) {
>         if (foundNodes.matchedBy(addr) || excludedNodes.match(addr)) {
>           continue;
>         }
>         // The remaining nodes are ones that are referenced by the hosts
>         // files but that we do not know about, ie that we have never
>         // heard from. Eg. an entry that is no longer part of the cluster
>         // or a bogus entry was given in the hosts files
>         //
>         // If the host file entry specified the xferPort, we use that.
>         // Otherwise, we guess that it is the default xfer port.
>         // We can't ask the DataNode what it had configured, because it's
>         // dead.
>         DatanodeDescriptor dn = new DatanodeDescriptor(new DatanodeID(addr
>                 .getAddress().getHostAddress(), addr.getHostName(), "",
>                 addr.getPort() == 0 ? defaultXferPort : addr.getPort(),
>                 defaultInfoPort, defaultInfoSecurePort, defaultIpcPort));
>         setDatanodeDead(dn);
>         nodes.add(dn);
>       }
>     }
> {noformat}
> The issue here is that the decommissioned dead node information reported via JMX will be different after an NN restart. It would be better to keep it consistent across NN restarts.
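
The behavior described above can be reproduced with a minimal, self-contained sketch. It is not the Hadoop code and not necessarily what the committed patch does: the class name {{DeadNodeReportSketch}}, both method names, and the host names are hypothetical, and plain string sets stand in for the include file, the exclude file, and the nodes known to {{datanodeMap}} after a restart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the dead-node part of getDatanodeListForReport.
public class DeadNodeReportSketch {

    // Mirrors the quoted loop: an included host that is not known to the
    // NameNode is reported as dead, unless it is also in the exclude file.
    static List<String> deadNodesSkippingExcluded(
            Set<String> included, Set<String> excluded, Set<String> known) {
        List<String> dead = new ArrayList<>();
        for (String host : included) {
            if (known.contains(host) || excluded.contains(host)) {
                continue;
            }
            dead.add(host);
        }
        return dead;
    }

    // One possible variant (an assumption, not the actual patch): drop the
    // exclude-file check so decommissioned dead nodes are still reported.
    static List<String> deadNodesKeepingExcluded(
            Set<String> included, Set<String> known) {
        List<String> dead = new ArrayList<>();
        for (String host : included) {
            if (known.contains(host)) {
                continue;
            }
            dead.add(host);
        }
        return dead;
    }

    public static void main(String[] args) {
        // TreeSet gives a deterministic iteration order for the demo.
        Set<String> included = new TreeSet<>(Arrays.asList("dn1", "dn2", "dn3"));
        // dn3 was decommissioned and died before the restart.
        Set<String> excluded = new HashSet<>(Arrays.asList("dn3"));
        // After the restart only dn1 has re-registered.
        Set<String> known = new HashSet<>(Arrays.asList("dn1"));

        // prints [dn2]: dn3 has silently dropped out of the dead list
        System.out.println(deadNodesSkippingExcluded(included, excluded, known));
        // prints [dn2, dn3]: dn3 is still counted as dead
        System.out.println(deadNodesKeepingExcluded(included, known));
    }
}
```

In the first method, {{dn3}} disappears from the dead list only because the NameNode has lost its pre-restart state. That is the inconsistency the description points at.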



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
