hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-8056) Decommissioned dead nodes should continue to be counted as dead after NN restart
Date Fri, 03 Apr 2015 00:20:53 GMT
Ming Ma created HDFS-8056:
-----------------------------

             Summary: Decommissioned dead nodes should continue to be counted as dead after NN restart
                 Key: HDFS-8056
                 URL: https://issues.apache.org/jira/browse/HDFS-8056
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Ming Ma


We had some offline discussion with [~andrew.wang] and [~cmccabe] about this. I am bringing it
up here to get more input and to get a patch in place.

Dead nodes are tracked in {{DatanodeManager}}'s {{datanodeMap}}. However, after the NN restarts,
nodes that were already dead before the restart are no longer in {{datanodeMap}}.
{{DatanodeManager}}'s {{getDatanodeListForReport}} adds those dead nodes back from the include
file, but not if they are also in the exclude file.

{noformat}
    if (listDeadNodes) {
      for (InetSocketAddress addr : includedNodes) {
        if (foundNodes.matchedBy(addr) || excludedNodes.match(addr)) {
          continue;
        }
        // The remaining nodes are ones that are referenced by the hosts
        // files but that we do not know about, i.e. that we have never
        // heard from. E.g. an entry that is no longer part of the cluster
        // or a bogus entry was given in the hosts files
        //
        // If the host file entry specified the xferPort, we use that.
        // Otherwise, we guess that it is the default xfer port.
        // We can't ask the DataNode what it had configured, because it's
        // dead.
        DatanodeDescriptor dn = new DatanodeDescriptor(new DatanodeID(addr
                .getAddress().getHostAddress(), addr.getHostName(), "",
                addr.getPort() == 0 ? defaultXferPort : addr.getPort(),
                defaultInfoPort, defaultInfoSecurePort, defaultIpcPort));
        setDatanodeDead(dn);
        nodes.add(dn);
      }
    }
{noformat}


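The issue is that the decommissioned dead node information exposed through JMX differs before
and after an NN restart: a decommissioned node that was reported as dead before the restart is
skipped afterwards because it is in the exclude file. It might be better to make this consistent
across NN restarts.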
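
To make the inconsistency concrete, here is a minimal, self-contained sketch of the filtering
in the dead-node branch quoted above. It is not the HDFS code and not a proposed patch: hosts
are reduced to plain strings, {{registered}} stands in for {{foundNodes}}, and the class name,
method name and the {{skipExcluded}} flag are hypothetical, introduced only to compare the two
behaviours side by side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the dead-node branch of getDatanodeListForReport.
// Hosts are plain strings here; the real code works on InetSocketAddress
// entries and builds DatanodeDescriptor objects.
class DeadNodeReportSketch {

    /**
     * Returns the hosts that would be reported as dead after an NN restart.
     *
     * @param included     hosts listed in the include file
     * @param excluded     hosts listed in the exclude file (decommissioned)
     * @param registered   hosts the NN has heard from since the restart
     * @param skipExcluded true mimics the quoted behaviour (excluded hosts
     *                     are skipped); false is an assumed alternative that
     *                     keeps counting them as dead
     */
    static List<String> deadNodesForReport(Set<String> included,
                                           Set<String> excluded,
                                           Set<String> registered,
                                           boolean skipExcluded) {
        List<String> dead = new ArrayList<>();
        for (String host : included) {
            if (registered.contains(host)) {
                continue; // the NN already knows this node
            }
            if (skipExcluded && excluded.contains(host)) {
                continue; // quoted behaviour: decommissioned dead node is dropped
            }
            dead.add(host);
        }
        return dead;
    }

    public static void main(String[] args) {
        Set<String> included = new LinkedHashSet<>(Arrays.asList("dn1", "dn2", "dn3"));
        Set<String> excluded = new LinkedHashSet<>(Arrays.asList("dn3")); // decommissioned
        Set<String> registered = new LinkedHashSet<>(Arrays.asList("dn1")); // alive after restart

        // Quoted behaviour: dn3 is dead and decommissioned, but is not listed.
        System.out.println(deadNodesForReport(included, excluded, registered, true));
        // Assumed alternative: dn3 is still listed among the dead nodes.
        System.out.println(deadNodesForReport(included, excluded, registered, false));
    }
}
```

With these inputs the first call lists only {{dn2}}, while the second lists {{dn2}} and {{dn3}},
which is the difference a JMX consumer would observe across the restart.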



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
