hadoop-hdfs-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3171) The DatanodeID "name" field is overloaded
Date Thu, 03 May 2012 18:02:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267636#comment-13267636 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3171:

Thanks Eli.  You may also want to take a look at HDFS-3328.
> The DatanodeID "name" field is overloaded 
> ------------------------------------------
>                 Key: HDFS-3171
>                 URL: https://issues.apache.org/jira/browse/HDFS-3171
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 2.0.0
>         Attachments: hdfs-3171.txt
> The DatanodeID "name" field is currently overloaded. When the DN creates a DatanodeID
to register with the NN, it sets "name" to the datanode hostname, which is the DN's "hostName"
member. This is not necessarily an FQDN: it is either set explicitly or determined by the DNS
class, which could return the machine's hostname or the result of a DNS lookup, if configured
to do so. The NN then clobbers the "name" field of the DatanodeID with the IP part of the
new DatanodeID "name" field it creates (and sets the DatanodeID "hostName" field to the reported
"name"). The DN gets the DatanodeID back from the NN and clobbers its "hostName" member with
the "name" field of the returned DatanodeID. This makes the code hard to reason about, e.g.
DN#getMachineName sometimes returns a hostname and sometimes not, depending on when it is called
relative to the registration. Ditto for uses of the "name" field. I think these contortions were
originally performed because the DatanodeID didn't have a hostName field (it was part of DatanodeInfo),
so there was no way to communicate both at the same time. Now that the hostName field
is in DatanodeID (as of HDFS-3164), we can establish the invariant that the "name" field always
and only has an IP address and the "hostName" field always and only has a hostname.
> In HDFS-3144 I'm going to rename the "name" field so it's clear that it contains an IP
address. The above is enough scope for one change.
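
The invariant proposed above (one field always and only an IP address, the other always and only a hostname) can be illustrated with a minimal sketch. This is not the actual Hadoop DatanodeID class or the attached patch: the class name DatanodeIdSketch, the field name ipAddr, and the IPv4-only validation are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical stand-in for a datanode identifier. It is NOT Hadoop's
 * DatanodeID; it only illustrates keeping the IP and hostname in
 * separate fields that are never overloaded.
 */
final class DatanodeIdSketch {
    // Assumption: IPv4 dotted-quad only, each octet in the range 0-255.
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";
    private static final Pattern IPV4 =
        Pattern.compile("^(" + OCTET + "\\.){3}" + OCTET + "$");

    private final String ipAddr;   // always and only an IP address
    private final String hostName; // always and only a hostname

    DatanodeIdSketch(String ipAddr, String hostName) {
        if (ipAddr == null || !IPV4.matcher(ipAddr).matches()) {
            throw new IllegalArgumentException("ipAddr must be an IPv4 address: " + ipAddr);
        }
        if (hostName == null || hostName.isEmpty() || IPV4.matcher(hostName).matches()) {
            throw new IllegalArgumentException("hostName must be a hostname, not an IP: " + hostName);
        }
        this.ipAddr = ipAddr;
        this.hostName = hostName;
    }

    String getIpAddr() {
        return ipAddr;
    }

    String getHostName() {
        return hostName;
    }

    public static void main(String[] args) {
        // Example values are made up for the demonstration.
        DatanodeIdSketch id = new DatanodeIdSketch("10.0.0.5", "dn1.example.com");
        System.out.println(id.getIpAddr() + " / " + id.getHostName());
    }
}
```

Because both fields are final and validated in the constructor, neither side of a registration exchange could overwrite one with the other, so a getter would return the same kind of value no matter when it is called.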


