hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-985) Namenode should identify DataNodes as ip:port instead of hostname:port
Date Fri, 09 Feb 2007 21:40:06 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghu Angadi updated HADOOP-985:

    Attachment: dfshealth.html

With this fix, what we display on the DFS front page changes. The href for each datanode will
now contain the IP address. See the attached dfshealth.html. The following comment in dfshealth.jsp
describes what we display:

    /* Say the datanode is dn1.hadoop.apache.org with ip 
       we use: 
       1) d.getHostName():d.getPort() to display. 
           Domain and port are stripped if they are common across the nodes. 
           i.e. "dn1" 
       2) d.getHostName():d.getPort() for "title". 
          i.e. "dn1.hadoop.apache.org:50010" 
       3) d.getHost():d.getInfoPort() for url. 
          i.e. "" 
          Note that "d.getHost():d.getPort()" is what DFS clients use 
          to interact with datanodes. 
    */

Yes, the datanode hrefs don't look good. But one advantage is that we can easily see what
the namenode and the clients see.
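
To make rule 1) in the comment concrete, here is a minimal Java sketch of the
"strip domain and port if they are common across the nodes" idea. It is not the
code in dfshealth.jsp: the class and method names (DatanodeLabels, suffixOf,
shorten) are hypothetical, and it makes the simplifying assumption that the
domain and port are stripped together, and only when every node shares the
same "domain:port" tail. It also assumes the inputs are hostnames, not IPs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not the code in dfshealth.jsp. */
final class DatanodeLabels {

    /**
     * Returns the "domain:port" tail of a name: everything from the first '.',
     * or from the ':' if there is no domain, or "" if there is neither.
     */
    static String suffixOf(String name) {
        int cut = name.indexOf('.');
        if (cut < 0) {
            cut = name.indexOf(':');
        }
        return cut < 0 ? "" : name.substring(cut);
    }

    /** Strips the tail from every name, but only when all names share the same tail. */
    static List<String> shorten(List<String> names) {
        List<String> out = new ArrayList<>(names);
        if (names.isEmpty()) {
            return out;
        }
        String common = suffixOf(names.get(0));
        for (String name : names) {
            if (!suffixOf(name).equals(common)) {
                return out; // tails differ, so keep the full names
            }
        }
        for (int i = 0; i < out.size(); i++) {
            String name = out.get(i);
            out.set(i, name.substring(0, name.length() - common.length()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList(
            "dn1.hadoop.apache.org:50010", "dn2.hadoop.apache.org:50010");
        System.out.println(shorten(nodes)); // prints [dn1, dn2]
    }
}
```

If one node used a different port or domain, this sketch would leave every name
untouched, which keeps the displayed labels unambiguous.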

> Namenode should identify DataNodes as ip:port instead of hostname:port
> ----------------------------------------------------------------------
>                 Key: HADOOP-985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-985
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.11.0
>            Reporter: Raghu Angadi
>         Assigned To: Raghu Angadi
>             Fix For: 0.12.0
>         Attachments: dfshealth.html
> Right now the NameNode keeps track of DataNodes with "hostname:port". One proposal is to
> keep track of datanodes with "ip:port". Various concerns have been expressed regarding hostnames
> and IPs. Please add your experiences here so that we have a better idea of what we should fix.
> How should we calculate the datanode ip:
>             1) Just like how we calculate the hostname currently, with "dfs.datanode.dns.interface"
> and "dfs.datanode.dns.nameserver". So if the interface is specified wrongly, it could report ip like which might or might not be intended.
>             2) The Namenode can use the remote socket address when the datanode registers.
> Not sure how easy it is to get this address in RPC or whether this is desirable.
>             3) The Namenode could just resolve the hostname when a datanode registers. It
> could print a warning if the resolved ip and reported ip don't match.
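
Option 3 above could look roughly like the following Java sketch. This is only
an illustration under stated assumptions, not Namenode code: the class and
method names (RegistrationCheck, matchesAny, check) are hypothetical, the
warning goes to stderr instead of a real logger, and addresses are compared as
strings, which is adequate for dotted IPv4 text but not for every IPv6 spelling.
The only library calls used are java.net.InetAddress.getAllByName and
getHostAddress.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical registration-time check; names are for illustration only. */
final class RegistrationCheck {

    /** True if reportedIp is among the addresses the hostname resolved to. */
    static boolean matchesAny(String reportedIp, InetAddress[] resolved) {
        for (InetAddress addr : resolved) {
            if (addr.getHostAddress().equals(reportedIp)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Resolves the hostname a datanode registered with and warns if the
     * reported ip is not one of the resolved addresses.
     * Returns true when they match, false on mismatch or failed resolution.
     */
    static boolean check(String hostname, String reportedIp) {
        try {
            InetAddress[] resolved = InetAddress.getAllByName(hostname);
            boolean ok = matchesAny(reportedIp, resolved);
            if (!ok) {
                System.err.println("WARN: " + hostname
                    + " does not resolve to reported ip " + reportedIp);
            }
            return ok;
        } catch (UnknownHostException e) {
            System.err.println("WARN: cannot resolve " + hostname);
            return false;
        }
    }

    public static void main(String[] args) {
        // An IP literal is parsed locally, so no DNS lookup happens here.
        System.out.println(check("127.0.0.1", "127.0.0.1")); // prints true
    }
}
```

A mismatch only produces a warning here, in line with the proposal; whether
registration should also be refused is a separate decision.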
> One advantage of using IPs is that DFSClient does not need to resolve them when it connects
> to a datanode. This could save a few milliseconds for each block. Also, DFSClient should check
> all its ips to see if a given ip is local or not.
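
The "check all its ips" step could be sketched in Java as below. This is an
assumption about how a client might do it, not DFSClient code: LocalAddresses,
collect and isLocal are hypothetical names. It enumerates every interface with
java.net.NetworkInterface and compares addresses as strings, so IPv6 entries
that carry a scope suffix would need normalising before comparison.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; not part of DFSClient. */
final class LocalAddresses {

    /** Collects the textual addresses of all local network interfaces. */
    static Set<String> collect() {
        Set<String> ips = new HashSet<>();
        try {
            Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
            if (ifaces == null) {
                return ips; // some JVMs return null when there are no interfaces
            }
            for (NetworkInterface iface : Collections.list(ifaces)) {
                for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
                    ips.add(addr.getHostAddress());
                }
            }
        } catch (SocketException e) {
            // Treat an enumeration failure as "no known local addresses".
        }
        return ips;
    }

    /** True if the given datanode ip is one of this machine's addresses. */
    static boolean isLocal(String ip, Set<String> localIps) {
        return localIps.contains(ip);
    }

    public static void main(String[] args) {
        Set<String> local = collect(); // compute once, reuse for every block
        System.out.println("local addresses: " + local);
        System.out.println("127.0.0.1 local? " + isLocal("127.0.0.1", local));
    }
}
```

Collecting the set once and reusing it keeps the per-block check to a single
hash lookup, which fits the goal of saving time on each block.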
> As far as I can see, the namenode does not resolve any DNS names in normal operation, since it does not
> actively contact datanodes. In that sense, I am not sure this would change Namenode performance at all.
> Thoughts?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
