hadoop-common-dev mailing list archives

From "Marco Nicosia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-685) DataNode appears to require DNS name resolution as opposed to direct ip mapping
Date Wed, 07 Feb 2007 01:33:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470790 ]

Marco Nicosia commented on HADOOP-685:

I don't think a dataNode should ever try to determine its own hostname.

In situations where dataNodes have virtual IP addresses configured, or have multiple
interfaces on different subnets, there is no single "correct" hostname to determine.
You can do some work to find the "administrative hostname" (i.e., the name of the host, not
necessarily any particular interface), but that's only useful for identification purposes,
and it requires DNS to get the FQDN.
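
To illustrate the ambiguity described above, here is a minimal Java sketch that lists every address bound to every local interface. The class and method names (LocalAddressSketch, candidateAddresses) are hypothetical and not part of Hadoop; the point is only that a host with several interfaces or virtual IPs yields several equally plausible answers.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not Hadoop code: enumerates local addresses without using DNS.
class LocalAddressSketch {

    // Returns one "interface -> address" entry per address bound on this host.
    static List<String> candidateAddresses() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
            if (ifaces == null) {
                return result; // no interfaces reported
            }
            for (NetworkInterface iface : Collections.list(ifaces)) {
                for (InetAddress addr : Collections.list(iface.getInetAddresses())) {
                    // getHostAddress() formats the raw address; it does not do a reverse lookup.
                    result.add(iface.getName() + " -> " + addr.getHostAddress());
                }
            }
        } catch (SocketException e) {
            // Enumeration failed part-way; return whatever was collected so far.
        }
        return result;
    }

    public static void main(String[] args) {
        for (String entry : candidateAddresses()) {
            System.out.println(entry);
        }
    }
}
```

On a multi-homed machine this prints more than one line, and nothing in the list says which entry the rest of the cluster can actually reach.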

I know it's not trivial, but I'd prefer that the nameNode record the IP address of each
incoming dataNode connection. That way there's no DNS involved at any level in the transaction,
and we know exactly which interface/IP address is being used. Additionally, there's no need to
worry about /etc/hosts, DHCP, or whatnot. It works for as long as the dataNode is up and making
network connections.
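
As a rough sketch of the idea (plain java.net sockets, not Hadoop's actual RPC layer, and with hypothetical names PeerAddressSketch, peerIp and demo), the accepting side can read the peer's IP straight off the connection:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration, not Hadoop code.
class PeerAddressSketch {

    // Textual IP of the remote end of an accepted connection.
    // getHostAddress() formats the address bytes and performs no DNS lookup.
    static String peerIp(Socket accepted) {
        return accepted.getInetAddress().getHostAddress();
    }

    // Opens a loopback connection and returns the client IP as seen by the server side.
    static String demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            return peerIp(accepted);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("peer IP seen by server: " + demo());
    }
}
```

The server never asks the client what it is called, so whatever /etc/hosts or DHCP says on the client side does not matter.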

In order to support multiple dataNodes per machine, dataNodes need to report their listening
port, but I think that's required regardless of how we solve this problem?
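
A minimal sketch of what that could look like, assuming the nameNode keys each dataNode by the IP it observed on the connection plus the port the dataNode reported. The class DataNodeRegistrySketch, its methods, the storage ID strings and the port numbers are all illustrative assumptions, not existing Hadoop code:

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry, not Hadoop code: identity = (observed IP, reported port).
class DataNodeRegistrySketch {

    private final Map<InetSocketAddress, String> nodes = new HashMap<>();

    // observedIp comes from the accepted connection; reportedPort is sent by the dataNode.
    // Building an InetSocketAddress from an InetAddress involves no name resolution.
    void register(InetAddress observedIp, int reportedPort, String storageId) {
        nodes.put(new InetSocketAddress(observedIp, reportedPort), storageId);
    }

    int size() {
        return nodes.size();
    }

    String lookup(InetAddress ip, int port) {
        return nodes.get(new InetSocketAddress(ip, port));
    }

    public static void main(String[] args) {
        DataNodeRegistrySketch registry = new DataNodeRegistrySketch();
        InetAddress ip = InetAddress.getLoopbackAddress();
        registry.register(ip, 50010, "node-a"); // illustrative ports and IDs
        registry.register(ip, 50011, "node-b"); // same machine, different port
        System.out.println("registered dataNodes: " + registry.size());
    }
}
```

Two dataNodes on the same box then stay distinct because their ports differ, which also covers the loopback setup from the original report.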

> DataNode appears to require DNS name resolution as opposed to direct ip mapping
> -------------------------------------------------------------------------------
>                 Key: HADOOP-685
>                 URL: https://issues.apache.org/jira/browse/HADOOP-685
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>         Environment: osx, ubuntu 6.10b
>            Reporter: James Todd
>         Assigned To: Raghu Angadi
>            Priority: Minor
> DataNode appears to require DNS resolution of nodes via the class org.apache.hadoop.net.DNS
> as opposed to being able to use a specified ip.
> as an example, i was not able to set up more than one instance of dfs datanodes on one
> box using loopback w/ varying ports since DataNode
> resolved the ip of to be "foo.bar" which was then mapped to the dhcp allocated
> ip of 192.168.0.***, which was not addressable by the
> rest of the dfs cluster (namely namenode).
> while this example is trivial, one should be able to use the very same process yet change
> only the ip's of the nodes and have things work as
> expected.
> it would be nice to not always require dns resolution.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
