hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3146) Datanode should be able to register multiple network interfaces
Date Mon, 09 Apr 2012 00:59:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249638#comment-13249638 ]

Todd Lipcon commented on HDFS-3146:

+  public static InetSocketAddress[] getInterfaceAddrs(
+      String interfaceNames[], int port) throws UnknownHostException {

Sorry I missed this in the earlier review of this function, but I think it would be better
to call the parameter something like {{interfaceSpecs}} -- because each one may specify an
interface name, an IP address, or a subnet.
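To make the distinction concrete, a spec string falls into one of three categories. This is just an illustrative sketch (the class and method names are invented, not the patch's actual parsing code):

```java
// Hypothetical classifier for an interface spec string; illustrative
// only, not the patch's implementation.
public class InterfaceSpecs {
  public static String classify(String spec) {
    if (spec.contains("/")) {
      return "subnet";      // CIDR notation, e.g. "10.0.0.0/24"
    } else if (spec.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
      return "ip";          // dotted-quad IPv4 address
    } else {
      return "interface";   // device name, e.g. "eth0" or "eth0:0"
    }
  }
}
```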

In the same function, for the subnet case, you're using port 0 instead of the specified port.
Looks like a mistake?
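For context, matching an address against a subnet spec in CIDR notation can be sketched like this for IPv4 (illustrative only, not the patch's code):

```java
// Sketch of IPv4 CIDR matching: an address is in the subnet if its
// network bits (under the prefix mask) equal the subnet's network bits.
public class Cidr {
  public static boolean matches(String ip, String cidr) {
    String[] parts = cidr.split("/");
    int prefix = Integer.parseInt(parts[1]);
    // prefix 0 matches everything; otherwise keep the top `prefix` bits
    int mask = prefix == 0 ? 0 : -1 << (32 - prefix);
    return (toInt(ip) & mask) == (toInt(parts[0]) & mask);
  }

  private static int toInt(String ip) {
    String[] o = ip.split("\\.");
    return (Integer.parseInt(o[0]) << 24) | (Integer.parseInt(o[1]) << 16)
         | (Integer.parseInt(o[2]) << 8) | Integer.parseInt(o[3]);
  }
}
```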


+      LOG.warn("Invalid address given " + addrString);
Nit: add a ':' to the log message

+    // If the datanode registered with an address we can't use
+    // then use the address the IPC came in on instead
+    if (NetUtils.isWildcardOrLoopback(nodeReg.getIpAddr())) {

I found this comment a little unclear. Under what circumstance would the DN pass a loopback
or wildcard IP? Aren't they filtered on the DN side? I think this should be at least a WARN,
or maybe even throw an exception to disallow the registration.
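For reference, the standard java.net API already distinguishes these cases; a minimal sketch of what such a check could look like, assuming the patch's {{NetUtils.isWildcardOrLoopback}} does something equivalent:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of a wildcard-or-loopback check using only standard java.net
// methods; the patch's NetUtils helper presumably behaves similarly.
public class AddrCheck {
  public static boolean isWildcardOrLoopback(String ip) {
    try {
      InetAddress addr = InetAddress.getByName(ip);
      // isAnyLocalAddress() is true for the wildcard (0.0.0.0 / ::)
      return addr.isAnyLocalAddress() || addr.isLoopbackAddress();
    } catch (UnknownHostException e) {
      return false; // unresolvable spec: treat as neither
    }
  }
}
```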

Edit: I got to the part later in the patch where the DN potentially sends a wildcard to the
NN. I think it might be simpler to have the DN send an empty list to the NN if it's bound
to wildcard -- and adjust the comment here to explain why it would be registering with no
interfaces.


+      // TODO: haven't determined the port yet, using default
Are you planning another patch to fix this on the branch before merging? What's the backward-compatibility
path with the existing configurations for bind address, etc., where the port is specified? We
should be clear about which takes precedence, and perhaps throw errors on startup if both are
configured.

Maybe it makes sense to change these to just be InetAddress instead of InetSocketAddress,
and never fill in a port there?

This patch should add the new config to hdfs-default, and edit the existing config's documentation
to explain how the two interact.
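A hypothetical hdfs-default.xml entry might look like the following (the property name comes from this issue's description; the description text is only illustrative):

```xml
<property>
  <name>dfs.datanode.available.interfaces</name>
  <value></value>
  <description>
    Comma-separated list of interface specs the datanode registers with
    the namenode. Each spec may be an interface name (eth0), a
    subinterface name (eth0:0), an IP address, or a subnet in CIDR
    notation. If empty, all non-loopback interfaces are exposed when
    binding to the wildcard address.
  </description>
</property>
```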

+      if (0 != interfaceStrs.length) {
+        LOG.info("Using interfaces [" +
+        Joiner.on(',').join(interfaceStrs)+ "] with addresses [" +
+        Joiner.on(',').join(interfaceAddrs) + "]");
+      }

- need indentation for the joiner lines
- add comment explaining how this eventually gets filled in, if it's empty?

+   * @param addrs socket addresses to convert
+   * @return an array of strings of IPs for the given addresses
+   */
+  public static String[] toIpAddrStrings(InetSocketAddress[] addrs) {

javadoc should specify that ports aren't included in the stringification of addresses
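For example, the stringification the javadoc should describe can be sketched with the standard InetSocketAddress API (not necessarily the patch's exact code):

```java
import java.net.InetSocketAddress;

// Sketch of IP-only stringification: getHostAddress() returns the
// textual IP with no port component.
public class AddrStrings {
  public static String[] toIpAddrStrings(InetSocketAddress[] addrs) {
    String[] ips = new String[addrs.length];
    for (int i = 0; i < addrs.length; i++) {
      ips[i] = addrs[i].getAddress().getHostAddress();
    }
    return ips;
  }
}
```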
> Datanode should be able to register multiple network interfaces
> ---------------------------------------------------------------
>                 Key: HDFS-3146
>                 URL: https://issues.apache.org/jira/browse/HDFS-3146
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hdfs-3146.txt
> The Datanode should register multiple interfaces with the Namenode (which then forwards
> them to clients). We can do this by extending the DatanodeID, which currently just contains
> a single interface, to contain a list of interfaces. For compatibility, the DatanodeID method
> to get the DN address for data transfer should remain unchanged (multiple interfaces are only
> used where the client explicitly takes advantage of them).
> By default, if the Datanode binds on all interfaces (by using the wildcard in the dfs*address
> configuration) all interfaces are exposed, modulo ones like the loopback that should never
> be exposed. Alternatively, a new configuration parameter ({{dfs.datanode.available.interfaces}})
> allows the set of interfaces to be specified explicitly in case the user only wants to expose
> a subset. If the new default behavior is too disruptive we could default dfs.datanode.available.interfaces
> to the IP of the IPC interface, which is the only interface exposed today (per HADOOP-6867,
> only the port from dfs.datanode.address is used today).
> The interfaces can be specified by name (e.g. "eth0"), subinterface name (e.g. "eth0:0"),
> or IP address. The IP address can be specified by range using CIDR notation so the configuration
> values are portable.


