hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1638) Master node unable to bind to DNS hostname
Date Sat, 21 Jul 2007 16:20:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514425 ]

Tom White commented on HADOOP-1638:

This problem was caused by a change in Amazon EC2 addressing: previously instances
were directly addressed (given a single routable IP address), and now they are NAT-addressed
(by default, for later tool versions). The key point is that NAT-addressed instances can't
reach other NAT-addressed instances via their public addresses. Direct addressing is going
to be phased out. See http://developer.amazonwebservices.com/connect/entry.jspa?externalID=682&categoryID=100
for more details.

Tool versions ec2-api-tools-1.2-9739 and later use NAT addressing. I have been using
ec2-api-tools-1.2-7546 (although I thought I had been using ec2-api-tools-1.2-9739), which
still uses direct addressing.

I don't think HADOOP-1202 will make this a non-issue, since EC2 NAT instances cannot route
to the public address of other instances. So even if the namenode and job tracker could bind
to the public address, that would not be much help to the slaves, since they have to connect
to the internal address. This patch would therefore still be needed.
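
For anyone who wants to check this outside Hadoop, here is a minimal sketch (not part of the
patch; the class name BindCheck and its methods are made up for illustration). It assumes
that the "Cannot assign requested address" error arises because the configured hostname
resolves to an address that is not assigned to any interface on the machine, which is what
I would expect for the public address of a NAT-addressed instance.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;
import java.net.SocketException;

// Hypothetical helper, for illustration only; not part of Hadoop or the attached patch.
public class BindCheck {

    /** Returns true if addr is the wildcard address or is assigned to a local interface. */
    public static boolean isLocal(InetAddress addr) throws SocketException {
        return addr.isAnyLocalAddress() || NetworkInterface.getByInetAddress(addr) != null;
    }

    /** Tries to bind a listening socket to addr on an ephemeral port (port 0). */
    public static boolean canBind(InetAddress addr) {
        try (ServerSocket socket = new ServerSocket()) {
            socket.bind(new InetSocketAddress(addr, 0));
            return true;
        } catch (IOException e) {
            // A java.net.BindException (a subclass of IOException) lands here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Pass the hostname from fs.default.name; defaults to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());
        System.out.println("assigned to a local interface: " + isLocal(addr));
        System.out.println("bind succeeded: " + canBind(addr));
    }
}
```

Running it on the master with the configured hostname as the argument should show whether
that name resolves to an address the machine can actually bind to.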

Stu, I agree that it would be nice to fix this problem more thoroughly, but until we have a
better solution I think this approach is fine.

I've tested with the last three versions of ec2-api-tools and have successfully run the grep
example on small multi-node clusters. When NAT addressing is used, however, the web servers on
datanodes and task trackers are not accessible, since non-routable addresses are used. Apart
from this limitation (which can be worked around by logging in to the relevant machine to
browse logs), jobs ran OK.

So I vote to commit this (along with HADOOP-1635, HADOOP-1634) - I'll have some time to do
this tomorrow.

> Master node unable to bind to DNS hostname
> ------------------------------------------
>                 Key: HADOOP-1638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1638
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/ec2
>    Affects Versions: 0.13.0, 0.13.1, 0.14.0, 0.15.0
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.13.1, 0.14.0, 0.15.0
>         Attachments: hadoop-1638.patch
> With a release package of Hadoop 0.13.0 or with latest SVN, the Hadoop contrib/ec2 scripts
> fail to start Hadoop correctly. After working around issues HADOOP-1634 and HADOOP-1635, and
> setting up a DynDNS address pointing to the master's IP, the ec2/bin/start-hadoop script completes.
> But the cluster is unusable because the namenode and tasktracker have not started successfully.
> Looking at the namenode log on the master reveals the following error:
> {quote}
> 2007-07-19 16:54:53,156 ERROR org.apache.hadoop.dfs.NameNode: java.net.BindException: Cannot assign requested address
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:186)
>         at org.apache.hadoop.ipc.Server.<init>(Server.java:631)
>         at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:325)
>         at org.apache.hadoop.ipc.RPC.getServer(RPC.java:295)
>         at org.apache.hadoop.dfs.NameNode.init(NameNode.java:164)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
>         at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:803)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:811)
> {quote}
> The master node refuses to bind to the DynDNS hostname in the generated hadoop-site.xml.
> Here is the relevant part of the generated file:
> {quote}
> <property>
>   <name>fs.default.name</name>
>   <value>blah-ec2.gotdns.org:50001</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>blah-ec2.gotdns.org:50002</value>
> </property>
> {quote}
> I'll attach a patch against hadoop-trunk that fixes the issue for me, but I'm not sure
> whether this issue is something that someone can fix more thoroughly.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
