Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <19540581.1185050286429.JavaMail.jira@brutus>
Date: Sat, 21 Jul 2007 13:38:06 -0700 (PDT)
From: "Michael Bieniosek (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1638) Master node unable to bind to DNS
 hostname
In-Reply-To: <11337177.1184879046149.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514441 ] 

Michael Bieniosek commented on HADOOP-1638:
-------------------------------------------

> This problem was caused by the changes made in Amazon EC2 addressing: previously instances were direct addressed (given a single IP routable address) and now they are NAT-addressed (by default, for later tool versions). The key point is that NAT-addressed instances can't access other NAT-addressed instances using the public address.

I don't use the hadoop ec2 scripts, but I filed HADOOP-1202 specifically because of this issue.

The solution I intended with HADOOP-1202 was to make the namenode and jobtracker bind to 0.0.0.0 (using my HADOOP-1202 patch) while keeping the internal addresses in the hadoop configs.  I set up an http proxy to view logs for the datanodes and tasktrackers (I have my httpd.conf if anybody is interested).  With that in place the jobtracker and namenode web pages can be viewed normally from outside, but jobs still have to be submitted from inside the cluster, since submitting a job writes to the dfs.  The remaining problem is that you can't use the dfs from outside the cluster; instead you have to use some proxying solution, which is much slower (in our case it took longer to copy data back than to compute it).
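
To illustrate why binding to 0.0.0.0 sidesteps the BindException in the report, here is a minimal standalone sketch. It is not code from the HADOOP-1202 patch or from Hadoop itself: the class name BindCheck and the method canBind are made up for this example, and 192.0.2.1 (a documentation-only address) stands in for a NAT public address that no local interface owns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class; not part of Hadoop.
class BindCheck {
    /** Returns true if a server socket can be bound to the given IP literal. */
    static boolean canBind(String ipLiteral) {
        try (ServerSocket socket = new ServerSocket()) {
            // An IP literal is parsed directly, so no DNS lookup happens here.
            InetAddress addr = InetAddress.getByName(ipLiteral);
            socket.bind(new InetSocketAddress(addr, 0)); // port 0 = any free port
            return true;
        } catch (BindException e) {
            // Typically "Cannot assign requested address": no local interface owns addr.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The wildcard address is expected to bind on any host.
        System.out.println("0.0.0.0   bindable: " + canBind("0.0.0.0"));
        // Assumed stand-in for a public address that is not on a local interface.
        System.out.println("192.0.2.1 bindable: " + canBind("192.0.2.1"));
    }
}
```

On a NAT-addressed instance the public address is not assigned to any local interface, so a bind to it should fail the same way the second call does, while the wildcard bind accepts connections arriving on whatever address the host actually has.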

If you need to use dfs, the real solution is to make all datanodes bind to 0.0.0.0, make the namenode aware that each datanode has two addresses, and make sure the namenode knows when to use which one.  This would require significantly more work than my HADOOP-1202 patch though.
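
A rough sketch of the "knows when to use which one" part, under loud assumptions: nothing here exists in Hadoop, the names DualAddressDemo, DatanodeAddresses and addressFor are hypothetical, and "client is inside the cluster" is approximated by "client has a site-local (RFC 1918) address", which a real implementation would have to decide more carefully.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical illustration only; not Hadoop code.
class DualAddressDemo {
    /** A datanode that is reachable under two different addresses. */
    static final class DatanodeAddresses {
        final String internal; // reachable only from inside the cluster
        final String external; // public (NAT) address for outside clients

        DatanodeAddresses(String internal, String external) {
            this.internal = internal;
            this.external = external;
        }
    }

    /**
     * Picks the address to hand to a client. Assumption: a client with a
     * site-local address is inside the cluster; any other client is outside.
     * Expects clientIp to be an IP literal, so no DNS lookup is performed.
     */
    static String addressFor(DatanodeAddresses node, String clientIp) {
        try {
            InetAddress client = InetAddress.getByName(clientIp);
            return client.isSiteLocalAddress() ? node.internal : node.external;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot resolve " + clientIp, e);
        }
    }

    public static void main(String[] args) {
        // Example addresses are placeholders, not taken from a real cluster.
        DatanodeAddresses node = new DatanodeAddresses("10.0.0.5", "203.0.113.7");
        System.out.println("inside client gets:  " + addressFor(node, "10.0.0.9"));
        System.out.println("outside client gets: " + addressFor(node, "198.51.100.20"));
    }
}
```

The point of the sketch is only that the choice has to be made per client by the namenode; the datanodes themselves would still bind to 0.0.0.0 so that both addresses reach them.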


> Master node unable to bind to DNS hostname
> ------------------------------------------
>
>                 Key: HADOOP-1638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1638
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/ec2
>    Affects Versions: 0.13.0, 0.13.1, 0.14.0, 0.15.0
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.13.1, 0.14.0, 0.15.0
>
>         Attachments: hadoop-1638.patch
>
>
> With a release package of Hadoop 0.13.0 or with latest SVN, the Hadoop contrib/ec2 scripts fail to start Hadoop correctly. After working around issues HADOOP-1634 and HADOOP-1635, and setting up a DynDNS address pointing to the master's IP, the ec2/bin/start-hadoop script completes.
> But the cluster is unusable because the namenode and tasktracker have not started successfully. Looking at the namenode log on the master reveals the following error:
> {quote}
> 2007-07-19 16:54:53,156 ERROR org.apache.hadoop.dfs.NameNode: java.net.BindException: Cannot assign requested address
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:186)
>         at org.apache.hadoop.ipc.Server.<init>(Server.java:631)
>         at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:325)
>         at org.apache.hadoop.ipc.RPC.getServer(RPC.java:295)
>         at org.apache.hadoop.dfs.NameNode.init(NameNode.java:164)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
>         at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:803)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:811)
> {quote}
> The master node refuses to bind to the DynDNS hostname in the generated hadoop-site.xml. Here is the relevant part of the generated file:
> {quote}
> <property>
>   <name>fs.default.name</name>
>   <value>blah-ec2.gotdns.org:50001</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>blah-ec2.gotdns.org:50002</value>
> </property>
> {quote}
> I'll attach a patch against hadoop-trunk that fixes the issue for me, but I'm not sure if this issue is something that someone can fix more thoroughly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.