hadoop-common-dev mailing list archives

From "Michael Bieniosek (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1638) Master node unable to bind to DNS hostname
Date Sat, 21 Jul 2007 20:38:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514441 ]

Michael Bieniosek commented on HADOOP-1638:

> This problem was caused by the changes made in Amazon EC2 addressing: previously, instances
were directly addressed (given a single routable IP address), and now they are NAT-addressed
(by default, with later tool versions). The key point is that NAT-addressed instances can't
reach other NAT-addressed instances via their public addresses.

I don't use the hadoop ec2 scripts, but I filed HADOOP-1202 specifically because of this issue.

The solution I intended with HADOOP-1202 was to make the namenode and jobtracker bind to
all interfaces using my HADOOP-1202 patch, while keeping the internal addresses in the hadoop
configs.  I set up an http proxy to view the logs for the datanodes and tasktrackers (I have
my httpd.conf if anybody is interested).  It is then possible to view the jobtracker & namenode
web UIs normally (you have to submit jobs from inside the cluster, though, since submitting a
job writes to the dfs).  The problem is that you can't use the dfs from outside the cluster;
instead you have to use some proxying solution, which will be much slower (in our case it took
longer to copy the data back than to compute it).
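For reference, a minimal sketch of such a log-viewing proxy (not the actual httpd.conf
mentioned above; assumes mod_proxy and mod_proxy_http are loaded, and the internal hostname
and the default jobtracker web port 50030 are placeholders) could look like:

```apache
# Forward a local port to the jobtracker web UI on the master's
# internal EC2 address, which is only reachable from inside the NAT.
# Hostname below is a placeholder, not from this issue.
Listen 8080
<VirtualHost *:8080>
    ProxyRequests Off
    ProxyPass        / http://ip-10-0-0-1.ec2.internal:50030/
    ProxyPassReverse / http://ip-10-0-0-1.ec2.internal:50030/
</VirtualHost>
```

The same pattern works for the namenode UI (default port 50070) and the per-node datanode and
tasktracker status pages.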

If you need to use the dfs, the real solution is to make all datanodes bind to all interfaces,
make the namenode aware that each datanode has two addresses, and make sure the namenode knows
when to use which one.  This would require significantly more work than my HADOOP-1202 patch, though.

> Master node unable to bind to DNS hostname
> ------------------------------------------
>                 Key: HADOOP-1638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1638
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/ec2
>    Affects Versions: 0.13.0, 0.13.1, 0.14.0, 0.15.0
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.13.1, 0.14.0, 0.15.0
>         Attachments: hadoop-1638.patch
> With a release package of Hadoop 0.13.0 or with the latest SVN, the Hadoop contrib/ec2 scripts
fail to start Hadoop correctly. After working around issues HADOOP-1634 and HADOOP-1635, and
setting up a DynDNS address pointing to the master's IP, the ec2/bin/start-hadoop script completes.
> But the cluster is unusable because the namenode and tasktracker have not started successfully.
Looking at the namenode log on the master reveals the following error:
> {quote}
> 2007-07-19 16:54:53,156 ERROR org.apache.hadoop.dfs.NameNode: java.net.BindException:
Cannot assign requested address
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:186)
>         at org.apache.hadoop.ipc.Server.<init>(Server.java:631)
>         at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:325)
>         at org.apache.hadoop.ipc.RPC.getServer(RPC.java:295)
>         at org.apache.hadoop.dfs.NameNode.init(NameNode.java:164)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
>         at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:803)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:811)
> {quote}
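> The BindException above is what the JVM raises when a server socket is asked to bind to an
address that is not assigned to any local interface, which is exactly the situation with a
public NAT address on EC2. A small standalone sketch reproduces it (192.0.2.1 is a reserved
TEST-NET placeholder standing in for the public address; it is not from this issue):

```java
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class BindDemo {
    public static void main(String[] args) throws Exception {
        // Binding to the wildcard address always succeeds: the kernel
        // accepts connections on every local interface.
        try (ServerSocket wildcard = new ServerSocket()) {
            wildcard.bind(new InetSocketAddress("0.0.0.0", 0));
            System.out.println("wildcard bind: ok");
        }
        // Binding to an address not assigned to any local interface
        // fails with "Cannot assign requested address", just like the
        // namenode does when given the public EC2 hostname.
        try (ServerSocket external = new ServerSocket()) {
            external.bind(new InetSocketAddress("192.0.2.1", 0));
            System.out.println("external bind: ok");
        } catch (BindException e) {
            System.out.println("external bind: BindException");
        }
    }
}
```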
> The master node refuses to bind to the DynDNS hostname in the generated hadoop-site.xml.
Here is the relevant part of the generated file:
> {quote}
> <property>
>   <name>fs.default.name</name>
>   <value>blah-ec2.gotdns.org:50001</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>blah-ec2.gotdns.org:50002</value>
> </property>
> {quote}
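> For the bind to succeed, these values would have to resolve to an address assigned to one
of the master's local interfaces, e.g. the instance's internal EC2 hostname rather than the
external DynDNS name. A hypothetical corrected fragment (the ip-10-... hostname is a
placeholder, not from this issue) would be:

```xml
<property>
  <name>fs.default.name</name>
  <!-- internal hostname, resolvable and bindable from inside EC2 -->
  <value>ip-10-0-0-1.ec2.internal:50001</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>ip-10-0-0-1.ec2.internal:50002</value>
</property>
```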
> I'll attach a patch against hadoop-trunk that fixes the issue for me, but I'm not sure
whether this is something that someone can fix more thoroughly.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
