From: "Michael Bieniosek (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Sat, 21 Jul 2007 13:38:06 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-1638) Master node unable to bind to DNS hostname
Message-ID: <19540581.1185050286429.JavaMail.jira@brutus>
In-Reply-To: <11337177.1184879046149.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HADOOP-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514441 ]

Michael Bieniosek commented on HADOOP-1638:
-------------------------------------------

This problem was caused by the changes made in Amazon EC2 addressing: previously instances were direct addressed (given a single IP routable address) and now they are NAT-addressed (by default, for later tool versions). The key point is that NAT-addressed instances can't access other NAT-addressed instances using the public address.

I don't use the hadoop ec2 scripts, but I filed HADOOP-1202 specifically because of this issue. The solution I intended with HADOOP-1202 was to make the namenode and jobtracker bind to 0.0.0.0 using my HADOOP-1202 patch, but use the internal addresses in the hadoop configs. I set up an http proxy to view logs for the datanodes and tasktrackers (I have my httpd.conf if anybody is interested). It is then possible to view the jobtracker & namenode website normally (you have to submit jobs from inside the cluster though, since submitting a job writes to the dfs).

The problem is that you can't use the dfs from outside the cluster; instead you have to use some proxying solution which will be much slower (in our case it took longer to copy data back than to compute it).

If you need to use dfs, the real solution is to make all datanodes bind to 0.0.0.0, make the namenode aware that each datanode has two addresses, and make sure the namenode knows when to use which one. This would require significantly more work than my HADOOP-1202 patch though.

> Master node unable to bind to DNS hostname
> ------------------------------------------
>
>                 Key: HADOOP-1638
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1638
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/ec2
>    Affects Versions: 0.13.0, 0.13.1, 0.14.0, 0.15.0
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.13.1, 0.14.0, 0.15.0
>
>         Attachments: hadoop-1638.patch
>
>
> With a release package of Hadoop 0.13.0 or with latest SVN, the Hadoop contrib/ec2 scripts fail to start Hadoop correctly.
> After working around issues HADOOP-1634 and HADOOP-1635, and setting up a DynDNS address pointing to the master's IP, the ec2/bin/start-hadoop script completes.
> But the cluster is unusable because the namenode and tasktracker have not started successfully. Looking at the namenode log on the master reveals the following error:
> {quote}
> 2007-07-19 16:54:53,156 ERROR org.apache.hadoop.dfs.NameNode: java.net.BindException: Cannot assign requested address
>         at sun.nio.ch.Net.bind(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
>         at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
>         at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:186)
>         at org.apache.hadoop.ipc.Server.<init>(Server.java:631)
>         at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:325)
>         at org.apache.hadoop.ipc.RPC.getServer(RPC.java:295)
>         at org.apache.hadoop.dfs.NameNode.init(NameNode.java:164)
>         at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:211)
>         at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:803)
>         at org.apache.hadoop.dfs.NameNode.main(NameNode.java:811)
> {quote}
> The master node refuses to bind to the DynDNS hostname in the generated hadoop-site.xml. Here is the relevant part of the generated file:
> {quote}
> <property>
>   <name>fs.default.name</name>
>   <value>blah-ec2.gotdns.org:50001</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>blah-ec2.gotdns.org:50002</value>
> </property>
> {quote}
> I'll attach a patch against hadoop-trunk that fixes the issue for me, but I'm not sure if this issue is something that someone can fix more thoroughly.
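
Editorial illustration of the bind failure discussed in this thread. On a NAT-addressed instance the public address is not assigned to any local interface, so a server that tries to bind to it gets "Cannot assign requested address", while the wildcard address 0.0.0.0 binds on every local interface. The following is a minimal sketch of that difference, not the HADOOP-1202 or HADOOP-1638 patch. The class name `BindSketch` and the method `canBind` are hypothetical, it uses a plain `java.net.ServerSocket` rather than the NIO channel seen in the stack trace, and 192.0.2.1 (a documentation-range address) only stands in for a public address that the host does not own.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical helper, for illustration only; not part of Hadoop.
class BindSketch {

    // Tries to bind a server socket to the given IP literal on an
    // ephemeral port (port 0). Returns false if the bind is rejected,
    // for example with java.net.BindException.
    static boolean canBind(String ipLiteral) {
        try (ServerSocket socket = new ServerSocket()) {
            InetAddress address = InetAddress.getByName(ipLiteral);
            socket.bind(new InetSocketAddress(address, 0));
            return true;
        } catch (IOException e) {
            // BindException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        // Wildcard address: listens on all local interfaces.
        System.out.println("0.0.0.0   -> " + canBind("0.0.0.0"));
        // Assumption: 192.0.2.1 is not assigned to any interface on this
        // host, like the public address of a NAT-addressed instance.
        System.out.println("192.0.2.1 -> " + canBind("192.0.2.1"));
    }
}
```

On a typical host the second bind would be expected to be rejected in the same way as the namenode's bind above, which is why the comment proposes binding to 0.0.0.0 and keeping the internal addresses in the configs.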