From: igor Finkelshteyn <iefinkel@gmail.com>
Subject: Re: Hadoop on EC2 Managing Internal/External IPs
Date: Thu, 23 Aug 2012 20:09:32 -0700
To: user@hadoop.apache.org
That would work, but wouldn't a much simpler solution just be to force the machines in the cluster to always pass around their external FQDNs, since those will properly resolve to the internal or external IP depending on what machine is asking? Is there no way to just do that?
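[For reference: Hadoop exposes settings that make clients connect to DataNodes by hostname rather than by the IP the NameNode reports, which would let EC2's split-horizon DNS resolve to the internal or external address depending on where the client sits. These were added under HDFS-3150, so whether they are available depends on your release; a minimal hdfs-site.xml sketch:]

```xml
<!-- hdfs-site.xml (sketch; requires a release that includes HDFS-3150) -->
<configuration>
  <!-- On the external client: connect to DataNodes by their hostnames,
       so DNS resolves to the public IP from outside EC2 and the
       private IP from inside -->
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <!-- On the DataNodes: also use hostnames for DataNode-to-DataNode
       connections (e.g. during replication) -->
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>
```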


On Aug 23, 2012, at 8:02 PM, Aaron Eng wrote:

Hi Igor,

Amazon offers a service where you can have a VPN gateway on your network that leads directly back to the network where your instances are.  So that 10.123.x.x subnet would be connected off of the VPN gateway on your network, and you'd set up your routers/routing to push traffic for that subnet at the gateway.

On Thu, Aug 23, 2012 at 12:34 PM, igor Finkelshteyn <iefinkel@gmail.com> wrote:
Hi,
I'm currently setting up a Hadoop cluster on EC2, and everything works just fine when accessing the cluster from inside EC2, but as soon as I try to do something like upload a file from an external client, I get timeout errors like:

12/08/23 12:06:16 ERROR hdfs.DFSClient: Failed to close file /user/some_file._COPYING_
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.123.x.x:50010]

What's clearly happening is my NameNode is resolving my DataNodes' IPs to their internal EC2 values instead of their external values, and then sending along the internal IP to my external client, which is obviously unable to reach those. I'm thinking this must be a common problem. How do other people deal with it? Is there a way to just force my NameNode to send along my DataNodes' hostnames instead of IPs, so that the hostname can be resolved properly from whatever box will be sending files?

Eli

