hadoop-user mailing list archives

From igor Finkelshteyn <iefin...@gmail.com>
Subject Re: Hadoop on EC2 Managing Internal/External IPs
Date Fri, 24 Aug 2012 02:54:56 GMT
I've seen a bunch of people asking this exact same question all over Google, with no answers.
I know people run successful non-temporary clusters on EC2. Has really no one needed to deal
with EC2 exposing external addresses instead of internal addresses before? This seems like it
should be a common problem.

On Aug 23, 2012, at 12:34 PM, igor Finkelshteyn wrote:

> Hi,
> I'm currently setting up a Hadoop cluster on EC2, and everything works just fine when
> accessing the cluster from inside EC2, but as soon as I try to do something like upload
> a file from an external client, I get timeout errors like:
>
> 12/08/23 12:06:16 ERROR hdfs.DFSClient: Failed to close file /user/some_file._COPYING_
> java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be
> ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.123.x.x:50010]
>
> What's clearly happening is that my NameNode is resolving my DataNodes' IPs to their
> internal EC2 values instead of their external values, and then sending the internal IPs
> along to my external client, which is obviously unable to reach them. I'm thinking this
> must be a common problem. How do other people deal with it? Is there a way to force my
> NameNode to send along my DataNodes' hostnames instead of IPs, so that the hostnames can
> be resolved properly from whatever box will be sending files?
> Eli
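
[For readers of the archive: the hostname-based behavior asked about above can typically be
enabled with the configuration knobs added by HDFS-3150, available in then-recent Hadoop
releases. A minimal hdfs-site.xml sketch, assuming each DataNode's hostname resolves to its
public IP from outside EC2 and to its private IP from inside (as EC2 public DNS names do):]

```xml
<!-- hdfs-site.xml sketch: prefer hostname-based DataNode addressing.
     Assumes a Hadoop release that includes HDFS-3150, and that DataNode
     hostnames resolve correctly from both inside and outside the cluster. -->
<configuration>
  <!-- Client side: connect to DataNodes using the hostname the NameNode
       reports, rather than the (internal) IP address. -->
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <!-- DataNode side: DataNodes likewise use hostnames when connecting
       to each other, e.g. for block replication. -->
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>
```

[With split-horizon DNS of this kind, external clients reach DataNodes via their public
addresses while intra-cluster traffic stays on the internal IPs.]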
