hadoop-common-user mailing list archives

From "Tom White" <tom.e.wh...@gmail.com>
Subject Re: Issue with cluster over EC2 and different AMI types
Date Wed, 19 Mar 2008 08:42:44 GMT
Unfortunately there is no way to discover the rack that EC2 instances
are running on, so you won't be able to use this optimization.
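For anyone on their own hardware where rack placement is known, the
dfs.network.script property points at an executable that Hadoop calls with
host names or IPs and that prints one rack path per argument. A minimal
sketch (the IP-to-rack mapping here is invented for illustration):

```shell
#!/bin/sh
# Hypothetical rack-mapping script for dfs.network.script.
# Hadoop passes one or more host names/IPs as arguments and expects
# one rack path per argument on stdout, in the same order.
for host in "$@"; do
  case "$host" in
    10.1.*) echo "/rack1" ;;        # hosts on the first rack's subnet
    10.2.*) echo "/rack2" ;;        # hosts on the second rack's subnet
    *)      echo "/default-rack" ;; # anything we can't place
  esac
done
```

On EC2 there is no subnet-to-rack correspondence you can rely on, which is
why this approach doesn't carry over.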

Tom

On 18/03/2008, Andrey Pankov <apankov@iponweb.net> wrote:
> Hi,
>
>  I apologize. It was my fault - I forgot to start the tasktracker on the
>  slaves. But can anyone share their experience of how to use racks?
>  Thanks.
>
>
>  Andrey Pankov wrote:
>  > Hi all,
>  >
>  > I'm trying to configure a Hadoop cluster over Amazon EC2: one m1.small
>  > instance for the master node, and some m1.large instances for slaves.
>  > Both the master's and the slaves' AMIs have the same version of
>  > Hadoop, 0.16.0.
>  >
>  > I run the EC2 instances using ec2-run-instances with the same --group
>  > parameter, but in two steps: one call to run the master, and a second
>  > call to run the slaves.
>  >
>  > It looks like EC2 instances with different AMI types start in
>  > different networks; compare, for example, the external and internal
>  > DNS names:
>  >
>  >   * ec2-67-202-59-12.compute-1.amazonaws.com
>  >     ip-10-251-74-181.ec2.internal - for small instance
>  >   * ec2-67-202-3-191.compute-1.amazonaws.com
>  >     domU-12-31-38-00-5C-C1.compute-1.internal - for large
>  >
>  > The trouble is that the slaves cannot contact the master. When I set
>  > the fs.default.name parameter in hadoop-site.xml on a slave box to the
>  > full DNS name of the master (either external or internal) and try to
>  > start the datanode on it (bin/hadoop-daemon.sh ... start datanode),
>  > Hadoop replaces fs.default.name with just 'ip-10-251-74-181' and puts
>  > in the log:
>  >
>  > 2008-03-18 07:08:16,028 ERROR org.apache.hadoop.dfs.DataNode:
>  > java.net.UnknownHostException: unknown host: ip-10-251-74-181
>  > ...
>  >
>  > So DataNode could not be started.
>  >
>  > I tried to add the IP address of ip-10-251-74-181 to /etc/hosts on
>  > each slave instance, and that allowed the DataNode to start on the
>  > slaves. It also became possible to store something in HDFS. But when I
>  > try to run a map-reduce job (from a jar file), it doesn't work. I mean
>  > that the job keeps running but makes no progress at all: Hadoop writes
>  > Map 0% Reduce 0% and just freezes.
>  >
>  > I cannot find anything in the logs that could help, either on the
>  > master or on the slave boxes.
>  >
>  > I found that dfs.network.script can be used to specify a network
>  > location for a machine, but I have no idea how to use it. Can racks
>  > help me here?
>  >
>  > Thanks in advance.
>  >
>  > ---
>  > Andrey Pankov
>  >
>  >
>
>
> ---
>
> Andrey Pankov
>
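P.S. For completeness, the /etc/hosts workaround described above amounts to
something like this on each slave (the IP is the one implied by the master's
internal host name in the example; substitute your own instance's values):

```shell
#!/bin/sh
# Illustrative sketch: map the master's short internal EC2 host name to
# its private IP on a slave, so the bare name 'ip-10-251-74-181' that
# Hadoop resolves actually resolves. Must be run as root.
MASTER_IP=10.251.74.181   # example private IP of the master instance
echo "$MASTER_IP ip-10-251-74-181 ip-10-251-74-181.ec2.internal" >> /etc/hosts
```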


-- 
Blog: http://www.lexemetech.com/
