hadoop-common-user mailing list archives

From "Ahad Rana" <ahadr...@gmail.com>
Subject Re: Accessing S3 with Hadoop?
Date Thu, 06 Sep 2007 21:07:16 GMT
Hi Toby / Tom,

The failure to copy during the reduce phase is one of the side effects of
the bug described in HADOOP-1783. You should apply Tom's patch and then see
whether the failure still occurs.
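
If you haven't applied a JIRA patch before, the usual workflow is
roughly the following (the patch file name here is illustrative; grab
the attachment from the HADOOP-1783 issue page):

# from the top of your Hadoop source tree
cd ~/code/hadoop-0.14.0
patch -p0 < HADOOP-1783.patch   # attachment downloaded from JIRA
ant jar                         # rebuild the Hadoop core jar

Then push the rebuilt jar out to your cluster nodes before re-running
the job.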

Thanks,

Ahad.

On 9/6/07, Toby DiPasquale <codeslinger@gmail.com> wrote:
>
> On 9/6/07, Tom White <tom.e.white@gmail.com> wrote:
> > > Yeah, I actually read all of the wiki and your article about using
> > > Hadoop on EC2/S3 and I can't really find a reference to the S3 support
> > > not being for "regular" S3 keys. Did I miss something or should I
> > > update the wiki to make it more clear (or both)?
> >
> > I don't think this is explained clearly enough, so please do update
> > the wiki. Thanks.
>
> I just updated the page to add a Notes section explaining the issue
> and referencing the JIRA issue # you mentioned earlier.
>
> > > Also, the instructions on the EC2 page of the wiki no longer work:
> > > due to the kind of NAT Amazon uses, the slaves can't connect to the
> > > master using an externally-resolved IP address via a DNS name. That
> > > is, if you point DNS at the external IP of your master instance,
> > > your slaves can resolve that address but cannot then connect to it.
> > > So I had to alter the launch-hadoop-cluster and start-hadoop
> > > scripts, merging them to just pick the master and use its EC2-given
> > > name as the $MASTER_HOST, roughly as sketched below.
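> > >
> > > (A sketch of that selection, assuming the 2007-era
> > > ec2-describe-instances output, where the INSTANCE line carries the
> > > public DNS name in field 4 and the state in field 6:)
> > >
> > > # take the first running instance's EC2-given public DNS name
> > > MASTER_HOST=`ec2-describe-instances | grep INSTANCE | \
> > >   grep running | head -1 | awk '{print $4}'`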
> >
> > This sounds like the problem fixed in
> > https://issues.apache.org/jira/browse/HADOOP-1638 in 0.14.0, which is
> > the version you're using, isn't it?
> >
> > Are you able to do 'bin/hadoop-ec2 launch-cluster' and then, on your
> > workstation, run:
> >
> > . bin/hadoop-ec2-env.sh
> > ssh $SSH_OPTS "root@$MASTER_HOST" "sed -i -e
> > \"s/$MASTER_HOST/\$(hostname)/g\"
> > /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml"
> >
> > and then check to see if the master host has been set correctly (to
> > the internal IP) in the master host's hadoop-site.xml.
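> >
> > For example, something like this (assuming the standard AMI layout):
> >
> > ssh $SSH_OPTS "root@$MASTER_HOST" "grep -A 1 fs.default.name \
> >   /usr/local/hadoop-$HADOOP_VERSION/conf/hadoop-site.xml"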
>
> Well, no, since my $MASTER_HOST is now just the external DNS name of
> the first instance started in the reservation; that step is performed
> as part of my launch-hadoop-cluster script. In any case, the value is
> not set to the internal IP, but rather to the hostname portion of the
> internal DNS name.
>
> Currently, my MR jobs are failing because the reducers can't copy the
> map output and I'm thinking it might be because there is some kind of
> external address getting in there somehow. I see connections to
> external IPs in netstat -tan (72.* addresses). Any ideas about that?
> In the hadoop-site.xml files on the slaves, the address is the
> external DNS name of the master (ec2-*), but that resolves to the
> internal 10/8 address like it should.
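>
> (A quick check from one of the slaves, with $MASTER_HOST standing in
> for the master's ec2-* public name:)
>
> host $MASTER_HOST            # should print the internal 10/8 address
> netstat -tan | grep ' 72\.'  # any external peers still connected?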
>
> > Also, what version of the EC2 tools are you using?
>
> black:~/code/hadoop-0.14.0/src/contrib/ec2> ec2-version
> 1.2-11797 2007-03-01
> black:~/code/hadoop-0.14.0/src/contrib/ec2>
>
> > > I also updated the scripts
> > > to only look for a given AMI ID and only start/manage/terminate
> > > instances of that AMI ID (since I have others I'd rather not have
> > > terminated just on the basis of their AMI launch index ;-)).
> >
> > Instances are terminated on the basis of their AMI ID since 0.14.0.
> > See https://issues.apache.org/jira/browse/HADOOP-1504.
>
> I felt this was unsafe as it was, since it looked up an image by name
> and then resolved that name to an AMI ID. I just hacked it so you have
> to put the AMI ID in hadoop-ec2-env.sh. Also, the script as it is
> right now doesn't grep for 'running', so it may shut down instances
> that are starting up in another cluster (see the sketch below). I may
> just be paranoid, however ;)
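>
> For reference, the extra 'running' filter would look something like
> this ($AMI_ID is the variable I put in hadoop-ec2-env.sh, and the
> INSTANCE field positions are assumed):
>
> # terminate only *running* instances of the configured AMI
> ec2-describe-instances | grep INSTANCE | grep "$AMI_ID" | \
>   grep running | awk '{print $2}' | xargs ec2-terminate-instances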
>
> --
> Toby DiPasquale
>
