hadoop-common-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Hadoop on EC2 - public AMIs in hadoop-images
Date Mon, 07 Sep 2009 16:40:40 GMT
On Mon, Sep 7, 2009 at 9:25 AM, John Clarke <clarkemjj@gmail.com> wrote:

> Thanks Todd, when you said "The EC2 scripts will boot Cloudera's
> distribution for Hadoop." were you referring to the EC2 scripts that come
> with Hadoop or Cloudera's python scripts?
>
>
Sorry that wasn't clear -- I meant Cloudera's scripts. They are very similar
to those in the Apache distribution, but are updated more often since they're
on a separate release schedule. We try to contribute the changes back where
possible.


> Are there advantages to using Cloudera's scripts over the stock ones for a
> fairly standard job - i.e. up to 20 nodes and using S3 for input and
> output.
>
>
You'll take advantage of patches in our distribution that aren't in the
stock release. We include a number of patches (many written by the Amazon
guys) that are specific to the AWS/S3 environment. These are all currently
in trunk, but some will not be in an Apache release until 0.21. There's a
partial list of these patches about 2/3 down this page:
http://www.cloudera.com/hadoop-manifest

If you have more questions specific to our distribution, the best forum is
http://getsatisfaction.com/cloudera/products/cloudera_cloudera_s_distribution_for_hadoop

-Todd



2009/9/7 Todd Lipcon <todd@cloudera.com>

> On Mon, Sep 7, 2009 at 8:14 AM, John Clarke <clarkemjj@gmail.com> wrote:
>
> > Thanks for the replies, I've been developing against 0.18.3 under
> > Windows XP and testing on Ubuntu.
> >
> > That seems like a long list of changes from 0.18.3! Should I expect any
> > specific issues if I try Cloudera's version on EC2 seeing as I have only
> > tested against the stock 0.18.3?
> >
> >
> Everything should work fine if you've tested on Apache 0.18.3 - we guarantee
> API compatibility between our distribution and Apache's release. Sometimes
> we may add APIs, but nothing should be removed or altered. In our current
> "testing" level release (CDH2) we're now also running jdiff between the
> stock Apache release and our own so as to verify the above guarantee.
>
> -Todd
>
> >
> >
> >
> >
> > 2009/9/7 Todd Lipcon <todd@cloudera.com>
> >
> > > Hi,
> > >
> > > The EC2 scripts will boot Cloudera's distribution for Hadoop. Currently
> > > they boot our distribution of 0.18.3, but 0.20 support should be ready
> > > pretty soon now. Here's a list of what patches are in our newest 0.18.3
> > > distribution:
> > >
> > > http://archive.cloudera.com/cdh/testing/hadoop-0.18.3+70.CHANGES.txt
> > >
> > > -Todd
> > >
> > > On Mon, Sep 7, 2009 at 7:30 AM, tim robertson <timrobertson100@gmail.com> wrote:
> > >
> > > > I can recommend the Cloudera EC2 images. I am not sure what version
> > > > they are built on right now, but I believe they pick stable ones and
> > > > apply critical patches.
> > > > http://www.cloudera.com/hadoop-ec2
> > > >
> > > > Cheers,
> > > > Tim
> > > >
> > > > On Mon, Sep 7, 2009 at 7:08 AM, John Clarke <clarkemjj@gmail.com> wrote:
> > > > > Hi,
> > > > >
> > > > > I am planning on running my MapReduce app on Amazon's EC2. I had a
> > > > > look at the public Hadoop images in the hadoop-images bucket and
> > > > > there is no image for the stable 0.18.3 release. The most recent
> > > > > Hadoop versions I see are 0.18.1 and 0.19.0. Which of those would be
> > > > > better to use? Or should I try and create my own AMI from one of the
> > > > > existing ones with the stable 0.18.3?
> > > > >
> > > > > Who owns the AMIs in hadoop-images? If they are owned by the Hadoop
> > > > > project I'm surprised there isn't one for the stable 0.18.3 or a
> > > > > later 0.19.x.
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > >
> > >
> >
>
