hadoop-common-user mailing list archives

From Mark Kerzner <markkerz...@gmail.com>
Subject Re: Time to build my own cluster - advice?
Date Thu, 05 Nov 2009 18:16:28 GMT
Philip,

It looks like a doable step up in control for me: instead of using a pre-built
service, roll my own from pre-built components. I should have more flexibility
that way.
Thank you for the links.

Mark

On Thu, Nov 5, 2009 at 11:16 AM, Philip Zeyliger <philip@cloudera.com> wrote:

> It's not too bad.  There are some notes at
> http://wiki.apache.org/hadoop/AmazonEC2, and some code in common's
> contrib directory:
> http://svn.apache.org/repos/asf/hadoop/common/trunk/src/contrib/ec2/
>
> Cloudera (my employer) publishes some scripts at
> http://archive.cloudera.com/docs/ec2.html that make it quite easy to
> get started.  It's a set of Python scripts that, given the appropriate
> credentials, start and stop a cluster.  There are some hooks
> (http://archive.cloudera.com/docs/_customization.html) to trigger
> installation of custom packages.  Behind the scenes, AMIs are
> started, and each one reads a script from its "user_data" parameter
> and invokes it at boot time.  This script knows enough to configure
> the cluster, and is easily customizable.
>
> -- Philip
>
> On Thu, Nov 5, 2009 at 9:09 AM, Mark Kerzner <markkerzner@gmail.com>
> wrote:
> > Hi,
> >
> > so far I've been using Amazon MapReduce. However, my app uses a growing
> > number of Linux packages. I have been installing them on the fly, in the
> > Mapper.configure(), but with OpenOffice this is hard, and I don't get a
> > service connection even after local install.
> >
> > Therefore, I'm looking for advice on creating my own Hadoop cluster
> > out of EC2 machines. Are there instructions? How hard is it? What are the
> > best practices?
> >
> > Thank you,
> > Mark
> >
>
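
The "user_data" mechanism Philip describes can be sketched roughly as follows.
This is only an illustration, not the Cloudera scripts themselves: the function
names, the /etc/cluster-role file, the use of apt-get (which assumes a
Debian-style AMI), and the example host name are all assumptions made for the
sketch. It builds the boot-time shell script as a string and base64-encodes it,
which is the form the raw EC2 API expects for user data (client libraries
typically do the encoding for you).

```python
import base64


def build_user_data(packages, role="slave", master_host=None):
    """Return a boot-time shell script as text.

    Hypothetical layout: the real scripts decide for themselves how a
    node learns its role and where the master lives.
    """
    if role not in ("master", "slave"):
        raise ValueError("role must be 'master' or 'slave'")
    if role == "slave" and not master_host:
        raise ValueError("a slave needs the master's host name")

    lines = [
        "#!/bin/bash",
        "set -e",
        "# Install the extra Linux packages the job depends on",
        "# (assumes a Debian-style image with apt-get).",
        "apt-get update -y",
    ]
    if packages:
        lines.append("apt-get install -y " + " ".join(packages))

    # Record the node's role in a hypothetical file that later
    # configuration steps could read.
    lines.append("echo 'role=%s' > /etc/cluster-role" % role)
    if master_host:
        lines.append("echo 'master=%s' >> /etc/cluster-role" % master_host)

    return "\n".join(lines) + "\n"


def encode_user_data(script):
    """Base64-encode the script for passing as instance user data."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Example: a worker node that needs OpenOffice installed at boot.
    script = build_user_data(
        ["openoffice.org"],
        role="slave",
        master_host="master.example.internal",  # placeholder host name
    )
    print(script)
    print(encode_user_data(script))
```

Installing packages this way happens once per machine at boot, rather than on
every task start as with the Mapper.configure() approach.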
