Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-user@hadoop.apache.org
References: <C93E2652.50654%Sudhir.Vallamkondu@icrossing.com>
From: James Seigel <james@tynt.com>
In-Reply-To: <C93E2652.50654%Sudhir.Vallamkondu@icrossing.com>
Mime-Version: 1.0 (iPhone Mail 8C148)
Date: Mon, 27 Dec 2010 12:04:01 -0700
Message-ID: <-5254625518205172075@unknownmsgid>
Subject: Re: Hadoop/Elastic MR on AWS
To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>

Thank you for sharing.

Sent from my mobile. Please excuse the typos.

On 2010-12-27, at 11:18 AM, Sudhir Vallamkondu
<Sudhir.Vallamkondu@icrossing.com> wrote:

> We recently crossed this bridge, and here are some insights. We did an
> extensive study comparing costs and benchmarking a local cluster vs. EMR
> for our current needs and future trends.
>
> - The scalability you get with EMR is unmatched, although you need to look
> at your requirements and decide whether this is something you need.
>
> - When using EMR, it's cheaper to use reserved instances than nodes started
> on the fly. You can always add more nodes when required. I suggest looking
> at your current computing needs, reserving instances for a year or two,
> using these to run EMR, and adding nodes at peak times. In your cost
> estimation you will need to factor in the data transfer time/costs unless
> you are dealing with public datasets on S3.
>
> - EMR fared similarly to a local cluster on CPU benchmarks (we used MRBench
> to benchmark map/reduce); however, IO benchmarks were slow on EMR (we used
> the DFSIO benchmark). For IO-intensive jobs you will need to add more nodes
> to compensate for this.
>
> - When comparing with a local cluster, you will need to factor in the time
> it takes for the EMR cluster to set up when starting a job. This includes
> things like data transfer time, cluster replication time, etc.
>
> - The EMR API is very flexible; however, you will need to build a custom
> interface on top of it to suit your job management and monitoring needs.
>
> - EMR bootstrap actions can satisfy most of your native lib needs so no
> drawbacks there.
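
The reserved-vs-on-the-fly comparison in the second point can be roughed out
with a small cost model. This is only a sketch: the function name
`yearly_cost`, its parameters, and any numbers you feed it are hypothetical
placeholders, not AWS rates. It assumes a fixed baseline of reserved nodes
running all year, extra nodes billed hourly during peaks, and a flat
per-GB transfer charge.

```python
def yearly_cost(baseline_nodes, peak_nodes, peak_hours,
                reserved_upfront, reserved_hourly, on_demand_hourly,
                transfer_gb=0.0, transfer_per_gb=0.0):
    """Estimate one year of cluster cost under two purchasing strategies.

    All prices are caller-supplied placeholders (assumptions, not AWS rates).
    Returns (reserved_plus_burst, all_on_demand).
    """
    hours_per_year = 24 * 365

    # Extra nodes added only during peak periods, billed at the hourly rate.
    extra_nodes = max(peak_nodes - baseline_nodes, 0)
    burst = extra_nodes * peak_hours * on_demand_hourly

    # Data transfer is paid under either strategy.
    transfer = transfer_gb * transfer_per_gb

    # Strategy 1: reserve the baseline (upfront fee plus a lower hourly rate).
    reserved = (baseline_nodes
                * (reserved_upfront + reserved_hourly * hours_per_year)
                + burst + transfer)

    # Strategy 2: pay the regular hourly rate for the baseline all year.
    on_demand = (baseline_nodes * on_demand_hourly * hours_per_year
                 + burst + transfer)
    return reserved, on_demand
```

Calling it with your own baseline, peak hours, and current prices gives two
totals to compare; setting `transfer_gb` to zero models the public-dataset
case mentioned above.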
>
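
For the IO point, one back-of-the-envelope way to size the extra nodes is to
match aggregate throughput. The sketch below assumes throughput scales
linearly with node count, which is an idealization, and the function name
`nodes_to_match_io` and its inputs are hypothetical; the per-node figures
would come from your own DFSIO runs.

```python
import math


def nodes_to_match_io(local_nodes, local_mb_per_sec, emr_mb_per_sec):
    """Nodes needed on the slower cluster to match aggregate IO throughput.

    Assumes aggregate throughput grows linearly with node count; real
    clusters usually scale less cleanly, so treat the result as a floor.
    """
    if local_mb_per_sec <= 0 or emr_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    # Total throughput the local cluster delivers today.
    target = local_nodes * local_mb_per_sec
    # Round up: a fractional node still means one more instance.
    return math.ceil(target / emr_mb_per_sec)
```

Feeding the result into your cost estimate shows whether the extra nodes
erase the savings for IO-heavy jobs.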
>
> -- Sudhir
>
>
> On 12/26/10 5:26 AM, "common-user-digest-help@hadoop.apache.org"
> <common-user-digest-help@hadoop.apache.org> wrote:
>
>> From: Otis Gospodnetic <otis_gospodnetic@yahoo.com>
>> Date: Fri, 24 Dec 2010 04:41:46 -0800 (PST)
>> To: <common-user@hadoop.apache.org>
>> Subject: Re: Hadoop/Elastic MR on AWS
>>
>> Hello Amandeep,
>>
>>
>>
>> ----- Original Message ----
>>> From: Amandeep Khurana <amansk@gmail.com>
>>> To: common-user@hadoop.apache.org
>>> Sent: Fri, December 10, 2010 1:14:45 AM
>>> Subject: Re: Hadoop/Elastic MR on AWS
>>>
>>> Mark,
>>>
>>> Using EMR makes it very easy to start a cluster and add/reduce capacity as
>>> and when required. There are certain optimizations that make EMR an
>>> attractive choice as compared to building your own cluster out. Using EMR
>>
>>
>> Could you please point out what optimizations you are referring to?
>>
>> Thanks,
>> Otis
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>> Hadoop ecosystem search :: http://search-hadoop.com/
>>
>>> also ensures you are using a production quality, stable system backed by the
>>> EMR engineers. You can always use bootstrap actions to put your own tweaked
>>> version of Hadoop in there if you want to do that.
>>>
>>> Also, you don't have to tear down your cluster after every job. You can set
>>> the alive option when you start your cluster and it will stay there even
>>> after your Hadoop job completes.
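
As an illustration of the bootstrap-action and keep-alive settings described
above, here is a sketch that builds a request in the shape accepted by
`run_job_flow` in boto3, a Python SDK that postdates this thread. The helper
name `build_job_flow_request`, the action name "custom-setup", and any script
path you pass are hypothetical. Nothing is sent to AWS, and a real call would
need further fields (for example a release label and IAM roles).

```python
def build_job_flow_request(name, instance_type, instance_count,
                           keep_alive=True, bootstrap_script=None):
    """Build keyword arguments shaped like boto3's EMR run_job_flow input.

    Only constructs a dict; it is deliberately incomplete and makes no
    network call.
    """
    request = {
        "Name": name,
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            # Keep the cluster running after the last step finishes.
            "KeepJobFlowAliveWhenNoSteps": keep_alive,
        },
    }
    if bootstrap_script is not None:
        # A bootstrap action runs a script on each node at startup, e.g. to
        # install native libraries or a tweaked Hadoop build.
        request["BootstrapActions"] = [{
            "Name": "custom-setup",  # hypothetical label
            "ScriptBootstrapAction": {"Path": bootstrap_script, "Args": []},
        }]
    return request
```

With boto3 installed and credentials configured, the result could be passed
as `boto3.client("emr").run_job_flow(**request)` once the missing fields are
filled in. A cluster started with keep-alive set keeps billing until you
terminate it.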
>>>
>>> If you face any issues with EMR, send me a mail offline and I'll be happy to
>>> help.
>>>
>>> -Amandeep
>>>
>>>
>>> On Thu, Dec 9, 2010 at 9:47 PM, Mark <static.void.dev@gmail.com> wrote:
>>>
>>>> Does anyone have any thoughts/experiences on running Hadoop in AWS? What
>>>> are some pros/cons?
>>>>
>>>> Are there any good AMIs out there for this?
>>>>
>>>> Thanks for any advice.
>>>>
>>>
>