hadoop-common-user mailing list archives

From Aaron Eng <a...@maprtech.com>
Subject Re: Hadoop/Elastic MR on AWS
Date Thu, 09 Dec 2010 18:57:55 GMT
Pros:
- Easier to build out and tear down clusters vs. using physical machines in
a lab
- Easier to scale up and scale down a cluster as needed

Cons:
- Reliability.  In my experience I've had machines die, had machines fail to
start up, had network outages between Amazon instances, etc.  These problems
have occurred at a far higher rate than in any physical lab I have ever
administered.
- Money. You get charged for problems with their system.  Need to add
storage space to a node?  That means renting space from EBS which you then
need to actually spend time formatting to ext3 so you can use it with
Hadoop.  So every time you want to use storage, you're paying Amazon for the
time spent formatting it, because you can't tell EBS that you want an ext3
volume.
- Visibility.  Amazon loves to report on their website that all their
services are working properly.  Meanwhile, the reality is that they only
report issues if they are extremely major.  Just yesterday they reported
"increased latency" on their us-east-1 region.  In reality, "increased
latency" meant more than 50% of my Amazon API calls were timing out, I could
not create new instances, and for about 2 hours I could not destroy the
instances I had already spun up.  How's that for ya?  Paying them for
machines that they won't let me terminate...

This applies to both EMR and clusters you'd create yourself in EC2.  So if
you're willing to put up with not having much control over or insight into
the environment you're using, Amazon may be a good bet.  But don't expect it
to be all rainbows and daisies.  You will run into problems at various points
which you did not cause and cannot correct yourself, and you'll have to wait
for Amazon to get their environment functioning.
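
One common way to cope with API calls that time out, as described above, is
to retry them with exponential backoff instead of failing on the first error.
The following is only a minimal sketch, not anything specific to Amazon's
tooling: `call_with_backoff`, its parameters, and the `call` argument are
hypothetical names chosen for illustration, and `call` stands in for whatever
function issues a single API request (for example, a terminate request).

```python
import random
import time


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Run call() until it succeeds, backing off exponentially between tries.

    `call` is a hypothetical zero-argument function wrapping one API request;
    it is assumed to raise an exception on a timeout or failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            # Broad catch for the sketch; real code should catch only the
            # errors that are known to be transient.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # The delay cap doubles each attempt: base, 2*base, 4*base, ...
            # but never exceeds max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Sleep a random amount in [0, cap] so many clients do not
            # all retry at the same moment.
            sleep(random.uniform(0, cap))
```

The `sleep` parameter exists only so the waiting can be swapped out in tests.
Retrying does not help during a long outage like the 2 hours mentioned above,
but it can smooth over shorter bursts of timeouts.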

On Thu, Dec 9, 2010 at 8:17 AM, Mark <static.void.dev@gmail.com> wrote:

> Does anyone have any thoughts/experiences on running Hadoop in AWS? What
> are some pros/cons?
> Are there any good AMI's out there for this?
> Thanks for any advice.
