hadoop-general mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Performance of EC2
Date Fri, 29 Jan 2010 12:00:39 GMT
Something Something wrote:
> Wow.. how naive I am to think that I could trust Amazon.  Thanks for
> forwarding the links, Patrick.  Seems like Amazon's reliability has gone
> down considerably over the past few months.  (Occasionally my instances fail
> on startup or die in the middle for no apparent reason, and I used to think
> I was doing something dumb!)

That's unfair. Large datacentres are inherently unreliable, because we
build them out of "normal availability" stuff rather than HA
hardware. This then pushes the problem of availability down to the
applications, to you.

* Most of the problems people have been discussing are bandwidth issues; 
it may be that AWS is coming under some massive DDoS attack and you are 
seeing the fringes of it. It could be that your neighbours are noisy,
but if you are running big Hadoop jobs, you are the noisy neighbour.

* A more likely problem for you is where your machines are placed. If
they all share a single switch, you get very high bandwidth between
them. But if they are on different racks, the network becomes the
bottleneck.
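
To put rough numbers on the placement point, here is a minimal sketch in
Python. The function name and all figures (20 nodes per rack, 1 Gbit/s
NICs, a 4 Gbit/s rack uplink) are hypothetical, chosen for illustration;
they are not AWS figures. It assumes a non-blocking switch inside the rack
and the worst case where every node in the rack sends at once.

```python
def per_node_bandwidth(nodes_per_rack, nic_gbps, uplink_gbps, cross_rack=True):
    """Rough worst-case bandwidth (Gbit/s) per node when all nodes transmit."""
    if nodes_per_rack <= 0:
        raise ValueError("nodes_per_rack must be positive")
    if not cross_rack:
        # Same switch (assumed non-blocking): each node is limited by its NIC.
        return nic_gbps
    # Different racks: every node in the rack shares the one uplink.
    return min(nic_gbps, uplink_gbps / nodes_per_rack)


# Hypothetical figures, for illustration only.
same_switch = per_node_bandwidth(20, 1.0, 4.0, cross_rack=False)
across_racks = per_node_bandwidth(20, 1.0, 4.0)
print(f"same switch:  {same_switch:.2f} Gbit/s per node")
print(f"across racks: {across_racks:.2f} Gbit/s per node")
```

Under those assumptions the per-node share drops by a factor of five once
traffic has to leave the rack, which is the kind of gap a shuffle-heavy
Hadoop job would notice.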

> But what I don't understand is this... if I *reserve* an instance then I
> wouldn't be sharing its CPU with anyone, right?  The blog seems to indicate
> otherwise.

I think you only get exclusive use of a CPU when you rent an XL node.
Reservations are a form of capacity planning; they may or may not help
with scheduling at all.
