hbase-user mailing list archives

From Steph Gosling <st...@chuci.org>
Subject Re: HBase cluster over multiple EC2 Availability Zones?
Date Mon, 06 May 2013 22:26:15 GMT

On Mon, 6 May 2013 14:47:49 -0400
Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Yup.  I'm suddenly turned off by that penny per GB! :)
> Ignoring the dollars, it sounds like one would just have to be OK with
> increased latencies, but technically nothing would break.  Doodling
> our architecture on paper here, I think we may as well just have
> complete, independent setups in multiple Regions then - I suspect
> those pennies add up faster than one would think.
> Otis

I run a couple of small HBase clusters (low double-digit nodes each),
and both span AZs in their respective regions (we're not doing any
inter-region stuff yet, nor do I expect to, TBH). What AWS don't tell
you is that not all instance types are available in all AZs,
particularly for the newer or more esoteric instances.

We care about this data, so any performance hit (not that we've
particularly noticed one) because of cross-AZ traffic is acceptable. We
do simple 'rack' awareness based on the AZ returned by the metadata
server; you could probably fine-tune that based on subnet if your
cluster got big, but we've not had the need to.
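
A minimal sketch of AZ-based rack awareness, not necessarily the exact
setup described above. It assumes a Hadoop topology script (the script
named by topology.script.file.name, or net.topology.script.file.name on
newer Hadoop), which receives hosts as arguments and prints one rack
path per host. The map file path /etc/hadoop/az-map.txt and its
"host az" line format are hypothetical. Each node could report its own
AZ with fetch_local_az(), which uses a plain IMDSv1-style request;
instances that enforce IMDSv2 would need a session token first.

```python
#!/usr/bin/env python3
"""Hypothetical Hadoop topology script: one rack per EC2 availability zone."""
import sys
import urllib.request

# EC2 instance metadata path that returns the local instance's AZ.
METADATA_URL = (
    "http://169.254.169.254/latest/meta-data/placement/availability-zone"
)
DEFAULT_RACK = "/default-rack"
# Assumed location of a "host az" map collected from the nodes.
AZ_MAP_PATH = "/etc/hadoop/az-map.txt"


def fetch_local_az(timeout=2.0):
    """Ask the metadata server which AZ this instance runs in."""
    with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
        return resp.read().decode("ascii").strip()


def load_az_map(lines):
    """Parse 'host az' pairs, skipping blank lines and '#' comments."""
    mapping = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) == 2:
            mapping[parts[0]] = parts[1]
    return mapping


def rack_for(host, az_by_host):
    """Return '/<az>' for a known host, else the default rack."""
    az = az_by_host.get(host)
    return "/" + az if az else DEFAULT_RACK


# Hadoop passes at least one host; do nothing when run without arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(AZ_MAP_PATH) as fh:
        az_map = load_az_map(fh)
    for host in sys.argv[1:]:
        print(rack_for(host, az_map))
```

The map file is needed because the metadata server only answers for the
instance making the request, while the topology script runs on the
master and is asked about other nodes.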

With regard to performance specifically, I've not looked explicitly,
but I'd expect that you'll see far more variance based on things like
instance size and your neighbours on the same host and their behaviour.

Finally, I'm also surprised about the inter-AZ data charges; that seems
to be a very widespread misconception, and yeah, I'd imagine the
pennies do add up...
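
A rough way to see how the pennies add up, as a sketch only. The
$0.01 per GB rate is the "penny per GB" quoted above; the traffic
volume is a made-up figure, and the function name is hypothetical.
Actual billing rules should be checked against current AWS pricing.

```python
def inter_az_cost(gb_per_day, days=30, rate_per_gb=0.01):
    """Estimate the cross-AZ transfer cost in dollars.

    rate_per_gb defaults to the penny per GB mentioned in the thread;
    gb_per_day is whatever cross-AZ volume the cluster really generates.
    """
    return gb_per_day * days * rate_per_gb


if __name__ == "__main__":
    # Hypothetical volume: 500 GB of cross-AZ traffic per day.
    print(f"${inter_az_cost(500):.2f} per 30 days")
```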


Steph Gosling <steph@chuci.org>
