hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: HBase / HDFS on EBS?
Date Tue, 04 Jan 2011 18:43:25 GMT
I don't have a whole lot of recent HBase on EBS experience, but when I
did run it, my main issue was that some EBS volumes would occasionally
become unavailable.

The way I see it is that you have an additional moving part in your
whole stack, thus there's a chance it will generate a new set of
problems (compared to using local disks).

J-D

On Tue, Jan 4, 2011 at 9:43 AM, Otis Gospodnetic
<otis_gospodnetic@yahoo.com> wrote:
> Hi,
>
> What do people think about running HBase / HDFS off of EBS on EC2?  That is,
> having HBase/HDFS keep the data on EBS.
> I was surprised not to find a lot of discussion around that:
>  http://search-hadoop.com/?q=%2Bebs+%2Bhdfs
>
> Here are my thoughts/questions:
>
> * Supposedly ephemeral disks can be faster, but EC2 claims EBS is faster.
> People who benchmarked EBS mention its performance varies a lot.  Local disks
> suffer from the noisy neighbour problem, no?
>
> * EBS disks are not local.  They are far from the CPU.  What happens with data
> locality if you have data on EBS?
>
> * MR jobs typically read and write a lot.  I wonder if this ends up being very
> expensive?
>
> * Data on ephemeral disks is lost when an instance terminates.  Do people really
> rely purely on having N DNs and high enough replication factor to prevent data
> loss?
>
> * With EBS you could just create a larger volume when you need more disk space
> and attach it to your existing DN.  If you are running out of disk space on
> local disks, what are the options?  Do you have to launch more EC2 instances
> even if all you need is disk space, not more CPUs?
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
