hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Hbase on Amazon S3?
Date Mon, 16 Nov 2009 18:06:21 GMT
The new scripts in trunk at src/contrib/ec2 will offer this approach soon. Right now they simply
back HDFS with instance storage (volatile) and rely on fewer instances than the HDFS replication
factor (default = 3) crashing or terminating at one time. Using EBS is a big win for
its persistence and transparent background snapshot facility. One thing our scripts will have
to deal with, though, is how to back a cluster of 100 or so nodes with EBS volumes while also
supporting elastic operation, creating volumes on the fly as necessary.
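
To illustrate the replication point: with volatile instance storage, a block is only lost when
every node holding one of its replicas goes away before HDFS can re-replicate it. The sketch
below is a toy model, not HDFS code; the function name, node names, and block placement are
all made up for illustration, and it ignores re-replication entirely.

```python
def lost_blocks(block_replicas, failed_nodes):
    """Return the ids of blocks whose replicas all sit on failed nodes."""
    failed = set(failed_nodes)
    return sorted(
        block_id
        for block_id, nodes in block_replicas.items()
        if set(nodes) <= failed
    )


# Hypothetical placement: three blocks, replication factor 3, five nodes.
placement = {
    "blk_1": ["n1", "n2", "n3"],
    "blk_2": ["n2", "n3", "n4"],
    "blk_3": ["n3", "n4", "n5"],
}

# Two simultaneous failures leave at least one replica of every block.
two_down = lost_blocks(placement, ["n2", "n3"])

# Three simultaneous failures can wipe out every replica of a block.
three_down = lost_blocks(placement, ["n2", "n3", "n4"])

print(two_down)
print(three_down)
```

In this made-up placement, losing n2 and n3 together loses nothing, while losing n2, n3, and n4
together loses blk_2.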

Also in the cards is performance and stability testing with HBase root filesystem on Hadoop's
S3N fs (http://wiki.apache.org/hadoop/AmazonS3). I tried some limited testing with the S3
fs option just for basic filesystem operations -- albeit on a 209 GB file -- and had an unhappy
result, so I will avoid that for now. Some time ago Clint Morgan ran a simple performance comparison,
and here were his results: http://markmail.org/message/xqhwgdw25oi7u3rb
"So to summarize:
loading data: almost twice as slow
A long scan is about 1.5 times slower
short scans are over an order of magnitude slower
and random reads (done on the sorted "scan") are over 2 orders of
magnitude slower"
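
For anyone who wants to experiment with S3N anyway, pointing HBase at it is a configuration
matter. The sketch below only generates a Hadoop-style configuration fragment. The bucket name
and credential placeholders are hypothetical, and whether these properties belong in
hbase-site.xml or in the Hadoop site file for your setup is an assumption to check against
the Hadoop AmazonS3 wiki page.

```python
import xml.etree.ElementTree as ET


def hbase_site_xml(properties):
    """Render a dict of settings as Hadoop-style <configuration> XML."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; substitute your own bucket and AWS credentials.
settings = {
    "hbase.rootdir": "s3n://example-bucket/hbase",
    "fs.s3n.awsAccessKeyId": "YOUR_ACCESS_KEY_ID",
    "fs.s3n.awsSecretAccessKey": "YOUR_SECRET_ACCESS_KEY",
}

xml_text = hbase_site_xml(settings)
print(xml_text)
```

Given the numbers quoted above, this is probably only worth trying for testing, not for
anything latency sensitive.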

Fairly soon we should have a replacement for the HBase S3 related page up on
the wiki. In the meantime, consider perusing http://www.google.com/search?hl=en&q=hbase+s3

    - Andy




________________________________
From: Vaibhav Puranik <vpuranik@gmail.com>
To: hbase-user@hadoop.apache.org
Sent: Mon, November 16, 2009 9:46:52 AM
Subject: Re: Hbase on Amazon S3?

We have had HBase 0.20.0 running on EC2 with EBS since July 2009.
We are using m1.large machines for all 4 nodes.

All of our data resides on EBS. This helps us in backing up the
data. This also helps us in bringing up a separate cluster with the same
data for QA purposes.

So far no problems.

If you have any specific questions please let us know.

Regards,
Vaibhav Puranik
Gumgum



On Mon, Nov 16, 2009 at 9:38 AM, Something Something <
mailinglists19@gmail.com> wrote:

> Anyone installed HBase on S3 (or EC2 for that matter)?  Any pointers would
> be greatly appreciated.  Thanks.
>



      