hbase-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Blog post about when to use HBase
Date Tue, 13 May 2008 17:20:40 GMT
I think the determining factor in choosing HBase over plain HDFS files
is really the consumption pattern. If you're only ever going to process
the data in bulk, then chances are you'll get the best performance out
of a raw HDFS file. However, if you need random access to some of the
entries, then HBase will give you a significant benefit.
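
To make the access-pattern point concrete, here is a toy Python sketch. It is not HBase or HDFS code; the record layout and the function names are made up for illustration. It compares finding one entry by reading a sorted dataset front to back with finding it through a binary search over the keys.

```python
import bisect

# Toy dataset: records sorted by key, standing in for a sorted flat file.
records = [(f"row{i:05d}", i * i) for i in range(10_000)]
keys = [k for k, _ in records]

def scan_lookup(key):
    """Bulk-style access: read records in order until the key turns up."""
    reads = 0
    for k, v in records:
        reads += 1
        if k == key:
            return v, reads
    return None, reads

def indexed_lookup(key):
    """Random access: binary search over the sorted keys."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i][1]
    return None
```

If every job touches every record anyway, the scan costs nothing extra. The gap only matters when you want a handful of entries out of a large dataset.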

There are other factors that go into this decision. One that comes to
mind is whether you'd like to take advantage of the versioning and
semi-defined schema of HBase for your dataset. It would be a little
complicated to duplicate all of that logic on your own on top of a
flat file.
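
As a rough idea of what "all of that logic" involves, here is a minimal Python sketch of versioned cells. It is a toy model under my own assumptions, not HBase's API: the `VersionedTable` class, its `max_versions` parameter, and the example column name are all hypothetical.

```python
from collections import defaultdict

class VersionedTable:
    """Toy model: each (row, column) cell keeps several timestamped values."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row, column) -> [(timestamp, value), ...], newest first
        self.cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        versions = self.cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # newest first
        del versions[self.max_versions:]  # drop versions beyond the limit

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        for ts, value in self.cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

# Hypothetical usage: three writes to one cell, keeping two versions.
table = VersionedTable(max_versions=2)
table.put("user1", "info:email", "a@example.com", timestamp=1)
table.put("user1", "info:email", "b@example.com", timestamp=2)
table.put("user1", "info:email", "c@example.com", timestamp=3)
```

Even this small version has to decide how many versions to keep and how to answer "what was the value at time T". On a flat file you would also have to encode all of that in the record format.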

Another factor is your system's workflow. If you use HDFS files, you  
need to be OK with always rewriting the files to do any "updates". So  
even if you only add 1MB worth of new data to a 1TB dataset, you have  
to rewrite the whole thing. HBase would let you "insert" it where it  
belongs. (Of course, HBase has the same constraints as your  
applications do, except we've already done the work to manage random  

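The rewrite cost can be sketched in a few lines of Python. This is a toy model only, with hypothetical function names, and it says nothing about how HBase stores data internally. It counts how many records get written when a small batch is merged into a sorted flat dataset, versus how many the caller writes when a store handles placement.

```python
import bisect
import heapq

def rewrite_flat_file(existing, new_records):
    """Merge new records into a sorted flat dataset by writing a fresh copy."""
    merged = list(heapq.merge(existing, sorted(new_records)))
    return merged, len(merged)  # every record is written again

def insert_into_store(store, new_records):
    """The caller hands over only the new records; placement is the store's job."""
    for record in new_records:
        bisect.insort(store, record)
    return len(new_records)

existing = [(f"row{i:05d}", "old") for i in range(0, 10_000, 2)]
new = [("row00001", "new"), ("row00003", "new")]

merged, written_flat = rewrite_flat_file(existing, new)
store = list(existing)
written_store = insert_into_store(store, new)
```

Both paths end with the same sorted data. The flat-file path writes all 5,002 records to add 2, which is the 1MB-into-1TB situation in miniature.
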
Does this help you out?


On May 13, 2008, at 10:13 AM, Naama Kraus wrote:

> Hi,
> Can anyone say a few words on when to use HBase as opposed to using
> plain MapReduce on input files?
> In more detail, when does it make sense to put data into HBase and
> then use HBase methods to access it, including running MapReduce on
> the data in the tables, as opposed to simply putting the data into
> HDFS and processing it with MapReduce?
> Thanks, Naama
> On Wed, Mar 12, 2008 at 12:15 AM, Bryan Duxbury <bryan@rapleaf.com> wrote:
>> I've written up a blog post discussing when I think it's appropriate
>> to use HBase in response to some of the questions people usually ask.
>> You can find it at http://blog.rapleaf.com/dev/?p=26.
>> -Bryan
> -- 
> oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00 oo 00  
> oo 00 oo
> 00 oo 00 oo
> "If you want your children to be intelligent, read them fairy tales.
> If you want them to be more intelligent, read them more fairy tales."
> (Albert Einstein)
