hbase-user mailing list archives

From Nicolas Spiegelberg <nspiegelb...@fb.com>
Subject Re: Question about HBase for OLTP
Date Mon, 09 Jan 2012 23:18:39 GMT

1) Eventual Consistency isn't a problem here.  HBase is a strict
consistency system.  Maybe you have us confused with other Dynamo-based
Open Source projects?
2) MySQL and other traditional RDBMS systems are definitely a lot more
solid, well-tested, and subtly tuned than HBase.  The vast majority (if
not all) of database systems developed in the past decade have this
problem.  HBase has 2 main advantages over a traditional RDBMS for OLTP
workloads:
  A. Large-scale workloads : Facebook Messages has a constantly growing
set of data that is 1PB+.  And we're growing at 250MB/month.  That is hard
to manage with a traditional RDBMS.  Logical database sharding is
extremely useful.
  B. Write-dominated workloads : Examples like time-series databases, user
analytics, etc. are very write-heavy.  An LSM-tree (LSMT) approach is
architecturally better suited to them than a B-tree approach.  Having done
system testing internally, we already see an IOPS advantage with HBase
over MySQL in writes.
3) A big question is what you need out of a database system.  Most web
companies are worried about the 'large-scale workloads' problem if their
site becomes popular, so a working familiarity with a distributed database
system for less mission-critical applications is worthwhile even if the
performance and reliability aren't there yet.
4) If you have any mission-critical data, you really should think about a
disaster recovery plan outside of HBase, which is not as critical with a
traditional RDBMS.  Facebook Messages ends up using Scribe as a backup
mechanism.  We are currently working on HBase Snapshots to allow disaster
recovery with HBase alone, but you shouldn't bet on it being
completed within your timeframe.
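
To make point 2B a little more concrete, here is a rough sketch of why a
log-structured merge approach favors writes.  It is a toy model only, not
HBase code: the class name ToyLSM, its methods, and the flush_threshold
parameter are all made up for illustration.  A write lands in an in-memory
buffer and is later flushed as one sorted, immutable run, so nothing on
disk is updated in place the way a B-tree page would be.

```python
import bisect


class ToyLSM:
    """Toy log-structured merge store (illustrative only, not HBase code)."""

    def __init__(self, flush_threshold=3):
        self.memtable = {}      # in-memory buffer of recent writes
        self.segments = []      # immutable sorted runs, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # A write only touches memory; no existing run is modified in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Stand-in for one sequential write of a sorted run.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search the newest run first so the latest value wins.
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def compact(self):
        # Merge all runs into one, keeping only the newest value per key.
        merged = {}
        for segment in self.segments:   # oldest first, later runs overwrite
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []
```

The cost shows up on the read side: a get may have to look through several
runs until a compaction merges them, which is the usual trade-off of this
design.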

On 1/9/12 2:31 PM, "Michael Segel" <michael_segel@hotmail.com> wrote:

>Just my $0.02 worth of 'expertise'...
>1) Just because you can do something doesn't mean you should.
>2) One should always try to use the right tool for the job regardless of
>your 'fashion sense'.
>3) Just because someone says "Facebook or Yahoo! does X", doesn't mean
>it's a good idea, or it's the right choice for you and your team.
>Having said that...
>Yes, you can use HBase to handle OLTP queries. However you do not have
>transactional capabilities built in such that you will have to manage
>them within your application.
>Not really an easy task when you think about it.  It really depends on
>what you want to do with your OLTP system. Hotel reservation systems not
>really a good idea....
>There are some inherent problems with HBase in an OLTP environment.
>1) Eventual consistency. You can google the CAP theorem and you'll see
>why this is an issue.
>2) Lack of transaction support. Note: Row Level Locking that is in HBase
>has nothing to do with Row Level Locking with respect to transactional
>3) HBase size and scale vs RDBMS. For OLTP, RDBMS is the best tool for
>the job. So why do you want to use HBase over what one could call the
>'defacto' standard?
>The point here on #3 is that the normal tool of choice is an RDBMS. So
>you really, really need to justify why you're not going with this. I mean
>there could be a valid reason, but in most cases no.
>Where dhruba indicates that HBase is a pure transaction system, and does
>support OLTP workloads... absolutely Not!
>So what I suggest is that if you want to do OLTP in HBase, the first
>thing you have to do is to prove that you can't solve the problem in an
>Having said all that... I'm going to shut now... ;-)
>> Date: Mon, 9 Jan 2012 10:55:45 -0800
>> Subject: Re: Question about HBase for OLTP
>> From: dhruba@gmail.com
>> To: user@hbase.apache.org
>> CC: hbase-user@hadoop.apache.org
>> > I know HBase is designed for OLAP, query intensive type of
>> That is not entirely true. HBase is a pure transaction system and does
>> workloads for us. We probably more than 2 millions ops/sec for one of
>> application, details here:
>> https://www.facebook.com/note.php?note_id=454991608919
>> -dhruba
>> On Mon, Jan 9, 2012 at 9:25 AM, fullysane <fullysane@msn.com> wrote:
>> >
>> > Hi
>> >
>> > I know HBase is designed for OLAP, query intensive type of
>> > But
>> > I like the flexibility feature of its column-base architecture which
>> > me having no need to predefine every column of a table and I can
>> > dynamically
>> > add new column with value in my OLTP application code and capture its
>> > data information.
>> >
>> > My question is basically about if we can use HBase for OLTP
>> > database. I know Hbase works well with Inserting column data of a row
>> > and set new version for the new piece of the data, and not so well for
>> > updating and deleting existing piece of data. However, if I turn OLTP
>> > update and delete operations into all insertion of new version of
>> > column data as I described below:
>> > For OLTP data update, if I set my table column family's versioning to
>> > 1 and always do insert (put) when there is need to update an existing
>> > data columns, and let Hbase to handle the delete of the old versions
>> > through DB garbage collection.
>> > For OLTP data delete, I can use inserting new version on a flag field
>> > "deleted", which is a logical delete, and have some batch job to
>> > clean up all logically deleted rows later.
>> >
>> > Will the above scenario work for using HBase for an OLTP application?
>> > flaws on doing it?
>> >
>> > Can some one share the experiences of using HBase for OLTP
>> >
>> > Thanks,
>> >
>> -- 
>> Subscribe to my posts at http://www.facebook.com/dhruba
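
For reference, the scheme in the original question (every update is a put
against a column family that keeps 1 version, and every delete is a put on
a "deleted" flag column that a batch job purges later) can be sketched as
a toy in-memory model.  This is not the HBase client API: ToyTable and all
of its methods are hypothetical names, and trimming old versions at put
time is only a stand-in for whatever cleanup the real store does in the
background.

```python
import itertools


class ToyTable:
    """Toy model of versioned cells plus a logical-delete flag column."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.cells = {}     # (row, column) -> [(timestamp, value)], newest first
        self._clock = itertools.count(1)   # fake monotonically increasing timestamps

    def put(self, row, column, value):
        # An "update" is just a new version written on top of the old one.
        ts = next(self._clock)
        versions = self.cells.setdefault((row, column), [])
        versions.insert(0, (ts, value))
        del versions[self.max_versions:]   # stand-in for background cleanup
        return ts

    def get(self, row, column):
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None

    def logical_delete(self, row):
        # A "delete" is a put on the flag column; the data stays until purged.
        self.put(row, "deleted", True)

    def is_live(self, row):
        return not self.get(row, "deleted")

    def purge_deleted(self):
        # Batch job: physically drop every row whose newest flag value is set.
        dead = {r for (r, c), v in self.cells.items()
                if c == "deleted" and v[0][1]}
        for key in [k for k in self.cells if k[0] in dead]:
            del self.cells[key]
        return dead
```

Note that in this model every reader has to check is_live() itself, and
nothing makes an update that spans several rows atomic, which is the
transaction-management burden raised earlier in the thread.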
