hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Question about HBase for OLTP
Date Tue, 10 Jan 2012 00:37:34 GMT

Ok..

Look, here's the thing... HBase has no transactional support.
OLTP systems like PoS systems, Hotel Reservation Systems, Trading systems... among others
really need this.

Again, I can't stress this point enough... DO NOT THINK ABOUT USING HBASE AS AN OLTP SYSTEM
UNLESS YOU HAVE ALREADY GONE THROUGH THE PROCESS OF DEMONSTRATING WHY YOU CAN NOT DO THIS
IN AN RDBMS.
Here Nicolas proves my point. FB in their example says that they couldn't fit their messages
into an RDBMS.

Since HBase lacks transactional support, you need to deal with security so that only your application
is touching the tables that are part of your OLTP system.

Again, I cannot stress enough the importance of understanding that the term ROW LEVEL LOCKING means
one thing to HBase developers and another to RDBMS types.
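
To make that distinction concrete, here is a rough sketch in plain Python. It does not use the HBase client API. ToyRowStore, book_room, and the row keys are all made up for illustration. The assumption is simply a store that locks one row at a time: each put is atomic for its own row, but two puts to two different rows are not tied together the way they would be inside an RDBMS transaction.

```python
import threading


class ToyRowStore:
    """Hypothetical in-memory store whose only guarantee is per-row locking."""

    def __init__(self):
        self._rows = {}
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, key):
        # One lock per row key, created on first use.
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def put(self, key, columns):
        # Atomic for this one row: all columns land together.
        with self._lock_for(key):
            self._rows.setdefault(key, {}).update(columns)

    def get(self, key):
        with self._lock_for(key):
            return dict(self._rows.get(key, {}))


def book_room(store, between_steps=None):
    """Two single-row writes; nothing groups them into one transaction."""
    room = store.get("room:101")
    # The row lock is already released here, so even this
    # read-then-write on a single row is not protected.
    store.put("room:101", {"available": room["available"] - 1})
    if between_steps:
        between_steps()  # stands in for a concurrent reader, or a crash
    store.put("booking:42", {"room": "101", "guest": "alice"})


store = ToyRowStore()
store.put("room:101", {"available": 1})

observed = {}


def peek():
    # What another client would see between the two writes.
    observed["available"] = store.get("room:101")["available"]
    observed["booking"] = store.get("booking:42")


book_room(store, between_steps=peek)
print(observed)
```

In the window between the two puts, the room is already gone but no booking exists. If the application dies right there, nothing rolls the first write back. That cleanup is what you end up writing yourself.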

So I'll shut up now and let you young whippersnappers walk into the minefield.  ;-)
The obvious "take what I say with a grain of Kosher salt" and "YMMV" disclaimers are included....

Enjoy

-Mike


> From: nspiegelberg@fb.com
> To: user@hbase.apache.org
> Subject: Re: Question about HBase for OLTP
> Date: Mon, 9 Jan 2012 23:18:39 +0000
> 
> 
> 1) Eventual Consistency isn't a problem here.  HBase is a strict
> consistency system.  Maybe you have us confused with other Dynamo-based
> Open Source projects?
> 2) MySQL and other traditional RDBMS systems are definitely a lot more
> solid, well-tested, and subtly tuned than HBase.  The vast majority (if
> not all) of database systems developed in the past decade have this
> problem.  HBase has 2 main advantages over a traditional RDBMS
> for OLTP workloads:
>   A. Large-scale workloads : Facebook Messages has a constantly growing set
> of data that is 1PB+.  And we're growing at 250MB/month.  This is hard to
> manage with a traditional RDBMS.  Logical database sharding is
> extremely useful.
>   B. Write-dominated workloads : Examples like time-series databases, user
> analytics, etc. are very write-heavy. An LSMT approach is architecturally
> better than a B-tree approach.  Having done system testing internally, we
> already see an IOPS advantage with HBase over MySQL in writes.
> 3) A big question is what you need out of a database system.  Most web
> companies are worried about the 'large-scale workloads' problem if their
> site becomes popular, so a working familiarity with a distributed database
> system for less mission-critical applications is worthwhile even if the
> performance and reliability aren't there yet.
> 4) If you have any mission-critical data, you really should think about a
> disaster recovery plan outside of HBase, which is not as critical with a
> traditional RDBMS.  Facebook Messages ends up using Scribe as a backup
> mechanism.  We are currently working on HBase Snapshots to allow disaster
> recovery with HBase alone, but you shouldn't hedge bets on it being
> completed within your timeframe.
> 
> 
> On 1/9/12 2:31 PM, "Michael Segel" <michael_segel@hotmail.com> wrote:
> 
> >
> >
> >All, 
> >
> >Just my $0.02 worth of 'expertise'...
> >
> >1) Just because you can do something doesn't mean you should.
> >2) One should always try to use the right tool for the job regardless of
> >your 'fashion sense'.
> >3) Just because someone says "Facebook or Yahoo! does X", doesn't mean
> >it's a good idea, or it's the right choice for you and your team.
> >
> >Having said that...
> >
> >Yes, you can use HBase to handle OLTP queries. However you do not have
> >transactional capabilities built in such that you will have to manage
> >them within your application.
> >Not really an easy task when you think about it.  It really depends on
> >what you want to do with your OLTP system. Hotel reservation systems are
> >not really a good idea....
> >
> >There are some inherent problems with HBase in an OLTP environment.
> >
> >1) Eventual consistency. You can google the CAP theorem and you'll see
> >why this is an issue.
> >2) Lack of transaction support. Note: Row Level Locking that is in HBase
> >has nothing to do with Row Level Locking with respect to transactional
> >support.  
> >3) HBase size and scale vs RDBMS. For OLTP, RDBMS is the best tool for
> >the job. So why do you want to use HBase over what one could call the
> >'de facto' standard?
> >The point here on #3 is that the normal tool of choice is an RDBMS. So
> >you really, really need to justify why you're not going with this. I mean
> >there could be a valid reason, but in most cases no.
> >
> >Where dhruba indicates that HBase is a pure transaction system, and does
> >support OLTP workloads... absolutely Not!
> >
> >So what I suggest is that if you want to do OLTP in HBase, the first
> >thing you have to do is to prove that you can't solve the problem in an
> >RDBMS.
> >
> >Having said all that... I'm going to shut up now... ;-)
> >
> >-Mike
> >
> >
> >
> >> Date: Mon, 9 Jan 2012 10:55:45 -0800
> >> Subject: Re: Question about HBase for OLTP
> >> From: dhruba@gmail.com
> >> To: user@hbase.apache.org
> >> CC: hbase-user@hadoop.apache.org
> >> 
> >> > I know HBase is designed for OLAP, query intensive type of
> >>applications.
> >> 
> >> That is not entirely true. HBase is a pure transaction system and does
> >>OLTP
> >> workloads for us. We probably do more than 2 million ops/sec for one of
> >>our
> >> applications, details here:
> >> https://www.facebook.com/note.php?note_id=454991608919
> >> 
> >> -dhruba
> >> 
> >> 
> >> On Mon, Jan 9, 2012 at 9:25 AM, fullysane <fullysane@msn.com> wrote:
> >> 
> >> >
> >> > Hi
> >> >
> >> > I know HBase is designed for OLAP, query intensive type of
> >>applications.
> >> > But
> >> > I like the flexibility feature of its column-base architecture which
> >>allows
> >> > me having no need to predefine every column of a table and I can
> >> > dynamically
> >> > add new column with value in my OLTP application code and capture its
> >>meta
> >> > data information.
> >> >
> >> > My question is basically about if we can use HBase for OLTP
> >>application
> >> > database. I know Hbase works well with Inserting column data of a row
> >>key
> >> > and set new version for the new piece of the data, and not so well for
> >> > updating and deleting existing piece of data. However, if I turn OLTP
> >> > update
> >> > and delete operations into insertions of new versions of column data
> >>as I
> >> > described below:
> >> > For OLTP data update, if I set my table column family's versioning to
> >>1 and
> >> > always do insert (put) when there is need to update an existing data
> >>row
> >> > columns, and let Hbase to handle the delete of the old versions
> >>through DB
> >> > garbage collection.
> >> > For OLTP data delete, I can use inserting new version on a flag field
> >>to
> >> > "deleted", which is a logical delete, and have some batch job to
> >>clean up
> >> > all logically deleted rows later.
> >> >
> >> > Will the above scenario work for using HBase for an OLTP application?
> >>Any
> >> > flaws on doing it?
> >> >
> >> > Can some one share the experiences of using HBase for OLTP
> >>applications?
> >> >
> >> > Thanks,
> >> >
> >> > --
> >> > View this message in context:
> >> > 
> >>http://old.nabble.com/Question-about-HBase-for-OLTP-tp33107782p33107782.html
> >> > Sent from the HBase User mailing list archive at Nabble.com.
> >> >
> >> >
> >> 
> >> 
> >> -- 
> >> Subscribe to my posts at http://www.facebook.com/dhruba
> > 		 	   		  
> 
 		 	   		  