From: Michael Segel
Reply-To: user@hbase.apache.org
Subject: RE: Question about HBase for OLTP
Date: Mon, 9 Jan 2012 18:37:34 -0600

Ok.. Look, here's the thing...

HBase has no transactional support.
OLTP systems like PoS systems, Hotel Reservation Systems, Trading systems... among others really need this.

Again, I can't stress this point enough... DO NOT THINK ABOUT USING HBASE AS AN OLTP SYSTEM UNLESS YOU HAVE ALREADY GONE THROUGH THE PROCESS OF DEMONSTRATING WHY YOU CAN NOT DO THIS IN AN RDBMS.

Here Nicolas proves my point. FB in their example says that they couldn't fit their messages into an RDBMS.

Since HBase lacks transactional support you need to deal with security so that only your application is touching those tables which are part of your OLTP system.

Again, I cannot stress enough the importance of understanding that the term ROW LEVEL LOCKING means one thing to HBase developers and another to RDBMS types.

So I'll shut up now and let you young whippersnappers walk into the minefield. ;-)

The obvious "take what I say with a grain of Kosher Salt" and YMMV are included....

Enjoy

-Mike

> From: nspiegelberg@fb.com
> To: user@hbase.apache.org
> Subject: Re: Question about HBase for OLTP
> Date: Mon, 9 Jan 2012 23:18:39 +0000
>
> 1) Eventual Consistency isn't a problem here. HBase is a strict
> consistency system. Maybe you have us confused with other Dynamo-based
> Open Source projects?
> 2) MySQL and other traditional RDBMS systems are definitely a lot more
> solid, well-tested, and subtly tuned than HBase. The vast majority (if
> not all) of database systems developed in the past decade have this
> problem. HBase has 2 main advantages over a traditional RDBMS for OLTP
> workloads:
>    A. Large-scale workloads: Facebook Messages has a constantly growing
> set of data that is 1PB+. And we're growing at 250MB/month. This is hard
> to manage with a traditional RDBMS. Logical database sharding is
> extremely useful.
>    B. Write-dominated workloads: Examples like time-series databases, user
> analytics, etc. are very write-heavy. An LSMT approach is architecturally
> better than a B-tree approach. Having done system testing internally, we
> already see an IOPS advantage with HBase over MySQL in writes.
> 3) A big question is what you need out of a database system. Most web
> companies are worried about the 'large-scale workloads' problem if their
> site becomes popular, so a working familiarity with a distributed database
> system for less mission-critical applications is worthwhile even if the
> performance and reliability aren't there yet.
> 4) If you have any mission-critical data, you really should think about a
> disaster recovery plan outside of HBase, which is not as critical with a
> traditional RDBMS. Facebook Messages ends up using Scribe as a backup
> mechanism. We are currently working on HBase Snapshots to allow disaster
> recovery with HBase alone, but you shouldn't hedge bets on it being
> completed within your timeframe.
>
>
> On 1/9/12 2:31 PM, "Michael Segel" wrote:
>
> >All,
> >
> >Just my $0.02 worth of 'expertise'...
> >
> >1) Just because you can do something doesn't mean you should.
> >2) One should always try to use the right tool for the job regardless of
> >your 'fashion sense'.
> >3) Just because someone says "Facebook or Yahoo! does X" doesn't mean
> >it's a good idea, or that it's the right choice for you and your team.
> >
> >Having said that...
> >
> >Yes, you can use HBase to handle OLTP queries. However, you do not have
> >transactional capabilities built in, so you will have to manage
> >them within your application.
> >Not really an easy task when you think about it. It really depends on
> >what you want to do with your OLTP system. Hotel reservation systems: not
> >really a good idea....
> >
> >There are some inherent problems with HBase in an OLTP environment.
> >
> >1) Eventual consistency. You can google the CAP theorem and you'll see
> >why this is an issue.
> >2) Lack of transaction support. Note: Row Level Locking that is in HBase
> >has nothing to do with Row Level Locking with respect to transactional
> >support.
> >3) HBase size and scale vs RDBMS. For OLTP, RDBMS is the best tool for
> >the job. So why do you want to use HBase over what one could call the
> >'de facto' standard?
> >The point here on #3 is that the normal tool of choice is an RDBMS. So
> >you really, really need to justify why you're not going with this. I mean
> >there could be a valid reason, but in most cases no.
> >
> >Where dhruba indicates that HBase is a pure transaction system, and does
> >support OLTP workloads... absolutely Not!
> >
> >So what I suggest is that if you want to do OLTP in HBase, the first
> >thing you have to do is to prove that you can't solve the problem in an
> >RDBMS.
> >
> >Having said all that... I'm going to shut up now... ;-)
> >
> >-Mike
> >
> >> Date: Mon, 9 Jan 2012 10:55:45 -0800
> >> Subject: Re: Question about HBase for OLTP
> >> From: dhruba@gmail.com
> >> To: user@hbase.apache.org
> >> CC: hbase-user@hadoop.apache.org
> >>
> >> > I know HBase is designed for OLAP, query intensive type of
> >> > applications.
> >>
> >> That is not entirely true. HBase is a pure transaction system and does
> >> OLTP workloads for us. We probably do more than 2 million ops/sec for
> >> one of our applications, details here:
> >> https://www.facebook.com/note.php?note_id=454991608919
> >>
> >> -dhruba
> >>
> >>
> >> On Mon, Jan 9, 2012 at 9:25 AM, fullysane wrote:
> >>
> >> > Hi
> >> >
> >> > I know HBase is designed for OLAP, query intensive type of
> >> > applications. But I like the flexibility of its column-based
> >> > architecture, which means I do not need to predefine every column of
> >> > a table: I can dynamically add a new column with a value in my OLTP
> >> > application code and capture its metadata information.
> >> >
> >> > My question is basically whether we can use HBase as the database
> >> > for an OLTP application. I know HBase works well with inserting
> >> > column data for a row key and setting a new version for the new
> >> > piece of data, and not so well for updating and deleting existing
> >> > pieces of data. However, what if I turn OLTP update and delete
> >> > operations into insertions of new versions of column data, as
> >> > described below:
> >> > For OLTP data update, I set my table column family's versioning to 1,
> >> > always do an insert (put) when an existing data row's columns need
> >> > updating, and let HBase handle the deletion of the old versions
> >> > through DB garbage collection.
> >> > For OLTP data delete, I insert a new version of a flag field set to
> >> > "deleted", which is a logical delete, and have some batch job clean
> >> > up all logically deleted rows later.
> >> >
> >> > Will the above scenario work for using HBase for an OLTP application?
> >> > Any flaws in doing it?
> >> >
> >> > Can someone share their experiences of using HBase for OLTP
> >> > applications?
> >> >
> >> > Thanks,
> >> >
> >> > --
> >> > View this message in context:
> >> > http://old.nabble.com/Question-about-HBase-for-OLTP-tp33107782p33107782.html
> >> > Sent from the HBase User mailing list archive at Nabble.com.
> >> >
> >>
> >> --
> >> Subscribe to my posts at http://www.facebook.com/dhruba
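
The scheme in the original question (updates expressed as puts with the column family limited to one version, plus a "deleted" flag that a batch job cleans up later) can be modeled in a few lines. What follows is a toy in-memory table in Python, not the HBase client API: the class ToyTable and all of its methods are hypothetical names made up for illustration, the timestamps are a fake counter, and compact() only stands in for whatever background cleanup the real store performs. It is a rough sketch of the data flow under those assumptions and nothing more.

```python
import itertools


class ToyTable:
    """In-memory stand-in for one column family. NOT the HBase client API."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._clock = itertools.count(1)  # fake, strictly increasing timestamps
        self._cells = {}  # (row, column) -> [(timestamp, value), ...], newest first

    def put(self, row, column, value):
        # An "update" is just another insert: a new version with a newer timestamp.
        versions = self._cells.setdefault((row, column), [])
        versions.insert(0, (next(self._clock), value))

    def get(self, row, column):
        # Readers only ever see the newest version of a cell.
        versions = self._cells.get((row, column))
        return versions[0][1] if versions else None

    def version_count(self, row, column):
        return len(self._cells.get((row, column), []))

    def compact(self):
        # Assumed background cleanup: drop versions beyond max_versions.
        for versions in self._cells.values():
            del versions[self.max_versions:]

    def purge_deleted(self, flag_column="deleted"):
        # The batch job from the question: physically remove every row whose
        # newest flag value marks it as logically deleted.
        doomed = {row for (row, column), versions in self._cells.items()
                  if column == flag_column and versions and versions[0][1] == "true"}
        for key in [k for k in self._cells if k[0] in doomed]:
            del self._cells[key]
        return doomed


table = ToyTable(max_versions=1)
table.put("order-1", "status", "new")
table.put("order-1", "status", "paid")   # update expressed as a put
table.compact()                          # the old "new" version is dropped here
table.put("order-1", "deleted", "true")  # logical delete via a flag column
table.purge_deleted()                    # later batch cleanup removes the row
```

Note what the sketch leaves out: every put touches a single cell, and nothing ties a change to "order-1" to a change in any other row. That gap, the absence of multi-row transactions, is the point Mike keeps pressing in the thread.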