hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Whether by applying HBase, an application still needs RDBMS?
Date Wed, 07 Jul 2010 21:35:35 GMT
FWIW, one could use Cassandra, HBase, MongoDB and MySQL to support a
single product if they are used in a way that makes sense WRT their
features. The downside will obviously be maintaining radically
different systems.

Our experience at StumbleUpon is that our business has been built on
top of MySQL for years now. On the other hand, we needed a scalable
platform that grows with us. Ryan Rawson did the research and we
adopted HBase as our solution (among other reasons), so every now and
then we are confronted with the choice of whether or not to port
certain features to HBase; the same goes for our new products.

What happened is that since we launched su.pr a year ago (which is
completely served by HBase), almost all the other new features we
developed are also stored in HBase simply because most of the time
they don't really require joins and the data is retrieved by row key
(bonus: it's even stored in a scalable database). We have been more
reluctant to port older, more mission-critical stuff to HBase since
HDFS lacked fsSync support, but with Hadoop 0.20-append and 0.21 this
is now history. Will we now port every single line of code to HBase?
Probably not, because some parts don't really require a lot of storage
and their design was so challenging that redoing it all would be
risky/costly.
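
As a rough illustration of the access pattern described above (data
fetched by row key, no joins), here is a minimal Python sketch. It uses an
in-memory stand-in, not the HBase client API; `RowKeyTable`, the column
names and the sample short codes are all hypothetical and are not taken
from the su.pr schema.

```python
from bisect import bisect_left, insort


class RowKeyTable:
    """Hypothetical in-memory stand-in for a table addressed by row key."""

    def __init__(self):
        self._keys = []  # row keys kept in sorted order
        self._rows = {}  # row key -> {column name: value}

    def put(self, row_key, columns):
        # Insert a new row or merge columns into an existing one.
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def get(self, row_key):
        # Single-row lookup by key; returns None if the row is absent.
        return self._rows.get(row_key)

    def scan(self, start, stop):
        # Yield rows whose key k satisfies start <= k < stop, in key order.
        i = bisect_left(self._keys, start)
        while i < len(self._keys) and self._keys[i] < stop:
            key = self._keys[i]
            yield key, self._rows[key]
            i += 1


# Everything needed to serve a short link lives in one row (the owner's
# name is denormalized into it), so serving a request is one key lookup.
links = RowKeyTable()
links.put("a1b2", {"info:target": "http://example.com/a", "info:owner": "alice"})
links.put("a9z9", {"info:target": "http://example.com/b", "info:owner": "bob"})
links.put("c3d4", {"info:target": "http://example.com/c", "info:owner": "alice"})

row = links.get("a1b2")
```

The trade-off in this sketch is that the owner's name is copied into every
row instead of being joined in from a separate table; that is the kind of
feature that maps easily onto a key-addressed store.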

Hope that helps!


tl;dr StumbleUpon runs on top of an HBase/MySQL hybrid, with the HBase
component becoming bigger every day, to the point where only a small
amount of MySQL usage should remain.

On Wed, Jul 7, 2010 at 1:13 PM,  <nocones77-groups@yahoo.com> wrote:
>> From: Jean-Daniel Cryans <jdcryans@apache.org>
>> Also, about using any other DBMS in conjunction with HBase, I would
>> simply recommend using the right tool for the right job.
> This seems like a sensible approach to me.
> We are using HBase for the data that needs massive scalability, but are using a
> standard RDBMS (MySQL) for most of our needs. The RDBMS is perfectly capable of
> handling reasonably large data sets...just not the truly massive ones. MySQL has
> more/better developed tools, supports convenient features like indexes and
> triggers, and is much easier to deal with, administer, etc. It is currently
> faster to develop and easier to find people who know the technology. So we
> default to using RDBMS, but use HBase for data structures where we know we'll
> exceed RDBMS capabilities.
> But I'm curious what other people think as well...anyone? Do you use HBase for
> your whole app? Or do you have a hybrid approach?
> Neal
