cassandra-user mailing list archives

From Joseph Stein <crypt...@gmail.com>
Subject Re: The Difference Between Cassandra and HBase
Date Sun, 25 Apr 2010 16:33:16 GMT
It is kind of the classic distinction between OLTP and OLAP.

Cassandra is to OLTP as HBase is to OLAP (for those SAT nuts).

Both are useful and valuable in their own right, agreed.

On Sun, Apr 25, 2010 at 12:20 PM, Jeff Hodges <jhodges@twitter.com> wrote:
> HBase is awesome when you need high throughput and don't care so much
> about latency. Cassandra is generally the opposite. They are
> wonderfully complementary.
> --
> Jeff
>
> On Sun, Apr 25, 2010 at 8:19 AM, Lenin Gali <galilenin@gmail.com> wrote:
>> I second Joe.
>>
>> Lenin
>> Sent from my BlackBerry® wireless handheld
>>
>> -----Original Message-----
>> From: Joe Stump <joe@joestump.net>
>> Date: Sun, 25 Apr 2010 13:04:50
>> To: <user@cassandra.apache.org>
>> Subject: Re: The Difference Between Cassandra and HBase
>>
>>
>> On Apr 25, 2010, at 11:40 AM, Mark Robson wrote:
>>
>>> For me an important difference is that Cassandra is operationally
>>> much more straightforward - there is only one type of node, and it
>>> is fully redundant (depending on what consistency level you're
>>> using).
>>>
>>> This seems to be an advantage of Cassandra over most other
>>> distributed storage systems, almost all of which seem to require
>>> some "master" nodes which have different operational requirements
>>> (e.g. cannot fail, need to be failed over manually or have another
>>> HA solution installed for them).
>>
>> These two remain the #1 and #2 reasons I recommend Cassandra over
>> HBase. At the end of the day, Cassandra is an *absolute* dream to
>> manage across multiple data centers. I could go on and on about the
>> voodoo that is expanding, contracting, and rebalancing a Cassandra
>> cluster. It's pretty awesome.
>>
>> That being said, we're getting ready to spin up an HBase cluster. If
>> you want increment/decrement, more complex range scans, etc., then
>> HBase is a great candidate, especially if you don't need it to span
>> multiple data centers. We're using Cassandra for our main things,
>> and then HBase+Hive for analytics.
>>
>> There's room for both, especially if you're using Hadoop with Cassandra.
>>
>> --Joe
>>
>>
>



-- 
/*
Joe Stein
http://www.linkedin.com/in/charmalloc
*/
