incubator-cassandra-user mailing list archives

From Eric Hauser <>
Subject Re: The Difference Between Cassandra and HBase
Date Sun, 25 Apr 2010 16:18:35 GMT
Out of curiosity, are you planning on copying the data you store in
HBase/Hive into a separate Hadoop cluster in a different data center or
backing up HDFS in some other manner?  Redundancy isn't an issue within the
cluster; it's more a concern of storing all your HDFS data in one physical

On Sun, Apr 25, 2010 at 8:04 AM, Joe Stump <> wrote:

> On Apr 25, 2010, at 11:40 AM, Mark Robson wrote:
> > For me an important difference is that Cassandra is operationally much
> > more straightforward - there is only one type of node, and it is fully
> > redundant (depending what consistency level you're using).
> >
> > This seems to be an advantage in Cassandra vs most other distributed
> > storage systems, which almost all seem to require some "master" nodes which
> > have different operational requirements (e.g. cannot fail, need to be failed
> > over manually or have another HA solution installed for them)
>
> These two remain the #1 and #2 reasons I recommend Cassandra over HBase. At
> the end of the day, Cassandra is an *absolute* dream to manage across
> multiple data centers. I could go on and on about the voodoo that is
> expanding, contracting, and rebalancing a Cassandra cluster. It's pretty
> awesome.
>
> That being said, we're getting ready to spin up an HBase cluster. If you're
> wanting increment/decrement, more complex range scans, etc. then HBase is a
> great candidate. Especially if you don't need it to span multiple data
> centers. We're using Cassandra for our main things, and then HBase+Hive for
> analytics.
>
> There's room for both. Especially if you're using Hadoop with Cassandra.
>
> --Joe
