hbase-user mailing list archives

From Rohit Jain <rohit.j...@esgyn.com>
Subject RE: Hbase on HDFS versus Cassandra
Date Wed, 30 Nov 2016 19:59:54 GMT
Frankly, I would look at the ecosystem around HBase, and HDFS in general. You have at least
four different SQL solutions on HBase. You have a number of graph solutions on HBase. Hive,
Spark, and a number of other technologies and solutions that support HDFS support HBase as well.
The HDFS ecosystem supports other data structures and models as well, such as column stores.
Is eventual consistency important to you? Is HBase the only HDFS-centric technology
you will use, so that the other components of the distros look like "overhead" to you compared to
Cassandra? Ultimately, what is your strategic architecture to support varied workloads and
data models on the same platform? If you can answer that question, then the choices should
become easier. As others have said, you will get a pretty biased opinion from this group,
since we have all committed to HBase for one reason or another. We committed to it also
because of the extensive ecosystem and the enterprise capabilities that the distro vendors
are building into that ecosystem, such as manageability, security, and governance.
These are things we can leverage to provide a full-fledged platform for Big Data to enterprises,
and not just a Big Table implementation. Not sure how integrated Cassandra is into that entire


-----Original Message-----
From: Neelesh [mailto:neeleshs@gmail.com] 
Sent: Wednesday, November 30, 2016 10:08 AM
To: user@hbase.apache.org
Subject: Re: Hbase on HDFS versus Cassandra

We use both, in different capacities. Cassandra is a cross-DC archive store
with mostly batch writes and occasional key-based reads. HBase is for
real-time event ingestion. Our experience so far with HBase + Phoenix is that
when it works, it is fast and scales like crazy. But if you ever hit a snag
around data patterns, you will have a VERY hard time figuring out what's
going on. A combination of global Phoenix indexes and heavy writes leaves an
entire cluster sluggish if there is even a hint of hotspotting.

On the other hand, we had a big struggle with Cassandra when a node
recovery was in progress, what with twice the usual disk space being required
during recovery, etc. Other than that, it has been quiet.
But the access patterns are not the same.

I think the old rule still stands. If you are already on Hadoop, or
interested in using/analysing data in several different ways, go with HBase.
If you just need a big data store with a few predefined query patterns,
Cassandra is good.

Of course, I'm biased towards HBase.

On Nov 30, 2016 7:02 AM, "Mich Talebzadeh" <mich.talebzadeh@gmail.com> wrote:

> Hi Guys,
> I have used HBase on HDFS reasonably well. Happy to stick with it, and more so with
> Hive/Phoenix views and Phoenix indexes where I can.
> I have a bunch of users now vocal about the use case for Cassandra and
> whether it can do a better job than HBase.
> Unfortunately I am no expert on Cassandra. However, some guidance on use case fit would
> be very valuable.
> Thanks
> Dr Mich Talebzadeh
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> http://talebzadehmich.wordpress.com
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.