When I wrote my "Why Cassandra" article, I didn't get into why I didn't choose platform X, because I didn't want to start a flame war by doing comparisons. For HBase, the primary reason I didn't choose it is that while there were benchmarks of what it could theoretically do, there weren't any real-world deployments proving it. My experience as a systems administrator is that it's best to go with a product that's been proven over time in real-world scenarios.

I'll add, though, that nothing NoSQL, even Cassandra, has reached the point where I feel it's a no-brainer to choose it over anything else, including SQL-based solutions like MySQL and Oracle. It really comes down to your requirements.

On Sat, Dec 5, 2009 at 11:04 PM, Matt Revelle <mrevelle@gmail.com> wrote:
On Dec 5, 2009, at 21:45, Joe Stump <joe@joestump.net> wrote:

On Dec 5, 2009, at 7:41 PM, Bill Hastings wrote:

[Is] HBase used for real-timish applications, and if so, any ideas what the largest deployment is?

I don't know of anyone off the top of my head who's using anything built on top of Hadoop for a real-time environment. Hadoop just wasn't built for that. It was built, like MapReduce, for crunching absurd amounts of data across hundreds of nodes in a "reasonable" amount of time.

Just my $0.02.


While Hadoop MapReduce isn't meant for realtime use, HBase can handle it.

Over the last summer there were some benchmarks included in HBase/Hadoop presentations that showed, IIRC, performance comparable to Cassandra.