hbase-user mailing list archives

From Bradford Stephens <bradfordsteph...@gmail.com>
Subject HBase and Web-Scale BI
Date Thu, 26 Feb 2009 05:02:09 GMT

I'm in charge of the data analysis and collection platform at my company,
and we're basing a large part of our core analysis platform on Hadoop,
Nutch, and Lucene -- it's a delight to use. However, we're going to
want some on-demand "web-scale" business intelligence, and I'm wondering
if HBase is the right solution -- my research hasn't given me any

Our data set is pretty simple -- a bunch of XML documents which have been
parsed from HTML pages, and some associated data (Author Name, Post Date,
Influence, etc). What we would like to be able to do is have our end users
do real-time (< 10 seconds) OLAP-type analysis on this, and have it
presented on a webpage. For example, queries like ("All authors for the past
two weeks who have used these keywords in the post bodies and what their
influence score is"). I imagine we'll have several terabytes of data to go
through, and we won't be able to do much pre-population of results.
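
To make the shape of that example query concrete, here is a minimal sketch
over in-memory records. It is not HBase code and says nothing about latency
at terabyte scale; the `Post` fields, the `authors_using_keywords` name, and
the choice to require every keyword (rather than any) are assumptions made
only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    # Hypothetical record layout based on the fields named in the message.
    author: str
    post_date: datetime
    influence: float
    body: str


def authors_using_keywords(posts, keywords, now, window=timedelta(weeks=2)):
    """Map each author to an influence score, for authors with at least one
    post inside the time window whose body contains every keyword."""
    cutoff = now - window
    wanted = [k.lower() for k in keywords]
    result = {}
    for post in posts:
        if post.post_date < cutoff:
            continue  # outside the "past two weeks" window
        text = post.body.lower()
        if all(k in text for k in wanted):
            # Assumption: report the highest influence seen for the author.
            previous = result.get(post.author, post.influence)
            result[post.author] = max(previous, post.influence)
    return result
```

A full scan like this is linear in the number of posts, which is the cost
the question is really about: whether a store such as HBase can answer the
same query within the 10-second budget without pre-computed results.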

Is HBase low-latency enough that we can scale out to solve these sorts of

