hbase-user mailing list archives

From Royston Sellman <royston.sell...@googlemail.com>
Subject Aggregations in HBase
Date Sun, 11 Dec 2011 18:52:39 GMT
I'm a newbie learning HBase using 0.90.4. I've got my data bulk loading nicely into a cluster and
now I want to get simple SQL-like aggregations (SUM, AVG, STD, MIN, MAX, MEDIAN, WEIGHTED
MEDIAN etc.) working. I started off trying to build MR code to do this but stumbled across
AggregateProtocol, AggregateImplementation and AggregationClient in 0.94. Before I re-invent
the wheel I'd like to check out this coprocessor aggregator functionality, but I'm finding
it a bit hard to get into.

It seems there has been a bit of discussion recommending that aggregations be done on the
server and queried by client code. Looking at the code, this seems to be the way it is architected
in 0.94. Am I right about this? Is there a summary of the discussion anywhere?

I'm guessing I have to build 0.94 on my system to try the Aggregation coprocessor stuff out.
Am I right, or has it been backported to a release bundle? (I've never built HBase before,
only used releases, so it will take me a while to do a build and then put it on my cluster.)

Is there any user documentation for the Aggregations stuff? I can't find any but don't know
if I've looked in all the right places. 

Some of the answers may be in the user mailing list. Is there an easy way to search this list?
I tried GMANE and search-hadoop but didn't get much from either. Is reading the code my best
option?

Grateful for any pointers on this topic.

