hbase-dev mailing list archives

From Julian Wissmann <julian.wissm...@sdace.de>
Subject Coprocessor Feature proposal
Date Thu, 28 Feb 2013 12:10:06 GMT
Hi,

For a research project I wrote a custom coprocessor, which in the end
simply extends AggregationClient and AggregateImplementation.
I needed two additional input parameters: a Long timePeriod over which
to aggregate and an int count that says how many aggregations to
return. The return value is a ConcurrentMap. The advantage of this
approach is that for aggregations with a large count but small
periods, I neither have to sort the data again on the client side nor
run n aggregations resulting in n scans. Instead I get the same
result with just one scan per region, which is a lot faster.
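
To illustrate the idea (this is not the coprocessor code itself, just a
minimal sketch in plain Java without any HBase classes): one pass over
the data fills count buckets of width timePeriod, and the per-region
partial maps are then merged. The class name PeriodAggregationSketch,
the methods sumByPeriod and mergeInto, the start parameter and the
choice of a sum as the aggregate are all assumptions made for this
example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for the region-side logic; no HBase API is used here.
public class PeriodAggregationSketch {

    /**
     * Sums values into consecutive time buckets in a single pass.
     * Bucket i covers [start + i * timePeriod, start + (i + 1) * timePeriod).
     * The returned map is keyed by the start timestamp of each non-empty bucket.
     */
    public static ConcurrentMap<Long, Long> sumByPeriod(
            long[] timestamps, long[] values, long start, long timePeriod, int count) {
        if (timePeriod <= 0 || count <= 0 || timestamps.length != values.length) {
            throw new IllegalArgumentException("invalid period, count or input lengths");
        }
        ConcurrentMap<Long, Long> sums = new ConcurrentHashMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            long offset = timestamps[i] - start;
            if (offset < 0) {
                continue; // before the first bucket
            }
            long bucket = offset / timePeriod;
            if (bucket >= count) {
                continue; // after the last requested bucket
            }
            sums.merge(start + bucket * timePeriod, values[i], Long::sum);
        }
        return sums;
    }

    /** Folds one region's partial result into the client-side total. */
    public static void mergeInto(ConcurrentMap<Long, Long> total, Map<Long, Long> partial) {
        partial.forEach((bucketStart, sum) -> total.merge(bucketStart, sum, Long::sum));
    }
}
```

Each region would run something like sumByPeriod during its own scan,
and the client would only call mergeInto once per region, so no
client-side sorting and no repeated scans are needed.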

Right now the code is a little messy, as it was implemented as a
quick and dirty proof of concept. However, if there is interest in
having this ability in HBase, I'd be pleased to clean it up, port it
to head and release it into the wild.
I realize that not everyone has a data pattern as simple as ours and
that this feature may not be overly useful to everyone. If anyone has
an idea for extending this functionality to make it more useful,
let me know. For example, I'm thinking about a more generic approach
with some sort of Filter or something along those lines, so that the
data can be bucketed not just by time periods but by key patterns as
well.

Regards
Julian
