hbase-user mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Aggregation
Date Wed, 13 Mar 2019 17:45:03 GMT
Hi, Jean-Marc

I am not aware of an implementation of #2 in HBase. In RocksDB there is a
Merge operator which does exactly what you need.
It can be done in HBase as well with the help of a specialized coprocessor.
RocksDB Merge:
https://github.com/facebook/rocksdb/wiki/Merge-Operator
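
To make the idea concrete, here is a minimal sketch of merge-style cell
aggregation for the three cases you list (counter, accumulator, average with
a "sum|count" cell). It is plain Python and uses neither the HBase nor the
RocksDB API. All names in it (merge_counter, merge_sum, merge_avg, read_avg,
fold) are hypothetical. A coprocessor would have to apply the same kind of
fold at compaction or read time.

```python
from typing import Callable, Iterable, Optional


def merge_counter(existing: Optional[int], _operand: object) -> int:
    """Counter: every put, whatever its value, increments the cell by one."""
    return (existing or 0) + 1


def merge_sum(existing: Optional[int], operand: int) -> int:
    """Accumulator: the cell holds the sum of all numbers put so far."""
    return (existing or 0) + operand


def merge_avg(existing: Optional[str], operand: float) -> str:
    """Average: the cell holds "sum|count"; each put updates both parts."""
    if existing is None:
        total, count = 0.0, 0
    else:
        raw_total, raw_count = existing.split("|")
        total, count = float(raw_total), int(raw_count)
    return f"{total + operand}|{count + 1}"


def read_avg(cell: str) -> float:
    """Turn a stored "sum|count" cell into the average seen by a reader."""
    raw_total, raw_count = cell.split("|")
    return float(raw_total) / int(raw_count)


def fold(merge: Callable, operands: Iterable, existing=None):
    """Apply the merge function to each pending put, oldest first."""
    for operand in operands:
        existing = merge(existing, operand)
    return existing
```

For example, folding the puts 10, 20 and 60 with merge_avg leaves a single
cell that read_avg turns into the average of the three values. Because the
cell keeps the sum and the count rather than the average itself, later puts
can still be merged in without rereading the earlier ones.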

-Vlad


On Wed, Mar 13, 2019 at 6:41 AM Jean-Marc Spaggiari <jean-marc@spaggiari.org>
wrote:

> Hi,
>
> I have a quick question regarding aggregation.
>
> First, let me explain my understanding. I see two types of aggregation.
>
> First is at the column level. Like, AVG(age) on a table. It will, on the
> server side, for each region, sum the age, and divide by the number of
> rows. Fine.
>
> Second is at the cell level. Imagine I want a counter. I do multiple puts
> for the exact same cell. At compaction time, or at read time, there will be
> an aggregation that will return only the sum of all those cells.
>
> AggregateImplementation is an implementation of the first case. It runs as
> a coprocessor EndPoint.
>
> Do we have an implementation of the 2nd one? There can be many different
> implementations. For counters, where we just put whatever and get an
> incremental number. For accumulator, where we put numbers and get the sum
> of all the numbers we have put. For average, where we put numbers and get
> the average of all the puts (cell will store something like "sum|count").
> etc. I looked at the existing coprocessors and I don't see anything like
> that. Before starting to implement my own, I'm wondering if there is
> already an existing solution.
>
> Thanks,
>
> JMS
>
