hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Aggregation
Date Wed, 13 Mar 2019 13:41:01 GMT

I have a quick question regarding aggregation.

First, let me explain my understanding. I see two types of aggregation.

The first is at the column level, like AVG(age) on a table. On the server
side, each region sums the ages and counts its rows, and the average is the
total sum divided by the total number of rows. Fine.
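
To make the first case concrete, here is a minimal sketch in plain Java with no HBase dependency. It assumes each region hands back a partial (sum, row count) pair and that the caller combines them. The names RegionAverageSketch, Partial, aggregateRegion and average are made up for illustration and are not part of the HBase API.

```java
import java.util.List;

public class RegionAverageSketch {
    /** Hypothetical partial result one region would return. */
    static final class Partial {
        final long sum;
        final long count;
        Partial(long sum, long count) { this.sum = sum; this.count = count; }
    }

    /** Work done per region: sum the values and count the rows. */
    public static Partial aggregateRegion(List<Integer> ages) {
        long sum = 0;
        for (int age : ages) {
            sum += age;
        }
        return new Partial(sum, ages.size());
    }

    /** Combine the partials: total sum over total count, not an average of averages. */
    public static double average(List<List<Integer>> regions) {
        long sum = 0;
        long count = 0;
        for (List<Integer> region : regions) {
            Partial p = aggregateRegion(region);
            sum += p.sum;
            count += p.count;
        }
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    public static void main(String[] args) {
        // Two simulated regions holding ages {20, 30} and {40}.
        System.out.println(average(List.of(List.of(20, 30), List.of(40)))); // 30.0
    }
}
```

Carrying the row count alongside the sum matters because regions hold different numbers of rows, so averaging the per-region averages would give the wrong answer (32.5 instead of 30.0 for the data above).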

Second is at the cell level. Imagine I want a counter. I do multiple puts
for the exact same cell. At compaction time, or at read time, there will be
an aggregation that will return only the sum of all those cells.

AggregateImplementation is an implementation of the first case. It runs as
a coprocessor EndPoint.

Do we have an implementation of the second one? There can be many different
implementations: counters, where we just put whatever and get back an
incrementing number; accumulators, where we put numbers and get back the sum
of all the numbers we have put; averages, where we put numbers and get back
the average of all the puts (the cell would store something like
"sum|count"); and so on. I looked at the existing coprocessors and I don't
see anything like that. Before starting to implement my own, I'm wondering if
there is already an existing solution.
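
For illustration, the merge logic I have in mind for the second case could look like the sketch below. It is plain Java over in-memory lists standing in for the versions of one cell. It is not an existing HBase coprocessor, and the class and method names (CellMergeSketch, mergeCounter, mergeAccumulator, mergeAverage, readAverage) are hypothetical. The "sum|count" encoding is the one described above.

```java
import java.util.List;

public class CellMergeSketch {
    /** Counter: the put values are ignored, only the number of puts matters. */
    public static long mergeCounter(List<?> puts) {
        return puts.size();
    }

    /** Accumulator: the merged cell is the sum of every value put to it. */
    public static long mergeAccumulator(List<Long> puts) {
        long sum = 0;
        for (long value : puts) {
            sum += value;
        }
        return sum;
    }

    /** Average: merge "sum|count" cells by adding both fields. */
    public static String mergeAverage(List<String> cells) {
        long sum = 0;
        long count = 0;
        for (String cell : cells) {
            String[] parts = cell.split("\\|");
            sum += Long.parseLong(parts[0]);
            count += Long.parseLong(parts[1]);
        }
        return sum + "|" + count;
    }

    /** Decode a merged "sum|count" cell into the average a reader would see. */
    public static double readAverage(String cell) {
        String[] parts = cell.split("\\|");
        return (double) Long.parseLong(parts[0]) / Long.parseLong(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(mergeCounter(List.of("a", "b", "c")));     // 3
        System.out.println(mergeAccumulator(List.of(5L, 7L, 8L)));    // 20
        String merged = mergeAverage(List.of("10|1", "20|1", "30|2"));
        System.out.println(merged);                                   // 60|4
        System.out.println(readAverage(merged));                      // 15.0
    }
}
```

Because adding sums and counts is associative, the same merge could run at compaction time on some of the cells and again at read time on the rest without changing the result.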


