accumulo-user mailing list archives

From Christopher <>
Subject Re: Get list of unique values for given CF and CQ
Date Tue, 15 Oct 2013 07:29:27 GMT
There's no built-in iterator to do this. It's difficult to reliably
aggregate/combine data across rows, so this might be better suited to a
MapReduce job than to an iterator. Even if you do something clever
to aggregate within a tablet (like transforming all matching keys to a
fixed R/CF/CQ and then using a combiner to group them within that
virtual row) and deal with the potential problems with that (reliably
transforming rowIds is especially tricky), you're still going to need
to aggregate across tablets with some sort of client code, either
with MapReduce or a single-node client.
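
To illustrate the two stages, here is a rough sketch in plain Java.
It does not use the Accumulo API: the Entry class, the method names,
and the way the example rows are split into two "tablets" are all
made up for illustration. Stage 1 stands in for whatever per-tablet
aggregation you manage to do; stage 2 is the client-side merge that
is needed regardless, because the same value can show up in more than
one tablet.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for the problem; NOT the Accumulo API.
final class UniqueValuesSketch {

    // Simplified key/value pair (assumption: strings only, no visibility or timestamp).
    static final class Entry {
        final String row, cf, cq, value;

        Entry(String row, String cf, String cq, String value) {
            this.row = row;
            this.cf = cf;
            this.cq = cq;
            this.value = value;
        }
    }

    // Stage 1: unique values for one CF/CQ within a single tablet.
    static SortedSet<String> uniqueInTablet(List<Entry> tablet, String cf, String cq) {
        SortedSet<String> seen = new TreeSet<>();
        for (Entry e : tablet) {
            if (e.cf.equals(cf) && e.cq.equals(cq)) {
                seen.add(e.value);
            }
        }
        return seen;
    }

    // Stage 2: client-side merge of the per-tablet results.
    static SortedSet<String> uniqueAcrossTablets(List<List<Entry>> tablets, String cf, String cq) {
        SortedSet<String> merged = new TreeSet<>();
        for (List<Entry> tablet : tablets) {
            merged.addAll(uniqueInTablet(tablet, cf, cq));
        }
        return merged;
    }

    public static void main(String[] args) {
        // The rows from the question, split arbitrarily so that "a" spans both tablets.
        List<Entry> tablet1 = Arrays.asList(
                new Entry("row0", "myCF", "myCQ", "a"),
                new Entry("row1", "myCF", "myCQ", "a"));
        List<Entry> tablet2 = Arrays.asList(
                new Entry("row2", "myCF", "myCQ", "a"),
                new Entry("row3", "myCF", "myCQ", "b"),
                new Entry("row4", "myCF", "myCQ", "c"),
                new Entry("row5", "myCF", "myCQ", "c"));

        System.out.println(uniqueAcrossTablets(Arrays.asList(tablet1, tablet2), "myCF", "myCQ"));
        // prints [a, b, c]
    }
}
```

Each tablet on its own would report "a", so without the merge in
stage 2 the client would see it twice. In a real deployment the
stage 1 part would be a scan restricted to the column (or a
MapReduce map phase), and only the merge would stay in client code.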

Christopher L Tubbs II

On Mon, Oct 14, 2013 at 12:06 PM, Korb, Michael [USA]
<> wrote:
> Given a specific CF and CQ, is there an iterator I can use to get all unique
> values across all rows?
> Example:
> row0 myCF:myCQ a
> row1 myCF:myCQ a
> row2 myCF:myCQ a
> row3 myCF:myCQ b
> row4 myCF:myCQ c
> row5 myCF:myCQ c
> I am interested in unique values associated with myCF:myCQ (irrelevant
> columns omitted from example).
> Result: a, b, c
> Thanks,
> Mike
