accumulo-user mailing list archives

From William Slacum <wilhelm.von.cl...@accumulo.net>
Subject Re: scan iterator that rolls up col vis
Date Wed, 02 Jul 2014 15:27:34 GMT
You should be able to roll up on keys with a condition similar to:

if( source.hasTop() ) {
  Key start = new Key(source.getTopKey()); // copy to avoid instance-reuse issues
  long count = 0;
  // Compare row, colfam and colqual only, so that cells differing just in
  // colvis fall into the same group.
  while( source.hasTop() && start.equals( source.getTopKey(),
      PartialKey.ROW_COLFAM_COLQUAL ) ) {
    count += deserialize(source.getTopValue()); // deserialize: your own decoder
    source.next();
  }
  Value new_top_value = serialize(count); // serialize: your own encoder
  // start can represent the top key of the iterator
}

We can flesh this out further if you run into issues. I think that we may
need to set the start key's timestamp to 0 so that it sorts after all the
other cells with a similar prefix.
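
To make the grouping concrete, here is a minimal plain-Java sketch of the same loop run against the example table from the original question. It does not use any Accumulo classes: Cell, visibleTo, sample and rollUp are hypothetical names made up for illustration, and a column visibility is treated as a single label rather than a full visibility expression. It assumes that cells the scan's authorizations cannot see have already been filtered out before the roll-up runs, and that the input is sorted so cells sharing (row, colfam, colqual) are adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ColVisRollup {
    /** Hypothetical stand-in for one key/value pair; not an Accumulo class. */
    static final class Cell {
        final String row, colFam, colQual, colVis;
        final long value;

        Cell(String row, String colFam, String colQual, String colVis, long value) {
            this.row = row;
            this.colFam = colFam;
            this.colQual = colQual;
            this.colVis = colVis;
            this.value = value;
        }

        /** True when row, colfam and colqual match; colvis is ignored on purpose. */
        boolean samePrefix(Cell other) {
            return row.equals(other.row)
                && colFam.equals(other.colFam)
                && colQual.equals(other.colQual);
        }
    }

    /** The example table, in sorted key order. */
    static List<Cell> sample() {
        return Arrays.asList(
            new Cell("student1", "TAKES_CLASS_WITH", "prof1", "COM_SCI_DEPT", 1),
            new Cell("student1", "TAKES_CLASS_WITH", "prof1", "COM_SCI_DEPT", 1),
            new Cell("student1", "TAKES_CLASS_WITH", "prof1", "MATH_DEPT", 1),
            new Cell("student1", "TAKES_CLASS_WITH", "prof1", "MATH_DEPT", 1),
            new Cell("student2", "TAKES_CLASS_WITH", "prof1", "COM_SCI_DEPT", 1),
            new Cell("student2", "TAKES_CLASS_WITH", "prof1", "MATH_DEPT", 1));
    }

    /** Simplified visibility check: keep cells whose single label is in auths. */
    static List<Cell> visibleTo(List<Cell> cells, Set<String> auths) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            if (auths.contains(c.colVis)) {
                out.add(c);
            }
        }
        return out;
    }

    /** Sums the values of adjacent cells that share (row, colfam, colqual). */
    static List<Cell> rollUp(List<Cell> sorted) {
        List<Cell> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            Cell start = sorted.get(i); // plays the role of the copied top key
            long count = 0;
            while (i < sorted.size() && start.samePrefix(sorted.get(i))) {
                count += sorted.get(i).value;
                i++;
            }
            // As in the loop above, the first cell of the group supplies the
            // output key, so it carries that cell's colvis.
            out.add(new Cell(start.row, start.colFam, start.colQual, start.colVis, count));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> dean = new HashSet<>(Arrays.asList("MATH_DEPT", "COM_SCI_DEPT"));
        for (Cell c : rollUp(visibleTo(sample(), dean))) {
            System.out.println(c.row + " " + c.colFam + " " + c.colQual + " " + c.value);
        }
        // prints:
        // student1 TAKES_CLASS_WITH prof1 4
        // student2 TAKES_CLASS_WITH prof1 2
    }
}
```

With only MATH_DEPT in the authorization set, the same roll-up gives 2 for student1 and 1 for student2, which matches the per-department totals in the question.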


On Tue, Jul 1, 2014 at 10:41 PM, Matthew Purdy <
mpurdy1973usergroups@gmail.com> wrote:

>
>
> USE CASE: at scan time only, I want a "summing combiner" that rolls
> up by (rowId, colfam, colqual) across all keys the client has
> visibility for.
>
> below is a simple example that expresses the use case.
>
> accumulo table holding student to professor relationship by departments
>
>
> +----------+------------------+-----------+--------------+-----+
> |  rowId   |       colfam     |  colqual  |    colvis    | val |
> +----------+------------------+-----------+--------------+-----+
> | student1 | TAKES_CLASS_WITH |  prof1    | MATH_DEPT    |   1 |
> | student1 | TAKES_CLASS_WITH |  prof1    | MATH_DEPT    |   1 |
> | student1 | TAKES_CLASS_WITH |  prof1    | COM_SCI_DEPT |   1 |
> | student1 | TAKES_CLASS_WITH |  prof1    | COM_SCI_DEPT |   1 |
> | student2 | TAKES_CLASS_WITH |  prof1    | MATH_DEPT    |   1 |
> | student2 | TAKES_CLASS_WITH |  prof1    | COM_SCI_DEPT |   1 |
> +----------+------------------+-----------+--------------+-----+
>
>
> with the summing combiner the results would be
>
> +----------+------------------+-----------+--------------+-----+
> |  rowId   |       colfam     |  colqual  |    colvis    | val |
> +----------+------------------+-----------+--------------+-----+
> | student1 | TAKES_CLASS_WITH |  prof1    | MATH_DEPT    |   2 |
> | student1 | TAKES_CLASS_WITH |  prof1    | COM_SCI_DEPT |   2 |
> | student2 | TAKES_CLASS_WITH |  prof1    | MATH_DEPT    |   1 |
> | student2 | TAKES_CLASS_WITH |  prof1    | COM_SCI_DEPT |   1 |
> +----------+------------------+-----------+--------------+-----+
>
> - the math department can only see math department totals
> - the com sci department can only see com sci department totals
> - the office of the dean has access to both
>
> therefore, when scanning (it wouldn't work for compaction), how
> can you sum over colvis?
>
> assuming you had access to both colvis values, the desired results would be:
>
> +----------+------------------+-----------+-----+
> |  rowId   |       colfam     |  colqual  | val |
> +----------+------------------+-----------+-----+
> | student1 | TAKES_CLASS_WITH |  prof1    |   4 |
> | student2 | TAKES_CLASS_WITH |  prof1    |   2 |
> +----------+------------------+-----------+-----+
>
>
>
