accumulo-dev mailing list archives

From Keith Turner <>
Subject Re: Column Pagination iterator
Date Wed, 22 Jul 2015 19:17:17 GMT
On Wed, Jul 22, 2015 at 2:22 PM, Russ Weeks <> wrote:

> Hey, folks,
> Any ideas how I might go about implementing a column pagination filter
> similar to HBase's [1]? Translated to Accumulo, this would be an iterator
> that skips the first m columns in a row and returns the next n columns.
> The catch as far as I can tell is that Accumulo could re-seek the iterator
> at any time, screwing up the internal count of how many columns have been
> seen. I guess the only way to resolve that would be to force every seek to
> start at the beginning of a row, and the filter logic would only pass a KV
> pair if it's in both the pagination range and the seek range.

An iterator will not be reseeked unless it has returned something.  So while
it is skipping the first M columns of a row, the iterator will not be torn
down and reseeked.  However, while it is returning the N columns, the
iterator could be torn down and reseeked.
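
To make the failure mode concrete, here is a toy model in plain Java. It is
only a sketch and does not use the Accumulo API: NaivePagingIterator, its
seek/next methods, and the teardownAfter parameter are all made up for
illustration, and column positions stand in for keys. The assumption it
encodes is that a rebuilt iterator is seeked just past the last key it
returned and starts with a fresh column count.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, not the Accumulo API): a paging
// iterator over the sorted columns of a single row that keeps an in-memory
// count of the columns it has seen since the last seek.
final class NaivePagingIterator {
    private final List<String> columns;
    private final int skip;   // M: columns to skip
    private final int limit;  // N: columns to return
    private int pos;          // index of the next column to read
    private int seen;         // columns seen since the last seek

    NaivePagingIterator(List<String> columns, int skip, int limit) {
        this.columns = columns;
        this.skip = skip;
        this.limit = limit;
    }

    // A seek repositions the iterator and, in this naive version, loses the count.
    void seek(int startIndex) {
        pos = startIndex;
        seen = 0;
    }

    // Returns the next column in the page, or null when the page is done.
    String next() {
        while (pos < columns.size()) {
            String column = columns.get(pos++);
            seen++;
            if (seen <= skip) continue;            // still skipping the first M
            if (seen > skip + limit) return null;  // already returned N
            return column;
        }
        return null;
    }

    // Scans the row; after teardownAfter results have been returned, the
    // iterator is rebuilt and seeked just past the last returned column.
    // Pass 0 to never tear down.
    static List<String> scan(List<String> columns, int skip, int limit, int teardownAfter) {
        List<String> out = new ArrayList<>();
        NaivePagingIterator it = new NaivePagingIterator(columns, skip, limit);
        it.seek(0);
        String column;
        while ((column = it.next()) != null) {
            out.add(column);
            if (out.size() == teardownAfter) {
                int resume = it.pos;
                it = new NaivePagingIterator(columns, skip, limit);
                it.seek(resume);
            }
        }
        return out;
    }
}
```

With columns a through h, M=2 and N=3, an undisturbed scan yields c, d, e.
If the iterator is rebuilt after c has been returned, the fresh count skips
d and e and the scan yields c, f, g, h instead.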

Since you are working within a row, there are two ways to avoid this.  You
can use an IsolatedScanner, which will prevent the iterator from being torn
down within a row.  Alternatively, you could wrap your special iterator
with a WholeRowIterator.

Curious: would seeking a scanner to the last row:column seen
(non-inclusive) and reading N columns from the scanner work?
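
A rough sketch of that idea, again in plain Java rather than against the
Accumulo API: a sorted set of "row:column" strings stands in for the table,
and ColumnPager and nextPage are hypothetical names. The client remembers
the last key it saw and asks for the next N keys strictly after it, so no
count has to survive inside an iterator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

// Toy model only (hypothetical names, not the Accumulo API): each entry is a
// "row:column" string, held in sorted order the way a scan would return keys.
final class ColumnPager {
    // Returns up to n keys of the given row that sort strictly after
    // lastSeen, or the first n keys of the row when lastSeen is null.
    static List<String> nextPage(NavigableSet<String> keys, String row, String lastSeen, int n) {
        String prefix = row + ":";
        NavigableSet<String> rest = (lastSeen == null)
                ? keys.tailSet(prefix, true)     // start of the row
                : keys.tailSet(lastSeen, false); // exclusive start: resume after lastSeen
        List<String> page = new ArrayList<>();
        for (String key : rest) {
            if (page.size() == n || !key.startsWith(prefix)) break; // page full or row ended
            page.add(key);
        }
        return page;
    }
}
```

Each call is independent of the previous one, so a teardown between pages
or within a page cannot disturb the result. The cost is that the client has
to carry the last key from page to page, and jumping straight to an
arbitrary page m is not possible without reading the pages before it.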

> This work is in the context of ACCUMULO-638 (and ATLAS-40) which I'll take
> ownership of as soon as I make a little more headway...
> 1:
