accumulo-dev mailing list archives

From Russ Weeks <rwe...@newbrightidea.com>
Subject Re: Column Pagination iterator
Date Thu, 23 Jul 2015 02:11:50 GMT
Thanks for your response, Keith. Your suggestion to implement paging by
refining the scan range makes a lot of sense. Maybe I'm just getting too
caught up in mirroring Titan's HBase adaptor; I wonder why they've
implemented it on the server side.
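
A minimal sketch of that client-side approach, assuming a single row modelled
as an in-memory sorted map rather than a real Accumulo scanner. The class and
method names (ColumnPager, nextPage) are hypothetical and exist only for
illustration. In Accumulo the analogous step would presumably be a scan Range
whose start key is the last key seen, marked exclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ColumnPager {
    /**
     * Returns up to pageSize column names that sort strictly after lastSeen.
     * Pass null as lastSeen to fetch the first page.
     * The map stands in for the sorted columns of one row (an assumption).
     */
    static List<String> nextPage(NavigableMap<String, String> row,
                                 String lastSeen, int pageSize) {
        // Model of "seek to the last column seen, non-inclusive".
        NavigableMap<String, String> rest =
                (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        List<String> page = new ArrayList<>();
        for (String column : rest.keySet()) {
            if (page.size() == pageSize) {
                break; // page is full; the caller resumes from its last entry
            }
            page.add(column);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String c : new String[] {"a", "b", "c", "d", "e"}) {
            row.put(c, "v-" + c);
        }
        String lastSeen = null;
        List<String> page;
        // Each request restarts from the last column returned, so no
        // server-side count has to survive between pages.
        while (!(page = nextPage(row, lastSeen, 2)).isEmpty()) {
            System.out.println(page);
            lastSeen = page.get(page.size() - 1);
        }
    }
}
```

Because each page is defined by its starting position rather than by a running
count, a re-seek between pages cannot corrupt the pagination state.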

I hadn't considered the IsolatedScanner; in fact, I've never used it before.
Can I ask what sort of black magic is happening in the tablet servers to
implement that isolation? Is it somehow snapshotting the tablet prior to
running the scan?

Regards,
-Russ

On Wed, Jul 22, 2015 at 12:17 PM Keith Turner <keith@deenlo.com> wrote:

> On Wed, Jul 22, 2015 at 2:22 PM, Russ Weeks <rweeks@newbrightidea.com>
> wrote:
>
> > Hey, folks,
> >
> > Any ideas how I might go about implementing a column pagination filter
> > similar to HBase's [1]? Translated to Accumulo, this would be an iterator
> > that skips the first m columns in a row and returns the next n columns.
> >
> > The catch as far as I can tell is that Accumulo could re-seek the iterator
> > at any time, screwing up the internal count of how many columns have been
> > seen. I guess the only way to resolve that would be to force every seek to
> > start at the beginning of a row, and the filter logic would only pass a KV
> > pair if it's in both the pagination range and the seek range.
> >
>
> An iterator will not be reseeked unless it returns something.  So when
> skipping the 1st M columns of a row, the iterator would not be torn down
> and reseeked.  However when returning the N columns, the iterator could be
> torn down and reseeked.
>
> Since you are working within a row, there are two ways to avoid this.   You
> can use an IsolatedScanner which will prevent the iterator from being torn
> down within a row.   Alternatively, you could wrap your special iterator
> with a WholeRowIterator.
>
> Curious, would seeking a scanner to the last row:column seen (non-inclusive)
> and reading N columns from the scanner work?
>
>
> >
> > This work is in the context of ACCUMULO-638 (and ATLAS-40) which I'll take
> > ownership of as soon as I make a little more headway...
> >
> > 1:
> > https://github.com/apache/hbase/blob/branch-1.0/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java
> >
>
