accumulo-user mailing list archives

From Christopher <ctubb...@apache.org>
Subject Re: Distinguishing between processed and unprocessed data in an Iterator
Date Wed, 01 Oct 2014 04:07:39 GMT
Without looking at the code, I can't recall the specific circumstances
where that might occur (maybe continueScan?), but no API guarantees are
made regarding that, so even if Accumulo itself doesn't do that today, it
could start doing so in a different version.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Tue, Sep 30, 2014 at 10:13 PM, Russ Weeks <rweeks@newbrightidea.com>
wrote:

> I see, thanks Christopher.
>
> For a lot of the iterators that my colleagues and I are thinking about,
> we'd be OK with constraints like, "only apply this iterator at scan time"
> and "don't stick other iterators on top of this one". But my understanding
> is that Accumulo itself, either on the tserver-side and/or the
> scanner-side, might arbitrarily re-seek any type of iterator at any time it
> chooses.
>
> -Russ
>
> On Tue, Sep 30, 2014 at 6:41 PM, Christopher <ctubbsii@apache.org> wrote:
>
>> On Tue, Sep 30, 2014 at 9:34 PM, Russ Weeks <rweeks@newbrightidea.com>
>> wrote:
>>
>>> > an iterator in the scan scope would be guaranteed to only see
>>> unprocessed data if the iterator has not been configured for minor
>>> compaction or major compaction scopes at all
>>>
>>> Excellent, thanks Christopher. That simplifies things. One more
>>> question: I understand that an iterator may be re-seeked at any point in
>>> its lifetime, which could cause it to see unprocessed data a second time. I
>>> assume this is true for scan-scope iterators as well?
>>>
>>> -Russ
>>>
>>>
>> Yes, it is true for scan scope also. A simple example would be another
>> user iterator that sits on top of yours that does a Cartesian product of
>> its data source.
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>

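The re-seek behavior discussed in this thread can be illustrated with a small simulation. This is only a sketch in plain Python, not Accumulo code: `SortedSource`, `CountingIterator`, and `cartesian_product` are hypothetical names invented for the example, and the `seek`/`has_top`/`top`/`next` methods only loosely mirror the shape of an Accumulo iterator. It shows how an iterator stacked on top that computes a Cartesian product of its source re-seeks the iterator below it, so the lower iterator sees each key more than once.

```python
from bisect import bisect_left


class SortedSource:
    """Toy stand-in for an iterator's data source: sorted keys with seek/next."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.pos = 0

    def seek(self, start):
        # Position at the first key >= start.
        self.pos = bisect_left(self.keys, start)

    def has_top(self):
        return self.pos < len(self.keys)

    def top(self):
        return self.keys[self.pos]

    def next(self):
        self.pos += 1


class CountingIterator:
    """Hypothetical user iterator that counts how often each key becomes its top."""

    def __init__(self, source):
        self.source = source
        self.seen = {}

    def seek(self, start):
        self.source.seek(start)
        self._record()

    def has_top(self):
        return self.source.has_top()

    def top(self):
        return self.source.top()

    def next(self):
        self.source.next()
        self._record()

    def _record(self):
        # Count the key currently exposed as the top entry, if any.
        if self.source.has_top():
            key = self.source.top()
            self.seen[key] = self.seen.get(key, 0) + 1


def cartesian_product(source, start):
    """Stacked iterator: reads all keys once, then re-seeks once per outer key."""
    source.seek(start)
    outer = []
    while source.has_top():
        outer.append(source.top())
        source.next()

    pairs = []
    for a in outer:
        source.seek(start)  # re-seek: the lower iterator sees every key again
        while source.has_top():
            pairs.append((a, source.top()))
            source.next()
    return pairs


counting = CountingIterator(SortedSource(["a", "b", "c"]))
pairs = cartesian_product(counting, "a")
print(len(pairs))     # number of (outer, inner) pairs produced
print(counting.seen)  # how many times each key was presented to the lower iterator
```

In this toy model every key is presented to the lower iterator once for the initial pass and once more for each outer key, so an iterator that assumes it sees each entry exactly once would miscount.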