accumulo-user mailing list archives

From James Hughes <jn...@virginia.edu>
Subject Re: Iterators that alter key-values
Date Fri, 15 May 2015 17:55:49 GMT
Hi Dave,

The big thing to note is that your iterator stack may get stopped and torn
down for various reasons.  When Accumulo recreates the stack, it calls
'seek' with a range that starts just after the last emitted key in order
to resume.

If you are returning keys out of order in an iterator, its 'seek' method
needs to be able to undo the transformation and call 'seek' on its source
appropriately.  That's not impossible, but it isn't trivial.
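
To make the resume problem concrete, here is a small simulation in plain
Python.  It is only a sketch and does not use the Accumulo API: the sample
data, the reversing transform, and the names (scan, transform, untransform,
smart_seek) are all made up for illustration.  It models a scan that is torn
down after every batch and resumed by seeking just past the last emitted key.

```python
from bisect import bisect_right

# Hypothetical source data, sorted by row as the underlying table would be.
DATA = [("ab", 1), ("ba", 2), ("ca", 3), ("cb", 4)]
ROWS = [row for row, _ in DATA]


def transform(row):
    # Illustrative key-altering step; reversing a row does not preserve order.
    return row[::-1]


def untransform(row):
    # Inverse of transform(); a "smart" seek needs something like this.
    return row[::-1]


def scan(batch_size, smart_seek, max_batches=20):
    """Simulate a scan that is torn down after each batch and then resumed.

    Returns (emitted, completed).  completed is False when the scan was
    still re-reading data after max_batches batches.
    """
    emitted = []
    last_key = None
    for _ in range(max_batches):
        if last_key is None:
            start = 0
        else:
            # Resume just after the last key the client saw.  A naive
            # iterator hands the transformed key straight to its source.
            resume = untransform(last_key) if smart_seek else last_key
            start = bisect_right(ROWS, resume)
        end = start + batch_size
        for row, value in DATA[start:end]:
            emitted.append((transform(row), value))
        if end >= len(DATA):
            return emitted, True  # source exhausted, scan finished
        last_key = emitted[-1][0]
    return emitted, False  # gave up: the scan never reached the end


if __name__ == "__main__":
    print(scan(batch_size=2, smart_seek=True))
    print(scan(batch_size=2, smart_seek=False)[1])
```

In this toy setup the naive version keeps seeking back to the same place and
re-emitting the same entries, while the version that inverts the
transformation before seeking finishes normally.  That only works here
because the transform is invertible; a lossy transformation would need some
other way to recover the source position.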

In GeoMesa, we did something like that at one point (without having a smart
'seek').  I enjoyed two days of debugging trying to figure out why
medium-sized requests would hang.  (There was an infinite loop...)  From
that experience, I'd suggest only modifying values.
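
As a rough sketch of why value-only changes are the safe case, the plain
Python below (not the Accumulo API; the data and the scan_values_only name
are hypothetical) tears a scan down after every batch and resumes just past
the last emitted key.  Because the emitted keys are the source keys, the
resume position is always exact.

```python
from bisect import bisect_right

# Hypothetical source data, sorted by row.
DATA = [("ab", 1), ("ba", 2), ("ca", 3), ("cb", 4)]
ROWS = [row for row, _ in DATA]


def scan_values_only(batch_size):
    """Tear the scan down after every batch, then resume past the last key."""
    emitted, start = [], 0
    while start < len(DATA):
        for row, value in DATA[start:start + batch_size]:
            emitted.append((row, value * 10))  # value changes, key does not
        # The last emitted key is a real source key, so this seek is exact.
        start = bisect_right(ROWS, emitted[-1][0])
    return emitted
```

Whatever batch size is used, the output is the same as an uninterrupted
scan, with no duplicates and no skipped entries.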

Cheers,

Jim


On Fri, May 15, 2015 at 1:26 PM, Dave Hardcastle <hardcastle.dave@gmail.com>
wrote:

> Hi,
>
> I've always assumed that the last iterator in the stack can make arbitrary
> changes to keys and values, including not returning the keys in sorted
> order. I know that SortedKeyValueIterator says that "anything implementing
> this interface should return keys in sorted order" - but I don't see a good
> reason that has to be true for the final iterator. This assumption seems to
> be backed up by the manual which says that "the only safe way to generate
> additional data in an iterator is to alter the current key-value pair" - it
> doesn't say that making arbitrary modifications to the rowkey or key is
> forbidden.
>
> I have a situation where I am making a transformation of the rowkey that
> may not preserve the ordering of the keys. When I scan for individual
> ranges I get the correct results. When I scan for two ranges using a
> BatchScanner, I get lots of data back which is not in the ranges I queried
> for. I am not explicitly checking that I have not gone beyond the range,
> but that should not be necessary as I am not doing any seeking, only
> consuming the key-values I receive.
>
> So, my main question is whether the last iterator is allowed to not return
> keys in sorted order?
>
> Thanks,
>
> Dave.
>
