accumulo-user mailing list archives

From Russ Weeks <>
Subject Re: Seeking Iterator
Date Fri, 09 Jan 2015 23:48:20 GMT
Hi, Eugene,

I think the conventional approach is to decompose your search area
(bounding box?) into a set of scan ranges that introduce minimal extraneous
curve segments, and then pass all those scan ranges into a BatchScanner.
The excellent Accumulo Recipes site has an example[1]. Does this approach
not work for you?
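
As a rough sketch of that decomposition (this is not the Accumulo Recipes code, and the class and method names are made up for illustration): assuming a 2-D Z-curve over non-negative integer grid coordinates where x takes the even bit positions, a quadtree walk can emit one contiguous curve range per fully covered cell and merge neighbours that touch. Each resulting [start, end] pair would then have to be encoded the same way as your row ids and wrapped in a Range before being handed to BatchScanner.setRanges().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a bounding box into inclusive Z-curve ranges.
class ZRanges {

    // Interleave the bits of x and y; x occupies even bit positions, y odd ones.
    static long interleave(int x, int y) {
        long z = 0;
        for (int i = 0; i < 31; i++) {
            z |= ((long) ((x >> i) & 1)) << (2 * i);
            z |= ((long) ((y >> i) & 1)) << (2 * i + 1);
        }
        return z;
    }

    // Returns inclusive {start, end} Z ranges that cover exactly the box
    // [xMin, xMax] x [yMin, yMax] on a grid of side 2^bits (bits <= 30).
    static List<long[]> decompose(int xMin, int yMin, int xMax, int yMax, int bits) {
        List<long[]> out = new ArrayList<>();
        visit(0, 0, bits, xMin, yMin, xMax, yMax, out);
        return out;
    }

    private static void visit(int x0, int y0, int level,
                              int xMin, int yMin, int xMax, int yMax,
                              List<long[]> out) {
        int size = 1 << level;
        int x1 = x0 + size - 1;
        int y1 = y0 + size - 1;
        // Cell lies entirely outside the query box: nothing to scan.
        if (x1 < xMin || x0 > xMax || y1 < yMin || y0 > yMax) {
            return;
        }
        // Cell lies entirely inside: its points form one contiguous curve segment.
        if (x0 >= xMin && x1 <= xMax && y0 >= yMin && y1 <= yMax) {
            long start = interleave(x0, y0);
            long end = start + (1L << (2 * level)) - 1;
            if (!out.isEmpty() && out.get(out.size() - 1)[1] + 1 == start) {
                out.get(out.size() - 1)[1] = end; // merge with the previous range
            } else {
                out.add(new long[] {start, end});
            }
            return;
        }
        // Partial overlap: recurse into the four children in Z order.
        int half = size >> 1;
        visit(x0, y0, level - 1, xMin, yMin, xMax, yMax, out);
        visit(x0 + half, y0, level - 1, xMin, yMin, xMax, yMax, out);
        visit(x0, y0 + half, level - 1, xMin, yMin, xMax, yMax, out);
        visit(x0 + half, y0 + half, level - 1, xMin, yMin, xMax, yMax, out);
    }
}
```

This version recurses all the way down to single cells, so it never scans extraneous segments but can return many small ranges. Stopping the recursion at some minimum level and emitting the whole partially covered cell instead would trade a few extra curve segments for fewer ranges.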

In general, your custom iterator should never try to seek to a row id
different from the current row id, because that row could be hosted by a
different tablet server.



On Fri, Jan 9, 2015 at 2:37 PM, Eugene Cheipesh <> wrote:

> Hello,
> I am attempting to write an Iterator based on a Z-curve index to search
> through multi-dimensional data. Essentially, given a record that I have
> encountered that is in the index range but not in the multi-dimensional
> query range, I have a way to generate the next candidate record,
> potentially far ahead of the current point.
> Ideally I would be able to refine my search range with subsequent calls to
> seek(). It appears that Accumulo will create an iterator for every RFile
> (or some other split point). The beginning of the range argument to
> seek() will be the record at the beginning of this split (which is good);
> however, all instances of the iterator have the same global range end
> (which is bad).
> I need to avoid the case where I seek past the range boundary of each
> individual iterator instance and throw a NullPointerException. Is there any
> way to get enough information to achieve this?
> Thank you,
> --
> Eugene Cheipesh
