accumulo-user mailing list archives

From Josh Elser
Subject Re: Abnormal behaviour of custom iterator in getting entries
Date Mon, 15 Jun 2015 15:51:53 GMT
It's hard to remotely debug an iterator, especially when we don't know 
what it's doing. If you can post the code, that would help tremendously. 
Instead of dumping values to a text file, you may fare better by 
attaching a remote debugger to the TabletServer and setting a breakpoint 
on your SKVI.

The only thing I can say is that a Scanner and BatchScanner should 
return the same data, but the invocations in the server to fetch that 
data are performed differently. It's likely that due to the differences 
in the implementations, you uncovered a bug in your iterator.

One common pitfall is incorrectly handling something we refer to as a 
"re-seek". Hypothetically, take a query scanning over [0, 9], and we 
have one key per number in the range (10 keys).

As the name implies, the BatchScanner fetches batches from a server, and 
suppose that after 3 keys, the server-side buffer fills up. Thus, the 
client will get keys [0,2]. In the server, the next time you fetch a 
batch, a new instance of the iterator will be constructed (via 
deepCopy()). seek() will then be called, but with a new range that 
excludes the data that was already returned. Thus, your 
iterator would be seeked with (2,9] instead of [0,9] again.
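
To make the re-seek case concrete, here is a minimal sketch in plain Java. It does not use the Accumulo API: ToyIterator, RangeAwareIterator, SkipFirstIterator and scan() are hypothetical names made up for this illustration, and the batch size of 3 is just the number from the example above. It only models the idea that each batch gets a fresh iterator instance that is seeked to a range starting after the last key returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Supplier;

class ReseekDemo {
    // One key per number in [0, 9], as in the example above.
    static final NavigableSet<Integer> KEYS =
        new TreeSet<>(Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9));

    /** Toy stand-in for a server-side iterator (hypothetical, not the real SKVI interface). */
    interface ToyIterator {
        void seek(int start, boolean startInclusive, int end);
        boolean hasTop();
        int getTopKey();
        void next();
    }

    /** Honors whatever range it is given, so it behaves the same after a re-seek. */
    static class RangeAwareIterator implements ToyIterator {
        Integer top;
        int end;

        public void seek(int start, boolean startInclusive, int end) {
            this.end = end;
            // [start, ...] versus (start, ...]
            top = startInclusive ? KEYS.ceiling(start) : KEYS.higher(start);
        }
        public boolean hasTop() { return top != null && top <= end; }
        public int getTopKey() { return top; }
        public void next() { top = KEYS.higher(top); }
    }

    /** Buggy: does "once per scan" work in seek(), here dropping the first entry. */
    static class SkipFirstIterator extends RangeAwareIterator {
        @Override
        public void seek(int start, boolean startInclusive, int end) {
            super.seek(start, startInclusive, end);
            if (hasTop()) {
                next(); // wrong assumption: seek() is only called at the start of the scan
            }
        }
    }

    /** Reads [start, end] in batches; every batch uses a new iterator and a re-seek. */
    static List<Integer> scan(Supplier<ToyIterator> factory, int start, int end, int batchSize) {
        List<Integer> out = new ArrayList<>();
        int seekStart = start;
        boolean inclusive = true;
        while (true) {
            ToyIterator it = factory.get(); // no state survives from the previous batch
            it.seek(seekStart, inclusive, end);
            int n = 0;
            while (it.hasTop() && n < batchSize) {
                out.add(it.getTopKey());
                it.next();
                n++;
            }
            if (n < batchSize) {
                return out; // the range is exhausted
            }
            seekStart = out.get(out.size() - 1); // last key handed to the client
            inclusive = false;                   // so the next range is (last, end]
        }
    }

    public static void main(String[] args) {
        int oneSeek = Integer.MAX_VALUE; // batch large enough that no re-seek happens
        System.out.println(scan(RangeAwareIterator::new, 0, 9, 3));       // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(scan(SkipFirstIterator::new, 0, 9, oneSeek));  // [1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(scan(SkipFirstIterator::new, 0, 9, 3));        // [1, 2, 3, 5, 6, 7, 9]
    }
}
```

The range-aware iterator returns the same keys however the scan is batched. The iterator that does extra work in seek() returns different results depending on where the batch boundaries fall, which is the kind of difference that can show up between two scanner implementations.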

I can't say whether or not you're actually hitting this case, but it's a 
common pitfall that affects devs.

madhvi wrote:
> @josh
> If seek were called after hasTop and getTopKey, then it should
> also appear in the call hierarchy,
> because I have written the whole function call hierarchy to a file.
> so the problem if i have called myFunction() in seek.
> After seek, getTopKey and getTopValue, then hasTop and next should be
> called, but what happens is that getTopValue is sometimes called and
> sometimes not. This happens when I read entries through a BatchScanner.
> getTopValue is called when scanning through a Scanner. Applying the
> same iterator with a Scanner and a BatchScanner, the Scanner returns
> entries but the BatchScanner returns none.
> So can you please explain?
