accumulo-user mailing list archives

From madhvi <>
Subject Re: Abnormal behaviour of custom iterator in getting entries
Date Tue, 16 Jun 2015 04:42:35 GMT
Thanks Josh.

Outline of my code is:

public class TestIterator extends WrappingIterator {

    HashMap<String, Integer> holder = new HashMap<>();
    private Iterator<Map.Entry<String, Integer>> entries = null;
    private Entry<String, Integer> entry = null;
    private Key emitKey;
    private Value emitValue;

    public void seek(Range range, Collection<ByteSequence> columnFamilies,
            boolean inclusive) throws IOException {
        super.seek(range, columnFamilies, inclusive);

        // matched the condition and put values to holder map.
        entries = holder.entrySet().iterator(); // iterate the map holder.
    }

    public Key getTopKey() {
        return emitKey;
    }

    public Value getTopValue() {
        return emitValue;
    }

    public boolean hasTop() {
        return entries.hasNext();
    }

    public void next() throws IOException {
        try {
            entry = entries.next();
            // put the keys of the map in the row ID and the values of the
            // map in the column qualifier through emitKey
            emitKey = new Key(new Text(entry.getKey()), new Text(),
                    new Text(String.valueOf(entry.getValue())));
            // return 1 in emitValue.
            emitValue = new Value("1".getBytes());
        } catch (Exception e) {
            // exception handling not shown in this outline
        }
    }
}
This code returns results when I use a Scanner, but not in the case of a
BatchScanner.
Also, how do I enable a remote debugger in Accumulo?
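
A note on the outline above, offered as a sketch rather than a diagnosis. In the SortedKeyValueIterator contract, a caller checks hasTop() and reads getTopKey()/getTopValue() before it calls next(). If the outline is complete, emitKey is assigned only in next(), so nothing is loaded right after seek(), and hasTop() turns false as soon as the last map entry has been pulled, before that entry has been returned. The plain-Java stand-in below shows one way to keep a "current top" instead. It does not use the Accumulo API: the class name TopEntryDemo, the String keys and values, and the TreeMap (chosen for a deterministic order) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the iterator outline; hypothetical, not Accumulo code.
final class TopEntryDemo {
    private final Map<String, Integer> holder = new TreeMap<>();
    private Iterator<Map.Entry<String, Integer>> entries;
    private String topKey;   // stands in for emitKey
    private String topValue; // stands in for emitValue

    // Stand-in for seek(): rebuild the holder, then load the first entry
    // immediately so that hasTop()/getTopKey() are valid right after seek.
    void seek(Map<String, Integer> matched) {
        holder.clear();
        holder.putAll(matched);
        entries = holder.entrySet().iterator();
        next();
    }

    // True while a loaded entry is waiting to be read.
    boolean hasTop() {
        return topKey != null;
    }

    String getTopKey() {
        return topKey;
    }

    String getTopValue() {
        return topValue;
    }

    // Advance to the next entry, or clear the top when the map is exhausted.
    void next() {
        if (entries.hasNext()) {
            Map.Entry<String, Integer> e = entries.next();
            topKey = e.getKey() + ":" + e.getValue();
            topValue = "1";
        } else {
            topKey = null;
            topValue = null;
        }
    }

    // Drives the iterator in the order a caller would:
    // hasTop, getTopKey/getTopValue, next.
    static List<String> drain(TopEntryDemo it) {
        List<String> out = new ArrayList<>();
        while (it.hasTop()) {
            out.add(it.getTopKey() + "=" + it.getTopValue());
            it.next();
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> matched = new TreeMap<>();
        matched.put("a", 2);
        matched.put("b", 5);
        TopEntryDemo it = new TopEntryDemo();
        it.seek(matched);
        System.out.println(drain(it)); // prints [a:2=1, b:5=1]
    }
}
```

With this shape, every entry placed in the holder is returned exactly once, including the last one, and an empty holder gives hasTop() == false straight after seek.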


On Monday 15 June 2015 09:21 PM, Josh Elser wrote:
> It's hard to remotely debug an iterator, especially when we don't know 
> what it's doing. If you can post the code, that would help 
> tremendously. Instead of dumping values to a text file, you may fare 
> better by attaching a remote debugger to the TabletServer and setting 
> a breakpoint on your SKVI.
>
> The only thing I can say is that a Scanner and BatchScanner should 
> return the same data, but the invocations in the server to fetch that 
> data are performed differently. It's likely that due to the 
> differences in the implementations, you uncovered a bug in your iterator.
>
> One common pitfall is incorrectly handling something we refer to as a 
> "re-seek". Hypothetically, take a query scanning over [0, 9], and we 
> have one key per number in the range (10 keys).
>
> As the name implies, the BatchScanner fetches batches from a server, 
> and suppose that after 3 keys, the server-side buffer fills up. Thus, 
> the client will get keys [0,2]. In the server, the next time you fetch 
> a batch, a new instance of the iterator will be constructed (via 
> deepCopy()). seek() will then be called, but with a new range that 
> excludes the data that was already returned. Thus, your 
> iterator would be seeked with (2,9] instead of [0,9] again.
>
> I can't say whether or not you're actually hitting this case, but it's 
> a common pitfall that affects devs.
> madhvi wrote:
>> @josh
>> If seek had been called after hasTop and getTopKey, it would also show
>> up in the call hierarchy, because I write every function call to a
>> file.
>> So is the problem that I call myFunction() in seek?
>> After seek, getTopKey and getTopValue and then hasTop and next should
>> be called, but what happens is that getTopValue is sometimes called and
>> sometimes not. This happens when I read entries through a BatchScanner;
>> with a Scanner, getTopValue is called. Applying the same iterator with
>> a Scanner and a BatchScanner, I get entries back through the Scanner
>> but none through the BatchScanner.
>> So can you please explain?
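
The re-seek described in the quoted reply (keys 0 through 9, a buffer that fills after 3 keys, and a second seek over (2,9]) can be simulated without Accumulo. The sketch below is only a model of that behaviour: ReseekDemo, fetchBatch, scanWithReseeks, and the use of a TreeMap as the "table" are hypothetical names and choices, not Accumulo APIs. It shows that each follow-up batch starts from an exclusive bound at the last key returned, so any state an iterator builds in seek() has to be correct for those narrower ranges as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of batched scanning with re-seeks; not Accumulo code.
final class ReseekDemo {

    // Returns at most batchSize keys from the range that starts at 'start'
    // (inclusive or exclusive) and ends at 'end' (inclusive).
    static List<Integer> fetchBatch(NavigableMap<Integer, String> data,
            int start, boolean startInclusive, int end, int batchSize) {
        List<Integer> batch = new ArrayList<>();
        for (Integer key : data.subMap(start, startInclusive, end, true).keySet()) {
            if (batch.size() == batchSize) {
                break; // the "server-side buffer" is full
            }
            batch.add(key);
        }
        return batch;
    }

    // Scans [start, end] in batches. After each batch the range is rebuilt
    // with an exclusive start at the last key returned, as in a re-seek.
    static List<Integer> scanWithReseeks(NavigableMap<Integer, String> data,
            int start, int end, int batchSize) {
        List<Integer> all = new ArrayList<>();
        int from = start;
        boolean inclusive = true;
        while (true) {
            List<Integer> batch = fetchBatch(data, from, inclusive, end, batchSize);
            if (batch.isEmpty()) {
                break;
            }
            all.addAll(batch);
            from = batch.get(batch.size() - 1); // last key already returned
            inclusive = false;                  // next range is (from, end]
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> data = new TreeMap<>();
        for (int i = 0; i <= 9; i++) {
            data.put(i, "v" + i);
        }
        // Batches: [0,1,2], then (2,9] gives [3,4,5], and so on.
        System.out.println(scanWithReseeks(data, 0, 9, 3));
        // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

The model only returns the right result because the keys come back in sorted order and each batch depends on nothing but the range it is given; an iterator that emits keys in an unspecified order, or that assumes seek() always sees the original range, would not have that property.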
