accumulo-user mailing list archives

From Adam Fuchs
Subject Re: ROW ID Iterator - sanity check
Date Sun, 20 May 2012 17:57:18 GMT
Because you changed the iterator method to create a new RowIdIterator based
on the old scanner, and the old scanner remembers its scan iterator
configuration, every call to that method repeats the call to
addScanIterator with the same name, which is why the name is reported as
already in use. I would instead configure the scanner outside of
RowIdIterator, before you construct the first one.
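
To illustrate the configure-once pattern without a running cluster, here is a
minimal sketch. FakeScanner and RowIds are hypothetical stand-ins invented for
this example (they are not Accumulo classes); the stand-in only mimics the
"name is already in use" check that appears in the stack trace quoted below.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a scanner; NOT the Accumulo Scanner class.
class FakeScanner {
  private final Set<String> iteratorNames = new HashSet<>();

  // Mimics the reported behaviour: registering a duplicate name is rejected.
  void addScanIterator(String name) {
    if (!iteratorNames.add(name)) {
      throw new IllegalArgumentException("Iterator name is already in use " + name);
    }
  }

  int iteratorCount() {
    return iteratorNames.size();
  }
}

// Hypothetical wrapper that assumes the scanner was configured by the caller.
class RowIds {
  private final FakeScanner scanner;

  RowIds(FakeScanner scanner) {
    this.scanner = scanner; // no addScanIterator call in the constructor
  }

  // Creating another wrapper over the same scanner is now harmless.
  RowIds iterate() {
    return new RowIds(scanner);
  }
}

class ConfigureOnceSketch {
  static int run() {
    FakeScanner scanner = new FakeScanner();
    scanner.addScanIterator("SKI98"); // configure once, outside the wrapper
    RowIds ids = new RowIds(scanner);
    ids.iterate();
    ids.iterate();
    return scanner.iteratorCount();
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

With the real client API the same shape should apply: build the
IteratorSetting and add it to the scanner once, then hand the already
configured scanner to RowIdIterator.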

SortedKeyValueIterator is the basic interface that we use for server-side
iterator implementation. Every iterator that operates in the "iterator
tree" is a SortedKeyValueIterator. Bill was saying that you could write
your own iterator and add it to that iterator tree to take advantage of the
extra functionality that exists on the server side.

If you were to write a SortedKeyValueIterator, you would probably start out
with a WrappingIterator and override the next() and seek() methods so that
they can skip way ahead when you ask for the next row. Here's what that
would look like:

import java.io.IOException;
import java.util.Collection;

import org.apache.accumulo.core.data.ByteSequence;
import org.apache.accumulo.core.data.PartialKey;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.WrappingIterator;

public class RowEnumerationIterator extends WrappingIterator {

  boolean notFinished = false;
  Range originalRange;
  Collection<ByteSequence> originalColumns;
  boolean originalColumnsInclusive;

  @Override
  public void seek(Range r, Collection<ByteSequence> columns, boolean columnsInclusive) throws IOException {
    notFinished = true;
    // keep track of the original seek parameters so that we can reference
    // them when we reseek later
    originalRange = r;
    originalColumns = columns;
    originalColumnsInclusive = columnsInclusive;
    super.seek(r, columns, columnsInclusive);
  }

  @Override
  public boolean hasTop() {
    // check our local state first, then defer to the super class
    return notFinished && super.hasTop();
  }

  @Override
  public void next() throws IOException {
    // create a range starting at the next possible row and continuing to
    // the end of the table
    Range followingRange = new Range(getTopKey().followingKey(PartialKey.ROW), true, null, false);
    // intersect that new range with the original range given to our seek
    Range intersectedRange = originalRange.clip(followingRange, true);
    // check to see if we're past the end of the original range
    if (intersectedRange == null)
      notFinished = false;
    else
      // otherwise jump the source straight to the start of the next row
      getSource().seek(intersectedRange, originalColumns, originalColumnsInclusive);
  }

  Value emptyValue = new Value(new byte[0]);

  @Override
  public Value getTopValue() {
    // replace the value with an empty value to save bandwidth
    return emptyValue;
  }
}

You'll need to add this class to the dynamic classpath (i.e. put it in a
jar in the lib/ext directory of all the tablet servers), and then reference
it like you did the SortedKeyIterator below.
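
The skip-ahead idea behind next() can also be tried out locally. The sketch
below is only an analogy, not Accumulo code: it assumes a java.util.TreeMap as
the sorted store and a made-up key layout of row, a '\0' separator, then
column, with row names that never contain the separator. RowSkipSketch, key()
and distinctRows() are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class RowSkipSketch {
  // Assumed separator between row and column; rows must not contain it.
  static final char SEP = '\0';

  // Builds a key in the assumed "row SEP column" layout.
  static String key(String row, String column) {
    return row + SEP + column;
  }

  static String rowOf(String key) {
    return key.substring(0, key.indexOf(SEP));
  }

  // Lists each row once by seeking past the rest of the current row
  // instead of stepping through every column in it.
  static List<String> distinctRows(TreeMap<String, String> table) {
    List<String> rows = new ArrayList<>();
    String current = table.isEmpty() ? null : table.firstKey();
    while (current != null) {
      String row = rowOf(current);
      rows.add(row);
      // row + '\u0001' sorts after every "row SEP column" key, so
      // ceilingKey lands on the first key of the following row (or null).
      current = table.ceilingKey(row + '\u0001');
    }
    return rows;
  }

  public static void main(String[] args) {
    TreeMap<String, String> table = new TreeMap<>();
    table.put(key("a", "x"), "1");
    table.put(key("a", "y"), "2");
    table.put(key("b", "x"), "3");
    System.out.println(distinctRows(table));
  }
}
```

The ceilingKey call plays the role that reseeking the source with the
following-row range plays in the iterator: one jump per row rather than one
step per entry.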


On Sun, May 20, 2012 at 12:49 PM, David Medinets wrote:

> Searching through the source for SortedKeyIterator shows that it is
> used in 15 files. The FindMax class seems to be a fine example of its
> use:
>    IteratorSetting cfg = new IteratorSetting(Integer.MAX_VALUE,
> SortedKeyIterator.class);
>    scanner.addScanIterator(cfg);
> That seems simple enough, but when I change my code accordingly I get a
> message:
>  Exception in thread "main" java.lang.IllegalArgumentException:
> Iterator name is already in use SKI98
>        at
> org.apache.accumulo.core.client.impl.ScannerOptions.addScanIterator(
>        at com.codebits.accumulo.RowIdIterator.<init>(
> My code change was trivial:
>        Iterator<Entry<Key, Value>> iterator = null;
>        public RowIdIterator(Scanner scanner) {
>                super();
>                this.scanner = scanner;
>             IteratorSetting cfg = new IteratorSetting(Integer.MAX_VALUE,
> "SKI98", SortedKeyIterator.class);
> 22 -->      scanner.addScanIterator(cfg);
>                this.iterator = scanner.iterator();
>        }
>        @Override
>        public String next() {
>                 Entry<Key, Value> entry = iterator.next();
>                 return entry.getKey().getRow().toString();
>        }
> As you can see, its name is unlikely to be in use.