accumulo-dev mailing list archives

From "Keith Turner (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ACCUMULO-403) Create general row selection iterator
Date Wed, 15 Feb 2012 16:38:59 GMT
Create general row selection iterator
-------------------------------------

                 Key: ACCUMULO-403
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-403
             Project: Accumulo
          Issue Type: New Feature
          Components: client, tserver
            Reporter: Keith Turner
            Assignee: Billie Rinaldi
             Fix For: 1.5.0


The WholeRowIterator supports filtering rows that meet certain criteria.  However, it reads
the entire row into memory.  It is possible to select rows efficiently without reading them into
memory by using two iterators: one for selection and one for reading.  When the selection
iterator determines that a row is not needed, seek the read iterator past that row.
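
The two-iterator pattern described above can be illustrated without any Accumulo classes. The following is only a minimal sketch under stated assumptions: a `TreeMap` whose keys are `row\0column` strings stands in for the sorted table, exact `containsKey` lookups stand in for seeks on the selection iterator, and `ceilingKey` stands in for seeking the read iterator past a row. The names `RowSelectionSketch`, `hasFooAndBar`, `selectRows`, and `sampleTable` are hypothetical and not part of Accumulo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.BiPredicate;

// Stand-in for a sorted table: keys are "row\0column", values are strings.
// None of this is Accumulo API; it only models the seek-based access pattern.
class RowSelectionSketch {
    static final char SEP = '\u0000';

    static String key(String row, String column) {
        return row + SEP + column;
    }

    static String rowOf(String key) {
        return key.substring(0, key.indexOf(SEP));
    }

    // Selection check: two exact lookups ("seeks") instead of scanning the whole row.
    static boolean hasFooAndBar(NavigableMap<String, String> table, String row) {
        return table.containsKey(key(row, "bar")) && table.containsKey(key(row, "foo"));
    }

    // Read pass: emit selected rows entry by entry and jump past every row once decided.
    // The result list exists only so the demo can show what was emitted.
    static List<String> selectRows(NavigableMap<String, String> table,
                                   BiPredicate<NavigableMap<String, String>, String> selector) {
        List<String> out = new ArrayList<>();
        String cursor = table.isEmpty() ? null : table.firstKey();
        while (cursor != null) {
            String row = rowOf(cursor);
            // Smallest string that sorts after every "row\0column" key of this row
            // (assumes row names contain no control characters).
            String afterRow = row + (char) (SEP + 1);
            if (selector.test(table, row)) {
                for (Map.Entry<String, String> e
                        : table.subMap(cursor, true, afterRow, false).entrySet()) {
                    out.add(e.getKey().replace(SEP, ':') + "=" + e.getValue());
                }
            }
            // Seek the read cursor to the first key of the next row.
            cursor = table.ceilingKey(afterRow);
        }
        return out;
    }

    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> t = new TreeMap<>();
        t.put(key("r1", "bar"), "1");
        t.put(key("r1", "foo"), "2");
        t.put(key("r1", "baz"), "3");
        t.put(key("r2", "foo"), "4");
        t.put(key("r3", "bar"), "5");
        t.put(key("r3", "foo"), "6");
        return t;
    }

    public static void main(String[] args) {
        System.out.println(selectRows(sampleTable(), RowSelectionSketch::hasFooAndBar));
    }
}
```

With the sample data, row `r2` lacks `bar`, so the read cursor jumps from `r1` straight to `r3` without visiting the entries of `r2`. In a real iterator the selection and read sources would presumably be separate copies of the same underlying source so that seeking one does not disturb the other.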

This pattern could be made into an easy-to-use iterator that users extend.  The iterator could
have an abstract method that users implement to decide whether to select or filter out a row.
It could look something like the following.


{noformat}

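abstract class RowSelectionIterator extends WrappingIterator {

   // Return true to select (keep) the row, false to filter it out.
   public abstract boolean selectRow(SortedKeyValueIterator<Key,Value> row) throws IOException;

}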

{noformat}


Below is a simple example of a row selection iterator that returns rows that have the columns
foo and bar.


{noformat}

class FooBarRowSelector extends RowSelectionIterator {
   public boolean selectRow(SortedKeyValueIterator<Key,Value> row) throws IOException {

      Text rowId = row.getTopKey().getRow();
      Set<ByteSequence> noFamilies = Collections.emptySet();

      // Seek instead of scanning; this is more efficient for large rows with lots of columns.
      // If the row only has a few columns, scanning is probably faster. Seeking the
      // columns in sorted order is also more efficient.
      row.seek(Range.exact(rowId, new Text("bar")), noFamilies, false);
      boolean sawBar = row.hasTop();

      row.seek(Range.exact(rowId, new Text("foo")), noFamilies, false);
      boolean sawFoo = row.hasTop();

      return sawBar && sawFoo;
   }
}

{noformat}
