accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-759) remove priority setting for scan-time iterators
Date Mon, 10 Sep 2012 17:25:07 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452156#comment-13452156 ]

Keith Turner commented on ACCUMULO-759:
---------------------------------------

One other thought I had was to use negative priorities, though I thought this was kinda ugly.
Negative numbers are currently not allowed.  We could continue to disallow negative numbers
for table iterators and allow them for scan-time iterators.  Negative priorities would always
be interpreted on the server side as "after all table iterators".  If negative numbers were
used, we would want to do that in such a way that the user never entered a negative number.

We could add the following to IteratorSetting:

{code:java}
  public static final int MAX_TABLE_PRIORITY = Integer.MAX_VALUE;
{code}

The code example I gave above would change to the following.

{code:java}
//assume conn and scanner are initialized somehow, just want to show their type
   Connector conn;
   Scanner scanner;

   //int addition wraps around, so this is effectively Integer.MIN_VALUE
   int tableMax = IteratorSetting.MAX_TABLE_PRIORITY + 1;
   
   scanner.addScanIterator(new IteratorSetting(tableMax++, "foo1", "org.bar.FooIter"));
   scanner.addScanIterator(new IteratorSetting(tableMax++, "foo2", "org.bar.BarIter"));
{code}
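
To make the arithmetic above concrete, here is a minimal standalone sketch. It is not Accumulo code: the class `PriorityOrderSketch` and its methods are hypothetical names used only for illustration. It shows that `MAX_TABLE_PRIORITY + 1` wraps to `Integer.MIN_VALUE`, and one possible way a server could read negative priorities as "after all table iterators", namely by comparing priorities as unsigned integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Accumulo.
final class PriorityOrderSketch {

    // Mirrors the constant proposed for IteratorSetting above.
    static final int MAX_TABLE_PRIORITY = Integer.MAX_VALUE;

    // Java int addition wraps on overflow, so this returns Integer.MIN_VALUE.
    static int firstScanPriority() {
        return MAX_TABLE_PRIORITY + 1;
    }

    // Assumption: order priorities as unsigned values. Every non-negative
    // (table) priority then sorts before every negative (scan-time) priority,
    // and scan-time priorities keep their increasing order among themselves.
    static List<Integer> applicationOrder(List<Integer> priorities) {
        List<Integer> sorted = new ArrayList<>(priorities);
        sorted.sort(Integer::compareUnsigned);
        return sorted;
    }

    public static void main(String[] args) {
        int scan = firstScanPriority();
        // Two scan-time priorities mixed with two table priorities (10 and 20).
        System.out.println(applicationOrder(Arrays.asList(scan + 1, 20, scan, 10)));
    }
}
```

With unsigned comparison the two table priorities come out first and the two scan-time priorities follow in the order they were assigned, which is the interpretation described above. Whether the server would actually sort this way is an open design choice, not something this comment settles.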

The above is ugly, but I just want to show my thought process.  I think the code below is
much less offensive from a user perspective.  It does something screwy with negative numbers
behind the scenes, but that is hidden from the user.

{code:java}
class ScanIteratorSetting extends IteratorSetting {

   //first scan-time iterator, runs after all table iterators
   public ScanIteratorSetting(String name, String iteratorClass) {
     super(Integer.MIN_VALUE, name, iteratorClass);
   }

   //runs directly after the given scan-time iterator
   public ScanIteratorSetting(ScanIteratorSetting predecessor, String name, String iteratorClass) {
     super(predecessor.getPriority() + 1, name, iteratorClass);
   }
}
{code}

So now the code would look like this.

{code:java}
//assume conn and scanner are initialized somehow, just want to show their type
   Connector conn;
   Scanner scanner;

   //comes after all table iterators
   ScanIteratorSetting is1 = new ScanIteratorSetting("foo1", "org.bar.FooIter");
   scanner.addScanIterator(is1);

   //comes after all table iterators and after foo1
   ScanIteratorSetting is2 = new ScanIteratorSetting(is1, "foo2", "org.bar.BarIter");
   scanner.addScanIterator(is2);
{code}
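
For anyone who wants to try the chaining idea without Accumulo on the classpath, the following is a rough, self-contained sketch. `IteratorSettingStub` is a made-up stand-in for the real `IteratorSetting`, reduced to the fields used here, and `ScanIteratorSettingSketch` is likewise hypothetical. The guard against chaining past -1 is an extra assumption, added so that a scan-time priority can never wrap into the table range.

```java
// Hypothetical stand-in for IteratorSetting; only what the sketch needs.
class IteratorSettingStub {
    private final int priority;
    private final String name;
    private final String iteratorClass;

    IteratorSettingStub(int priority, String name, String iteratorClass) {
        this.priority = priority;
        this.name = name;
        this.iteratorClass = iteratorClass;
    }

    public int getPriority() { return priority; }
    public String getName() { return name; }
    public String getIteratorClass() { return iteratorClass; }
}

// Hypothetical sketch of the ScanIteratorSetting idea from this comment.
class ScanIteratorSettingSketch extends IteratorSettingStub {

    // First scan-time iterator: the lowest negative priority.
    public ScanIteratorSettingSketch(String name, String iteratorClass) {
        super(Integer.MIN_VALUE, name, iteratorClass);
    }

    // Chained scan-time iterator: one step after its predecessor.
    public ScanIteratorSettingSketch(ScanIteratorSettingSketch predecessor,
                                     String name, String iteratorClass) {
        super(nextPriority(predecessor.getPriority()), name, iteratorClass);
    }

    // Assumed guard: refuse to step from -1 to 0, which would be a table priority.
    private static int nextPriority(int previous) {
        if (previous >= -1) {
            throw new IllegalStateException("no scan-time priorities left");
        }
        return previous + 1;
    }

    public static void main(String[] args) {
        ScanIteratorSettingSketch is1 =
            new ScanIteratorSettingSketch("foo1", "org.bar.FooIter");
        ScanIteratorSettingSketch is2 =
            new ScanIteratorSettingSketch(is1, "foo2", "org.bar.BarIter");
        // The user never types a negative number, yet both priorities are negative.
        System.out.println(is1.getName() + " -> " + is1.getPriority());
        System.out.println(is2.getName() + " -> " + is2.getPriority());
    }
}
```

The user only ever passes a name, a class, and optionally a predecessor; the negative numbers stay behind the scenes as described above.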

Can we make the code for chaining iterators more compact and intuitive?
                
> remove priority setting for scan-time iterators
> -----------------------------------------------
>
>                 Key: ACCUMULO-759
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-759
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Adam Fuchs
>              Labels: newbie
>
> Iterators have a priority setting that allows a user to order iterators arbitrarily.
> However that priority is an integer that doesn't directly convey the iterator's relationship
> to other iterators. I would postulate that nobody has ever needed to sneak in a scan-time
> iterator underneath a configured table iterator (please let me know if I'm wrong about this),
> and the effect of doing so is not easy to calculate. Many people have chosen a bad iterator
> priority and seen commutativity problems with previously configured iterators.
> I propose that we use more of an agglomerative approach to configuring scan-time iterators,
> in which the order of the iterator tree is the same order in which the addScanIterator method
> is called, and all scan-time iterators apply after the configured iterators apply. The change
> to the API should just be to remove the priority number, and the existing IteratorSetting
> constructor and accessors should be deprecated.
> With this change, we can think of an iterator as more of a functional modification to
> a data set, as in T' = f(T) or T'' = g(f(T)). This should make it easier for developers to
> use iterators correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
