lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Created] (LUCENE-4418) Improve RecursivePrefixTreeFilter's performance heuristic tunables
Date Sun, 23 Sep 2012 15:51:07 GMT
David Smiley created LUCENE-4418:

             Summary: Improve RecursivePrefixTreeFilter's performance heuristic tunables
                 Key: LUCENE-4418
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/spatial
            Reporter: David Smiley
            Assignee: David Smiley
            Priority: Minor

RecursivePrefixTreeFilter recursively decomposes grid cells until it reaches a threshold grid
level (e.g. 4 away from max levels), at which point it switches to a brute force scan, because
scanning is faster once the number of terms is small enough.  So if max levels is 10 and the
threshold is 4, it will switch to scanning at level 6.  Ideally, the filter would know exactly
how many terms there are in that grid cell -- i.e. given a hi & lo term, determine how many
indexed terms are in between without actually iterating to find out.
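
To make the level arithmetic above concrete, here is a minimal sketch.  The class and method
names are hypothetical and are not taken from the filter's source; it only shows the fixed
"N levels away from max" rule as described.

```java
// Illustrative only: names are hypothetical, not the filter's actual fields or methods.
final class ScanLevelSketch {

    // Level at which recursion stops and scanning starts,
    // e.g. maxLevels = 10, levelsFromMax = 4 gives 6.
    static int scanLevel(int maxLevels, int levelsFromMax) {
        if (levelsFromMax < 0 || levelsFromMax > maxLevels) {
            throw new IllegalArgumentException("levelsFromMax must be in [0, maxLevels]");
        }
        return maxLevels - levelsFromMax;
    }

    // True once a cell is at or below the scan level, i.e. deep enough to scan.
    static boolean shouldScan(int cellLevel, int maxLevels, int levelsFromMax) {
        return cellLevel >= scanLevel(maxLevels, levelsFromMax);
    }
}
```

Note that this rule depends only on the cell's depth, not on how many terms are actually
indexed under it.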

Instead, it could use the # of docs that a grid cell has as a heuristic.  I think it's much
better because it's dynamic, based on the density of actual indexed data.  It's not perfect,
though, because many documents could refer to the same indexed point, or a few documents
with multi-valued data could refer to many indexed points.
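
A rough sketch of the proposed heuristic and its two failure modes follows.  It uses a toy
in-memory map in place of a real index; the class, its methods, the token strings and the
threshold value are all assumptions made for illustration, not Lucene API.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy stand-in for an index: leaf cell token -> ids of docs indexed with that token.
// Everything here is hypothetical and only illustrates the heuristic.
final class DocCountHeuristicSketch {
    private final TreeMap<String, Set<Integer>> postings = new TreeMap<>();

    void add(int docId, String leafToken) {
        postings.computeIfAbsent(leafToken, k -> new HashSet<>()).add(docId);
    }

    // All terms whose token starts with the cell's prefix.
    // Assumes tokens never contain the character '\uffff'.
    private SortedMap<String, Set<Integer>> under(String cellPrefix) {
        return postings.subMap(cellPrefix, cellPrefix + '\uffff');
    }

    // What the filter would ideally know: number of indexed terms in the cell.
    int termCount(String cellPrefix) {
        return under(cellPrefix).size();
    }

    // The proposed proxy: number of distinct docs in the cell.
    int docCount(String cellPrefix) {
        Set<Integer> docs = new HashSet<>();
        for (Set<Integer> ids : under(cellPrefix).values()) {
            docs.addAll(ids);
        }
        return docs.size();
    }

    // Switch to scanning when the cell looks sparse enough by doc count.
    boolean shouldScan(String cellPrefix, int docThreshold) {
        return docCount(cellPrefix) <= docThreshold;
    }
}
```

With 100 docs all indexed at the single point "ABCD", cell "AB" has a doc count of 100 but
only one term, so a threshold of 10 would keep recursing even though a scan would be trivial.
Conversely, one doc with 50 distinct points under "XY" has a doc count of 1 but 50 terms, so
the same threshold would trigger a scan over more terms than the doc count suggests.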

Before I do this, I need to re-invigorate my testing efforts so I can come up with a default
threshold.  The right threshold also depends on things like query shape complexity.
