lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4571) speedup disjunction with minShouldMatch
Date Fri, 22 Mar 2013 14:33:15 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610330#comment-13610330 ]

Robert Muir commented on LUCENE-4571:
-------------------------------------

Here are my performance results.
{noformat}
                    Task   QPS trunk      StdDev   QPS patch      StdDev                Pct diff
                PKLookup      251.82      (1.3%)      239.58      (2.5%)   -4.9% (  -8% -   -1%)
     Low4MinShouldMatch0        8.34      (6.0%)        8.62      (6.8%)    3.3% (  -8% -   17%)
     Low1MinShouldMatch0        1.90      (3.9%)        1.99      (6.0%)    4.8% (  -4% -   15%)
     Low2MinShouldMatch0        2.41      (4.3%)        2.53      (6.3%)    4.8% (  -5% -   16%)
     Low3MinShouldMatch0        3.94      (4.4%)        4.13      (6.3%)    5.0% (  -5% -   16%)
     HighMinShouldMatch0        1.61      (3.7%)        1.70      (6.0%)    5.1% (  -4% -   15%)
     HighMinShouldMatch2        1.68      (4.1%)        1.94      (5.0%)   15.6% (   6% -   25%)
     Low1MinShouldMatch2        1.99      (4.4%)        2.39      (5.4%)   19.8% (   9% -   30%)
     Low2MinShouldMatch2        2.57      (5.0%)        3.27      (6.4%)   27.2% (  15% -   40%)
     HighMinShouldMatch3        1.71      (4.3%)        2.48      (5.6%)   45.0% (  33% -   57%)
     Low3MinShouldMatch2        4.24      (5.2%)        6.21      (8.4%)   46.3% (  31% -   63%)
     Low1MinShouldMatch3        2.05      (4.6%)        3.25      (6.9%)   58.9% (  45% -   73%)
     HighMinShouldMatch4        1.74      (4.5%)        3.52      (8.2%)  102.2% (  85% -  120%)
     Low2MinShouldMatch3        2.67      (5.4%)        5.40     (10.7%)  102.3% (  81% -  125%)
     Low1MinShouldMatch4        2.08      (4.9%)        5.74     (14.2%)  175.3% ( 148% -  204%)
     Low4MinShouldMatch2       10.18      (8.7%)       48.51     (16.3%)  376.5% ( 323% -  439%)
     Low3MinShouldMatch3        4.50      (5.9%)       40.66     (26.7%)  804.3% ( 728% -  889%)
     Low4MinShouldMatch3       10.22      (8.7%)      151.37     (62.2%) 1380.9% (1204% - 1590%)
     Low2MinShouldMatch4        2.71      (5.6%)       45.36     (43.9%) 1574.3% (1443% - 1720%)
     Low4MinShouldMatch4       10.22      (8.7%)      222.35     (93.0%) 2075.6% (1816% - 2383%)
     Low3MinShouldMatch4        4.50      (5.8%)      206.42    (197.7%) 4486.1% (4047% - 4979%)
{noformat}
                
> speedup disjunction with minShouldMatch 
> ----------------------------------------
>
>                 Key: LUCENE-4571
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4571
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 4.1
>            Reporter: Mikhail Khludnev
>         Attachments: LUCENE-4571.patch, LUCENE-4571.patch, LUCENE-4571.patch, LUCENE-4571.patch, LUCENE-4571.patch, LUCENE-4571.patch
>
>
> Even when minShouldMatch is supplied to DisjunctionSumScorer, it enumerates the whole disjunction and verifies the minShouldMatch condition [on every doc|https://github.com/apache/lucene-solr/blob/trunk/lucene/core/src/java/org/apache/lucene/search/DisjunctionSumScorer.java#L70]:
> {code}
>   public int nextDoc() throws IOException {
>     assert doc != NO_MORE_DOCS;
>     while(true) {
>       while (subScorers[0].docID() == doc) {
>         if (subScorers[0].nextDoc() != NO_MORE_DOCS) {
>           heapAdjust(0);
>         } else {
>           heapRemoveRoot();
>           if (numScorers < minimumNrMatchers) {
>             return doc = NO_MORE_DOCS;
>           }
>         }
>       }
>       afterNext();
>       if (nrMatchers >= minimumNrMatchers) {
>         break;
>       }
>     }
>     
>     return doc;
>   }
> {code}
> [~spo] proposes (as far as I understand it) to pop nrMatchers-1 scorers from the heap first, and then push them back after advancing them beyond that top doc. For me, question no. 1 is: is there a performance test for minShouldMatch-constrained disjunctions?
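
For illustration only, the pop-and-advance idea described above might look roughly like the sketch below. It is not the attached patch and not Lucene code: the `Cursor` class, the `match` and `matchNaive` methods, and the use of plain sorted `int[]` posting lists with `java.util.PriorityQueue` are all assumptions made to keep the example self-contained. `matchNaive` counts every doc in every list (the behaviour the issue complains about), while `match` pops minShouldMatch-1 cursors, takes the new heap top as the only possible next candidate, and advances the popped cursors straight to it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

final class MinShouldMatchSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a scorer: a cursor over a sorted, duplicate-free doc ID array. */
    static final class Cursor {
        private final int[] docs;
        private int pos = 0;
        Cursor(int[] docs) { this.docs = docs; }
        int docID() { return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }
        int nextDoc() { pos++; return docID(); }
        int advance(int target) {
            // Linear scan for brevity; a real postings list would use skip data here.
            while (docID() < target) pos++;
            return docID();
        }
    }

    /** Baseline: visit every doc of every list and count how many lists contain it. */
    static List<Integer> matchNaive(int[][] postings, int minShouldMatch) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int[] list : postings) {
            for (int doc : list) counts.merge(doc, 1, Integer::sum);
        }
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() >= Math.max(1, minShouldMatch)) hits.add(e.getKey());
        }
        return hits;
    }

    /** Pop minShouldMatch-1 cursors, then advance them to the doc now on top of the heap. */
    static List<Integer> match(int[][] postings, int minShouldMatch) {
        int required = Math.max(1, minShouldMatch); // assumption: 0 behaves like 1
        PriorityQueue<Cursor> heap = new PriorityQueue<>(Comparator.comparingInt(Cursor::docID));
        for (int[] list : postings) {
            Cursor c = new Cursor(list);
            if (c.docID() != NO_MORE_DOCS) heap.add(c);
        }
        List<Integer> hits = new ArrayList<>();
        List<Cursor> popped = new ArrayList<>();
        while (heap.size() >= required) {
            // The 'required'-th smallest current doc is the earliest doc that can still match.
            popped.clear();
            for (int i = 0; i < required - 1; i++) popped.add(heap.poll());
            int candidate = heap.peek().docID();
            for (Cursor c : popped) {
                if (c.advance(candidate) != NO_MORE_DOCS) heap.add(c); // skip docs before candidate
            }
            // Every live cursor is now at or beyond candidate; count those exactly on it.
            popped.clear();
            while (!heap.isEmpty() && heap.peek().docID() == candidate) popped.add(heap.poll());
            if (popped.size() >= required) hits.add(candidate);
            for (Cursor c : popped) {
                if (c.nextDoc() != NO_MORE_DOCS) heap.add(c);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[][] postings = {{1, 3, 5, 7, 9}, {3, 4, 5, 10}, {2, 3, 9, 10}, {5, 9, 11}};
        System.out.println(match(postings, 3)); // prints [3, 5, 9]
    }
}
```

Cursors are only mutated while they are outside the heap, so the queue's ordering stays valid. With a higher minShouldMatch, more cursors are advanced directly to the candidate rather than stepped one doc at a time, which is the kind of saving the proposal is after; how much that helps in practice depends on how cheap `advance` is for the real postings format.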

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

