lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4366) Small speedups for BooleanScorer
Date Fri, 14 Sep 2012 17:02:07 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455940#comment-13455940 ]

Michael McCandless commented on LUCENE-4366:
--------------------------------------------

I agree it'd be nice to not score MUST_NOT clauses for both BS and
BS2.  Really crazy that we do that!

Separately I think BS should handle MUST clauses in some cases...

The patch makes several optimizations:

  * When collecting the first clause in each chunk, don't bother
    checking whether the bucket is stale, since it always will be
    (saves an if per hit).  I also sort clauses by smallest first
    docID (a proxy for highest docFreq) so that this saved if-per-hit
    has the most impact.

  * Use int[] instead of linked list to record filled buckets.

  * Don't call .score for prohibited hits.

  * Don't enroll prohibited hits into the "live" buckets list.

  * Don't call .score for a hit that was already prohibited due to a
    previous clause (I sort prohibited clauses first for this
    reason).

  * Use "boolean prohibited" instead of int bitmask (not sure this
    matters but it was confusing to use a bitmask).
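
A minimal sketch of how the bucket-related items above might fit
together, assuming a small power-of-two chunk and prohibited clauses
being collected before scoring clauses.  This is not the code in the
attached patch: the class name, the field names, the LazyScore
interface and the chunk size are all hypothetical, and the
first-clause (no stale check) optimization is left out.

```java
/** Hypothetical illustration only; not Lucene's BooleanScorer. */
class BucketTableSketch {
    /** Assumed stand-in for a scorer, so scoring can be deferred or skipped. */
    interface LazyScore { float score(); }

    static final int SIZE = 8;          // assumed chunk size, power of two
    static final int MASK = SIZE - 1;

    final int[] doc = new int[SIZE];                // docID held by each slot
    final float[] score = new float[SIZE];          // summed score per slot
    final boolean[] prohibited = new boolean[SIZE]; // plain flag, no bitmask
    final int[] live = new int[SIZE];   // filled slots: int[] instead of a linked list
    int liveCount;

    BucketTableSketch() {
        java.util.Arrays.fill(doc, -1); // no slot holds a real docID yet
    }

    /** Begin a new chunk; stale slots are detected later by docID mismatch. */
    void startChunk() {
        liveCount = 0;
    }

    /** Hit from a MUST_NOT clause: never scored, never added to the live list. */
    void collectProhibited(int docID) {
        int i = docID & MASK;
        doc[i] = docID;
        prohibited[i] = true;
    }

    /** Hit from a scoring clause; the score is computed only if it will be used. */
    void collect(int docID, LazyScore scorer) {
        int i = docID & MASK;
        if (doc[i] != docID) {
            // Stale slot: reuse it for this doc and enroll it as live.
            doc[i] = docID;
            prohibited[i] = false;
            score[i] = scorer.score();
            live[liveCount++] = i;
        } else if (!prohibited[i]) {
            // Same doc, matched earlier by another scoring clause.
            score[i] += scorer.score();
        }
        // Same doc but already prohibited: skip without calling score().
    }
}
```

When the chunk is flushed, only live[0..liveCount) would need to be
visited, and prohibited docs never appear there.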

I suspect the saved if per collect accounts for most of the gains (the
results above didn't test MUST_NOT clauses ... still TODO).  But ... I'm
getting erratic results when performance testing (and can't reproduce
the above results with my current patch...).  Not sure what's up...

                
> Small speedups for BooleanScorer
> --------------------------------
>
>                 Key: LUCENE-4366
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4366
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-4366.patch, LUCENE-4366.patch
>
>



