lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4366) Small speedups for BooleanScorer
Date Fri, 14 Sep 2012 17:02:07 GMT


Michael McCandless commented on LUCENE-4366:

I agree it'd be nice to not score MUST_NOT clauses for both BS and
BS2.  Really crazy that we do that!

Separately I think BS should handle MUST clauses in some cases...

There are several optimizations:

  * When collecting first clause per-chunk, don't bother checking
    whether the bucket is stale since it will always be (saves an if
    per hit).  I also sort by smallest first-docID first (proxy for
    highest docFreq) so this saved-if-per-hit has the most impact.

  * Use int[] instead of linked list to record filled buckets.

  * Don't call .score for prohibited hits.

  * Don't enroll prohibited hits into the "live" buckets list.

  * Don't call .score for a hit that was already prohibited due to a
    previous clause (I sort prohibited clauses first for this reason).

  * Use "boolean prohibited" instead of int bitmask (not sure this
    matters but it was confusing to use a bitmask).
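
To make the list above concrete, here is a rough sketch of how a chunked
bucket table could apply these ideas.  It is not the code from the attached
patch and not Lucene's actual BooleanScorer: the class BucketTableSketch, its
methods, and the 2048-doc chunk size are all hypothetical names and values
chosen for illustration.  It assumes the caller visits prohibited clauses
before scoring clauses within each chunk, and only uses collectFirst for the
first clause of a chunk when no prohibited clause has touched the table.

```java
import java.util.Arrays;
import java.util.function.DoubleSupplier;

/** Hypothetical, simplified bucket table; an illustration, not Lucene code. */
final class BucketTableSketch {
    static final int SIZE = 2048;      // docs per chunk (assumed value)
    static final int MASK = SIZE - 1;

    final int[] doc = new int[SIZE];                // docID held by each slot
    final double[] score = new double[SIZE];        // summed score per slot
    final boolean[] prohibited = new boolean[SIZE]; // plain flag, no bitmask
    final int[] filled = new int[SIZE];             // live slots, int[] not linked list
    int filledCount;

    BucketTableSketch() {
        Arrays.fill(doc, -1);          // every slot starts out stale
    }

    /** A MUST_NOT hit: never scored, never enrolled in the live list. */
    void prohibit(int docID) {
        int slot = docID & MASK;
        doc[slot] = docID;
        prohibited[slot] = true;
    }

    /** First clause of a chunk: every slot is stale, so skip the staleness check. */
    void collectFirst(int docID, DoubleSupplier scorer) {
        int slot = docID & MASK;
        doc[slot] = docID;
        prohibited[slot] = false;
        score[slot] = scorer.getAsDouble();
        filled[filledCount++] = slot;
    }

    /** Any later clause: check staleness, and do not score prohibited hits. */
    void collect(int docID, DoubleSupplier scorer) {
        int slot = docID & MASK;
        if (doc[slot] != docID) {              // stale slot: reset and enroll
            doc[slot] = docID;
            prohibited[slot] = false;
            score[slot] = scorer.getAsDouble();
            filled[filledCount++] = slot;
        } else if (!prohibited[slot]) {        // live slot: accumulate
            score[slot] += scorer.getAsDouble();
        }                                      // prohibited: scorer is never called
    }

    double scoreOf(int docID) {
        return score[docID & MASK];
    }

    /** Returns the non-prohibited docs of this chunk and resets the live list. */
    int[] drain() {
        int[] out = new int[filledCount];
        int n = 0;
        for (int i = 0; i < filledCount; i++) {
            int slot = filled[i];
            if (!prohibited[slot]) {
                out[n++] = doc[slot];
            }
        }
        filledCount = 0;
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        BucketTableSketch table = new BucketTableSketch();
        table.prohibit(5);                     // prohibited clause goes first
        table.collect(5, () -> 1.0);           // skipped without scoring
        table.collect(7, () -> 1.0);
        System.out.println(Arrays.toString(table.drain()));
    }
}
```

Passing the score as a supplier is just one way to show that a prohibited
hit can be rejected before .score is ever called; a real scorer would more
likely call its sub-scorer directly inside the branch.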

I suspect the if saved per collected hit accounts for most of the gains
(the results above didn't test MUST_NOT clauses ... still TODO).  But ... I'm getting
erratic results when performance testing (and can't reproduce the above
results on my current patch...).  Not sure what's up...

> Small speedups for BooleanScorer
> --------------------------------
>                 Key: LUCENE-4366
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-4366.patch, LUCENE-4366.patch
