lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-2686) DisjunctionSumScorer should not call .score on sub scorers until consumer calls .score
Date Mon, 11 Oct 2010 15:03:34 GMT


Michael McCandless updated LUCENE-2686:

    Attachment: LUCENE-2686.patch

New patch attached.

I added Koji's test as a unit test; it fails on trunk but passes
with the patch.

The new scorer is definitely slower if you do want scoring. However,
it's actually uncommon for this scorer to be used, because BQ
(BooleanQuery) will use BS (BooleanScorer) when the query is all
SHOULD clauses (plus up to 32 NOT). This scorer is used only if there
is also at least one MUST clause, or if the collector does not support
out-of-order scoring.  I had to hack BQ to temporarily turn off BS to
test this.
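
To make the selection rule above concrete, here is a minimal sketch. It
encodes only the conditions stated in this message; the class and method
names are hypothetical and this is not Lucene's actual code.

```java
// Hypothetical helper, not part of Lucene. It restates the rule from the
// message: BS is picked only when there are no MUST clauses, at most 32
// NOT clauses, and the collector accepts out-of-order docs.
public class ScorerChoice {
    static final int MAX_NOT_CLAUSES = 32; // "plus up to 32 NOT"

    /** Returns true when, per the description above, BQ would use BS. */
    static boolean usesBooleanScorer(int mustCount, int notCount,
                                     boolean collectorAcceptsOutOfOrder) {
        return mustCount == 0
            && notCount <= MAX_NOT_CLAUSES
            && collectorAcceptsOutOfOrder;
    }

    public static void main(String[] args) {
        System.out.println(usesBooleanScorer(0, 2, true));  // true
        System.out.println(usesBooleanScorer(1, 0, true));  // false: has a MUST clause
        System.out.println(usesBooleanScorer(0, 0, false)); // false: in-order collector
    }
}
```

Whenever this returns false, the in-order path (and with it
DisjunctionSumScorer for the SHOULD clauses) is the one that runs.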

Insanely, the constant score BQ rewrite does use this scorer, because
when ConstantScorer invokes the sub-scorer it requires in-order
scoring.  So ConstantScoreQuery and constant score BQ rewrite for the
MTQs will always see the speedup from this patch.

So given that it's rare to actually use the scorer, I think the ~8% gain
seen in "default" usage makes the patch worthwhile on net.

Longer term... we should probably change the Weight.scorer API so that
the caller tells it up-front whether scoring is needed.  This way
we can specialize (manually or automatically) and pick the best class,
instead of waiting to see whether the .score() method is invoked per hit.
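
A rough sketch of what such an up-front hint could look like. Everything
here is hypothetical (the needsScores flag, the ToyScorer interface, the
class names); it is a toy model of the idea, not a proposal for the real
Weight.scorer signature.

```java
import java.util.Arrays;
import java.util.List;

public class NeedsScoresSketch {
    /** Toy scorer interface; far simpler than Lucene's Scorer. */
    interface ToyScorer {
        boolean matches(int doc);
        float score(int doc);
    }

    /** Match-only specialization: never calls score() on the subs. */
    static final class MatchOnlyDisjunction implements ToyScorer {
        private final List<ToyScorer> subs;
        MatchOnlyDisjunction(List<ToyScorer> subs) { this.subs = subs; }
        public boolean matches(int doc) {
            for (ToyScorer s : subs) if (s.matches(doc)) return true;
            return false;
        }
        public float score(int doc) {
            throw new UnsupportedOperationException("scores were not requested");
        }
    }

    /** Scoring specialization: sums the scores of the matching subs. */
    static final class ScoringDisjunction implements ToyScorer {
        private final List<ToyScorer> subs;
        ScoringDisjunction(List<ToyScorer> subs) { this.subs = subs; }
        public boolean matches(int doc) {
            for (ToyScorer s : subs) if (s.matches(doc)) return true;
            return false;
        }
        public float score(int doc) {
            float sum = 0f;
            for (ToyScorer s : subs) if (s.matches(doc)) sum += s.score(doc);
            return sum;
        }
    }

    /** Hypothetical factory: the caller states up-front whether it needs scores. */
    static ToyScorer scorer(List<ToyScorer> subs, boolean needsScores) {
        return needsScores ? new ScoringDisjunction(subs) : new MatchOnlyDisjunction(subs);
    }

    /** Toy leaf scorer that matches the given docs with a fixed weight. */
    static ToyScorer term(final float weight, final int... docs) {
        return new ToyScorer() {
            public boolean matches(int doc) {
                for (int d : docs) if (d == doc) return true;
                return false;
            }
            public float score(int doc) { return weight; }
        };
    }

    public static void main(String[] args) {
        List<ToyScorer> subs = Arrays.asList(term(1.5f, 1, 3), term(2.0f, 3, 5));
        System.out.println(scorer(subs, true).score(3));    // 3.5
        System.out.println(scorer(subs, false).matches(5)); // true
    }
}
```

The point of the sketch is that the choice of class happens once, when the
scorer is created, rather than being discovered per hit.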

> DisjunctionSumScorer should not call .score on sub scorers until consumer calls .score
> --------------------------------------------------------------------------------------
>                 Key: LUCENE-2686
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 3.1, 4.0
>         Attachments: LUCENE-2686.patch, LUCENE-2686.patch,
> Spinoff from java-user thread "question about Scorer.freq()" from Koji...
> BooleanScorer2 uses DisjunctionSumScorer to score only-SHOULD-clause boolean queries.
> But, this scorer does too much work for collectors that never call .score, because it
> scores while it's matching.  It should only call .score on the subs when the caller calls
> its .score.
> This also has the side effect of messing up advanced collectors that gather the freq()
> of the subs (using LUCENE-2590).
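
The behavior the issue asks for can be illustrated with a toy model. This
is a sketch under simplifying assumptions, not the attached patch: ToySub
and LazyDisjunction are made-up names, and a per-sub call counter is added
only to show when score() is actually invoked.

```java
import java.util.Arrays;
import java.util.List;

public class LazyDisjunctionSketch {
    /** Toy sub-scorer over a sorted doc-id array; counts score() calls. */
    static final class ToySub {
        final int[] docs;      // must be sorted ascending for binarySearch
        final float weight;
        int scoreCalls = 0;
        ToySub(float weight, int... docs) { this.weight = weight; this.docs = docs; }
        boolean matches(int doc) { return Arrays.binarySearch(docs, doc) >= 0; }
        float score() { scoreCalls++; return weight; }
    }

    /** Disjunction that matches without scoring and sums scores only on demand. */
    static final class LazyDisjunction {
        private final List<ToySub> subs;
        private int currentDoc = -1;
        LazyDisjunction(List<ToySub> subs) { this.subs = subs; }

        /** Positions on doc and reports whether any sub matches; no scoring here. */
        boolean matches(int doc) {
            currentDoc = doc;
            for (ToySub s : subs) if (s.matches(doc)) return true;
            return false;
        }

        /** Sums the matching subs' scores for the current doc, only when asked. */
        float score() {
            float sum = 0f;
            for (ToySub s : subs) if (s.matches(currentDoc)) sum += s.score();
            return sum;
        }
    }

    public static void main(String[] args) {
        ToySub a = new ToySub(1.0f, 1, 2, 4);
        ToySub b = new ToySub(0.5f, 2, 3);
        LazyDisjunction d = new LazyDisjunction(Arrays.asList(a, b));

        int hits = 0;
        for (int doc = 0; doc <= 4; doc++) if (d.matches(doc)) hits++;
        System.out.println(hits + " hits, " + (a.scoreCalls + b.scoreCalls) + " score calls"); // 4 hits, 0 score calls

        d.matches(2);
        System.out.println(d.score()); // 1.5
    }
}
```

A collector that only counts hits never pays for sub-scorer scoring in this
model; the cost is incurred only when score() is called on the disjunction.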

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.