lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: ConjunctionScorer.doNext() overstays?
Date Thu, 01 Mar 2012 13:31:57 GMT
Hmm, the tradeoff is an added per-hit check (doc != NO_MORE_DOCS), vs
the one-time cost at the end of calling advance(NO_MORE_DOCS) for each
sub-clause?  I think in general this isn't a good tradeoff?

I.e., what about the case where we AND together high-freq, similarly
freq'd terms?  Then the per-hit check will at some point dominate?

It's valid to pass NO_MORE_DOCS to DocsEnum.advance.
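
To make the tradeoff concrete, here is a minimal sketch, not the actual Lucene source. It assumes a toy in-memory iterator (ArrayDocs, a hypothetical stand-in for a sub-scorer) and a doNext() loop modelled on the one quoted below, with the extra (doc != NO_MORE_DOCS) check switchable through a hypothetical earlyExit flag. It counts how many advance(NO_MORE_DOCS) calls the sub-clauses receive in each variant:

```java
import java.util.Arrays;
import java.util.Comparator;

final class ConjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Toy stand-in for a sub-scorer: iterates a sorted array of doc IDs. */
    static final class ArrayDocs {
        private final int[] docs;
        private int pos = -1;
        int advanceToEndCalls = 0; // advance() calls whose target was NO_MORE_DOCS

        ArrayDocs(int[] docs) { this.docs = docs; }

        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() {
            if (pos < docs.length) pos++;
            return docID();
        }

        /** Moves to the first doc after the current one whose ID is >= target. */
        int advance(int target) {
            if (target == NO_MORE_DOCS) advanceToEndCalls++;
            if (pos < docs.length) pos++;
            while (pos < docs.length && docs[pos] < target) pos++;
            return docID();
        }
    }

    /** Round-robin leapfrog; earlyExit adds the per-iteration NO_MORE_DOCS check. */
    static int doNext(ArrayDocs[] scorers, boolean earlyExit) {
        int first = 0;
        int doc = scorers[scorers.length - 1].docID();
        ArrayDocs firstScorer;
        while ((!earlyExit || doc != NO_MORE_DOCS)
                && (firstScorer = scorers[first]).docID() < doc) {
            doc = firstScorer.advance(doc);
            first = first == scorers.length - 1 ? 0 : first + 1;
        }
        return doc;
    }

    /** Returns {number of matching docs, total advance(NO_MORE_DOCS) calls}. */
    static int[] run(int[][] postings, boolean earlyExit) {
        ArrayDocs[] scorers = new ArrayDocs[postings.length];
        for (int i = 0; i < postings.length; i++) {
            scorers[i] = new ArrayDocs(postings[i]);
            scorers[i].nextDoc(); // position every clause on its first doc
        }
        // doNext() expects the last scorer to hold the largest current doc ID.
        Arrays.sort(scorers, Comparator.comparingInt(ArrayDocs::docID));

        int matches = 0;
        int doc = doNext(scorers, earlyExit);
        while (doc != NO_MORE_DOCS) {
            matches++;
            scorers[scorers.length - 1].nextDoc();
            doc = doNext(scorers, earlyExit);
        }
        int wasted = 0;
        for (ArrayDocs s : scorers) wasted += s.advanceToEndCalls;
        return new int[] { matches, wasted };
    }
}
```

Both variants find the same hits. Without the check, the other sub-clauses each get one final advance(NO_MORE_DOCS) call; with it, the loop condition carries one extra comparison on every iteration. The sketch only counts calls and says nothing about which cost wins on a real index.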

Mike McCandless

On Thu, Mar 1, 2012 at 7:22 AM, mark harwood <> wrote:
> I got round to some benchmarking of this change on Wikipedia content which shows a small
> Aside from the small performance gain to be had, it just feels more logical
> if ConjunctionScorer does not issue sub scorers with a request to advance
> to "NO_MORE_DOCS".
> ----- Original Message -----
> From: mark harwood <>
> To: "" <>
> Cc:
> Sent: Thursday, 1 March 2012, 9:39
> Subject: ConjunctionScorer.doNext() overstays?
> Due to the odd behaviour of a custom Scorer of mine I discovered
> ConjunctionScorer.doNext() could loop indefinitely.
> It does not bail out as soon as any scorer.advance() call it makes reports
> back "NO_MORE_DOCS". Is there not a performance optimisation to be gained in
> exiting as soon as this happens?
> At this stage I cannot see any point in continuing to advance other scorers -
> a quick look at TermScorer suggests that any questionable calls made by
> ConjunctionScorer to advance to NO_MORE_DOCS receive no special treatment,
> and disk will be hit as a consequence.
> I added an extra condition to the while loop on the 3.5 source:
>     while ((doc != NO_MORE_DOCS) && ((firstScorer = scorers[first]).docID() < doc)) {
> and JUnit tests passed. I haven't been able to benchmark performance
> improvements but it looks like it would be sensible to make the change anyway.
> Cheers,
> Mark