lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1252) Avoid using positions when not all required terms are present
Date Thu, 28 May 2009 09:50:45 GMT


Michael McCandless commented on LUCENE-1252:

I agree, we would want to do a ConjunctionScorer first, w/ all 4 terms, and then 2nd a PhraseScorer
for each of the two phrases, but somehow they should be bound together such that a single
TermPositions enumerator is shared in the two places for each term.

I think doing the additional filtering in collect is a little late -- there could be a number
of such "more expensive constraints" to apply, depending on the query.

But, eg since we're talking about how to fix the up-front sort logic in ConjunctionScorer...
you could imagine asking the PhraseQuery for its scorer, and getting back 2 AND'd scorers
(cheap & expensive) that under-the-hood are sharing a single TermPositions enum, and then
ConjunctionScorer would order all such scorers it got so that all cheap ones are checked first
and only once they agree on a doc are the expensive scorers checked.
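
To make that concrete, here is a rough sketch in plain Java. It is not Lucene code: Postings,
DocCheck, PhraseParts and search are hypothetical names standing in for a shared TermPositions
enum, a scorer, the cheap/expensive pair a PhraseQuery might hand back, and the conjunction
loop. It assumes two-term phrases only and walks every doc id rather than leapfrogging with
skipTo(), which a real ConjunctionScorer would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CheapFirstConjunctionSketch {

    /** Hypothetical stand-in for one term's shared TermPositions enum. */
    static final class Postings {
        final int[] docs;         // sorted doc ids that contain the term
        final int[][] positions;  // positions[i] = sorted positions of the term in docs[i]
        int positionReads = 0;    // how many times positions were fetched

        Postings(int[] docs, int[][] positions) {
            this.docs = docs;
            this.positions = positions;
        }

        boolean hasDoc(int doc) {
            return Arrays.binarySearch(docs, doc) >= 0;
        }

        /** Only valid for docs that contain the term. */
        int[] positionsFor(int doc) {
            positionReads++;
            return positions[Arrays.binarySearch(docs, doc)];
        }
    }

    /** Hypothetical minimal "scorer": does this doc pass the constraint? */
    interface DocCheck {
        boolean matches(int doc);
    }

    /** A two-term phrase split into a cheap and an expensive check over the same postings. */
    static final class PhraseParts {
        final DocCheck cheap;      // both terms present, no positions touched
        final DocCheck expensive;  // second term directly follows the first

        PhraseParts(Postings first, Postings second) {
            cheap = doc -> first.hasDoc(doc) && second.hasDoc(doc);
            expensive = doc -> {
                int[] a = first.positionsFor(doc);
                int[] b = second.positionsFor(doc);
                for (int p : a) {
                    if (Arrays.binarySearch(b, p + 1) >= 0) return true;
                }
                return false;
            };
        }
    }

    /** Conjunction that runs every cheap check before any expensive one. */
    static List<Integer> search(int maxDoc, List<PhraseParts> phrases) {
        List<Integer> hits = new ArrayList<>();
        nextDoc:
        for (int doc = 0; doc < maxDoc; doc++) {
            for (PhraseParts p : phrases) {
                if (!p.cheap.matches(doc)) continue nextDoc;
            }
            for (PhraseParts p : phrases) {
                if (!p.expensive.matches(doc)) continue nextDoc;
            }
            hits.add(doc);
        }
        return hits;
    }

    /** Made-up index of 4 docs for the query "a b" AND "c d". */
    static String demo() {
        Postings a = new Postings(new int[] {0, 1, 2}, new int[][] {{0}, {0}, {5}});
        Postings b = new Postings(new int[] {0, 1, 2, 3}, new int[][] {{1}, {3}, {6}, {0}});
        Postings c = new Postings(new int[] {0, 1}, new int[][] {{4}, {5}});
        Postings d = new Postings(new int[] {0, 1, 3}, new int[][] {{5}, {6}, {2}});
        List<PhraseParts> phrases = Arrays.asList(new PhraseParts(a, b), new PhraseParts(c, d));
        List<Integer> hits = search(4, phrases);
        int reads = a.positionReads + b.positionReads + c.positionReads + d.positionReads;
        return "hits=" + hits + " positionReads=" + reads;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "hits=[0] positionReads=6"
    }
}
```

In the made-up data, doc 2 contains "a b" as an adjacent pair but has no "c", so the cheap checks
reject it and its positions are never read. A phrase-at-a-time evaluation would have read
positions for "a" and "b" on that doc before finding out that "c" is missing.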

> Avoid using positions when not all required terms are present
> -------------------------------------------------------------
>                 Key: LUCENE-1252
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Wish
>          Components: Search
>            Reporter: Paul Elschot
>            Priority: Minor
> In the Scorers of queries with (lots of) Phrases and/or (nested) Spans, currently next()
> and skipTo() will use position information even when other parts of the query cannot match
> because some required terms are not present.
> This could be avoided by adding some methods to Scorer that relax the postcondition of
> next() and skipTo() to something like "all required terms are present, but no position info
> was checked yet", and implementing these methods for Scorers that do conjunctions: BooleanScorer,
> PhraseScorer, and SpanScorer/NearSpans.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.