lucene-dev mailing list archives

From Paul Elschot <>
Subject Re: Queries with only non required terms: not as OR?
Date Wed, 03 Mar 2004 20:00:30 GMT

On Wednesday 03 March 2004 18:47, Doug Cutting wrote:
> Paul Elschot wrote:
> > I read a bit into the source code and I found this comment at
> > BooleanQuery.scorer():
> >
> > // Also, at this point a
> > // BooleanScorer cannot be embedded in a ConjunctionScorer, as the hits
> > // from a BooleanScorer are not always sorted by document number (sigh)
> > // and hence BooleanScorer cannot implement skipTo() correctly, which is
> > // required by ConjunctionScorer.
> >
> > The test function I used assumes that documents will be collected in
> > order. Could this be the source of the problem?
> It could be.

I'll make the test search in the array of doc nrs that it receives now.
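
A minimal sketch of that change, assuming the test gathers the collected
doc numbers into an int array. The class and method names here are
hypothetical and not part of Lucene; the check sorts copies of both arrays
so that the order of collection no longer matters:

```java
import java.util.Arrays;

public class UnorderedHitCheck {
    // Hypothetical helper: true when the collected doc numbers are exactly
    // the expected ones, regardless of the order in which they arrived.
    static boolean sameDocs(int[] collected, int[] expected) {
        int[] a = collected.clone();   // copies, so the inputs stay untouched
        int[] b = expected.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        int[] expected = {1, 3, 5, 8};
        int[] collected = {3, 1, 8, 5};   // same hits, locally reordered
        System.out.println(sameDocs(collected, expected));   // prints "true"
    }
}
```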

> I only realized recently that BooleanScorer does some local reordering
> of document numbers passed to the HitCollector.  There's no easy fix.

I assume it works correctly, so why fix it, except for speed?
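
For reference, a rough sketch of why the quoted comment says that skipTo()
needs sorted doc numbers. DocStream and intersect() below are hypothetical
stand-ins, not Lucene's Scorer or ConjunctionScorer; they only mimic a
conjunction that leapfrogs between two streams with skipTo():

```java
import java.util.ArrayList;
import java.util.List;

public class SkipToSketch {
    // Hypothetical stand-in for a scorer: iterates over an array of doc numbers.
    static final class DocStream {
        private final int[] docs;
        private int pos = -1;
        DocStream(int[] docs) { this.docs = docs; }
        boolean next() { return ++pos < docs.length; }
        int doc() { return docs[pos]; }
        // Advances at least once, then stops at the first doc >= target.
        // This is only correct when docs is sorted in ascending order.
        boolean skipTo(int target) {
            do {
                if (!next()) return false;
            } while (doc() < target);
            return true;
        }
    }

    // Conjunction of two streams: the stream that is behind skips to the other.
    static List<Integer> intersect(int[] a, int[] b) {
        List<Integer> hits = new ArrayList<>();
        DocStream x = new DocStream(a), y = new DocStream(b);
        if (!x.next() || !y.next()) return hits;
        while (true) {
            if (x.doc() == y.doc()) {
                hits.add(x.doc());
                if (!x.next() || !y.next()) return hits;
            } else if (x.doc() < y.doc()) {
                if (!x.skipTo(y.doc())) return hits;
            } else {
                if (!y.skipTo(x.doc())) return hits;
            }
        }
    }

    public static void main(String[] args) {
        // Sorted input: both common docs are found.
        System.out.println(intersect(new int[]{1, 3, 5, 8}, new int[]{3, 4, 8, 9}));  // [3, 8]
        // Locally reordered input (5 before 3): doc 3 is skipped past and lost.
        System.out.println(intersect(new int[]{1, 5, 3, 8}, new int[]{3, 4, 8, 9}));  // [8]
    }
}
```

So a scorer that reorders locally can still deliver every hit to a
HitCollector at the top level, while the same stream nested in a
conjunction like this one can lose hits.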

> When I get a chance I intend to rewrite BooleanScorer to fix this and to
> correctly implement skipTo().  The result will be somewhat slower for

You might find the previously posted test code useful as a test case for
that. It's nice to see a possible real use for this :) even though I was
doing something wrong.

> some queries, especially those with a large number of optional terms,
> but will sometimes be faster when it's nested in other queries, and
> skipTo() can be leveraged.  I would like to get to this in next few

When the two cases can be distinguished, you might try to keep the current
method for queries with a large number of optional terms.
I like speed, and I guess I'm not the only one.
Also, with the term vectors in CVS one might expect more queries with optional
terms resulting from relevance feedback methods.

> weeks, and then make a 1.4 RC1 release.  The fix will take a few days
> work.  If I can find someone to fund the work it may happen sooner.
> Right now other projects have higher priority for me.

Lucene is moving fast enough for me...

Thanks a lot,
