lucene-java-user mailing list archives

From Ion Barcan <ion.bar...@gmail.com>
Subject PhraseQuery in BooleanQuery not working properly in 2.9.0
Date Tue, 13 Oct 2009 13:23:22 GMT
Hello,

With the new Lucene 2.9.0 (on a newly built index of approx. 30
million documents), BooleanQueries that contain a PhraseQuery return
too few hits in some cases. I've verified this on both optimized and
unoptimized versions of the index.

For example:

lucli> count field1:"john doe"
Searching for: field1:"john doe"
496 total documents

lucli> count +(field1:"john doe")
Searching for: +field1:"john doe"
496 total documents

lucli> count +(field1:"john doe" field1:"john doe")
Searching for: +(field1:"john doe" field1:"john doe")
5 total documents

lucli> count +(+field1:"john doe" field1:"john doe")
Searching for: +(+field1:"john doe" field1:"john doe")
496 total documents

lucli> count +(field1:"john doe" field2:UnmatchedValue)
Searching for: +(field1:"john doe" field2:UnmatchedValue)
5 total documents

lucli> count +(+field1:"john doe" field2:UnmatchedValue)
Searching for: +(+field1:"john doe" field2:UnmatchedValue)
496 total documents

I could also reproduce this when searching with
TopScoreDocCollector(N, true|false): only the call with
docsScoredInOrder=false produces incorrect results.

While debugging I noticed that a BooleanQuery containing at least one
MUST clause is scored by BooleanScorer2, which produces the correct
number of results. A BooleanQuery without any MUST clause is scored by
BooleanScorer instead, and BooleanScorer.score(Collector, int, int)
collects only up to a certain number of docs and then exits
prematurely.
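
For reference, here is a minimal sketch of the semantics I would expect
from the six queries above. It is plain Java, not Lucene code: the class
BooleanSemanticsSketch and its helpers (phrase, must, should, bool,
count) are hypothetical names made up for illustration, and the four
sample documents stand in for field1 values. It assumes the usual rule
that a boolean query with no MUST clause needs at least one matching
SHOULD clause, while SHOULD clauses next to a MUST clause are optional.
Under that rule all six queries should return the same count.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class BooleanSemanticsSketch {

    public enum Occur { MUST, SHOULD }

    /** One boolean clause: a document matcher plus how it must occur. */
    public static final class Clause {
        final Occur occur;
        final Predicate<String> matcher;

        Clause(Occur occur, Predicate<String> matcher) {
            this.occur = occur;
            this.matcher = matcher;
        }
    }

    public static Clause must(Predicate<String> m) { return new Clause(Occur.MUST, m); }

    public static Clause should(Predicate<String> m) { return new Clause(Occur.SHOULD, m); }

    /** Toy phrase match: the words must appear adjacent and in order. */
    public static Predicate<String> phrase(String words) {
        return doc -> (" " + doc + " ").contains(" " + words + " ");
    }

    /** Toy boolean query: every MUST has to match; with no MUST, one SHOULD has to match. */
    public static Predicate<String> bool(Clause... clauses) {
        return doc -> {
            boolean hasMust = false;
            boolean shouldMatched = false;
            for (Clause c : clauses) {
                boolean hit = c.matcher.test(doc);
                if (c.occur == Occur.MUST) {
                    hasMust = true;
                    if (!hit) {
                        return false;
                    }
                } else if (hit) {
                    shouldMatched = true;
                }
            }
            return hasMust || shouldMatched;
        };
    }

    public static long count(List<String> docs, Predicate<String> query) {
        return docs.stream().filter(query).count();
    }

    public static void main(String[] args) {
        // Stand-ins for field1 values; two of them contain the phrase.
        List<String> docs = Arrays.asList("john doe", "jane doe", "mr john doe jr", "doe john");
        Predicate<String> a = phrase("john doe");
        Predicate<String> unmatched = doc -> false; // models field2:UnmatchedValue

        List<Predicate<String>> queries = Arrays.asList(
                a,                                            // field1:"john doe"
                bool(must(a)),                                // +(field1:"john doe")
                bool(must(bool(should(a), should(a)))),       // +(A A)
                bool(must(bool(must(a), should(a)))),         // +(+A A)
                bool(must(bool(should(a), should(unmatched)))), // +(A field2:UnmatchedValue)
                bool(must(bool(must(a), should(unmatched)))));  // +(+A field2:UnmatchedValue)

        for (Predicate<String> q : queries) {
            System.out.println(count(docs, q) + " total documents");
        }
    }
}
```

In this model the third and fifth queries, the ones that dropped to 5
hits in my tests, match exactly the same documents as the plain
phrase query.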

Is this behaviour normal? This used to work in Lucene 2.4.x.

Another user has reported similar behaviour
(http://mail-archives.apache.org/mod_mbox/lucene-java-user/200910.mbox/%3C20091008121147.107a8589@pc-4176.kl.dfki.de%3E),
but in my case the index was newly built, not migrated from 2.4 to
2.9.

Thanks,
Ionut
