lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Fertig" <>
Subject RE: Search returning documents matching a NOT range
Date Wed, 17 Nov 2010 21:50:12 GMT
I noticed there is still no JIRA ticket for this, do we have any type on consensus on how this
issue will/will not be resolved?  

If MultiSearcher and and MultiReader do not give the same results, I would think one would
be considered "broken" and/or possibly "unfixable".  Is MultiSearcher going to be deprecated
and removed?  Should one always use MultiReader in order to get accurate results?

-----Original Message-----
From: Robert Muir [] 
Sent: Wednesday, November 10, 2010 1:19 PM
Subject: Re: Search returning documents matching a NOT range

On Wed, Nov 10, 2010 at 1:04 PM, Uwe Schindler <> wrote:
> I know where the bug is...
> The problem has nothing to to with MultiSearcher at all, its just the rewritten query.
Because (as Robert said) MultiSearcher rewrites per index, the rewritten query is different
for each sub-index. The problems, Robert mentioned only affect scoring (which is different).
As Range is ConstantScore it should still return same documents.

I disagree, it really is because of how MultiSearcher combine()s the
queries from the subsearchers.

and the problems again don't affect scoring:

* you might hit boolean max clause count with AUTO if you use
MultiSearcher (e.g. 5 subs return 250 terms each and the total unique
across all in combine() is > 1024). In this case the query won't work
at all, but it works with MultiReader. And this is really bad, because
i think we document you will not hit this boolean max clause count
with AUTO, and we default to it for this reason.

* fuzzyquery etc will return completely different documents, or if you
manually use TopTermsRewrite for some other query.

none of these problems happen if you use MultiReader. This is why i
suggested removing MultiSearcher, at the least, if you use this thing
i think you should explicitly set FILTER_REWRITE.

> The problem here seems to be that in the first searcher der Boolean rewrites to an empty
BooleanQuery(), which is fine for ranges that hit no terms at all (this is also supported
and works). The problem seems to be that a SHOULD_NOT clause on an empty Boolean query produces
this bug!

yes, i'm not sure this is the "problem", i think the problem is in
combine(), but this is the root cause.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message