lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Search returning documents matching a NOT range
Date Wed, 10 Nov 2010 18:04:50 GMT
I know where the bug is...

The problem has nothing to do with MultiSearcher itself; it's just the rewritten query. Because
(as Robert said) MultiSearcher rewrites per index, the rewritten query is different for each
sub-index. The problems Robert mentioned only affect scoring (which is different). As the range
query is ConstantScore, it should still return the same documents.

The problem here seems to be that in the first searcher the Boolean rewrite yields an empty BooleanQuery(),
which is fine for ranges that hit no terms at all (this is also supported and works). The
problem seems to be that a MUST_NOT clause on an empty BooleanQuery produces this bug!

I will try to reproduce it with a simple test case using a single IndexReader and a single IndexSearcher.

The reason you sometimes don't see the problem is that auto rewrite does not always rewrite
to a BooleanQuery.
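
To illustrate why the combined form Robert quotes can match a document that the original
query excludes, here is a toy model in plain Java. It is only a sketch and not Lucene code:
a document is assumed to be a set of strings, the range is stood in for by a hypothetical
marker term "pubdate:in-range", and the class and method names are made up for this example.
It relies on one known fact, that a BooleanQuery with no clauses matches no documents, so
excluding it excludes nothing.

```java
import java.util.Set;
import java.util.function.Predicate;

/** Toy model of boolean matching (NOT Lucene code); a "document" is a set of strings. */
final class EmptyProhibitedClauseDemo {

    /** Stands in for an empty BooleanQuery: it matches no document. */
    static final Predicate<Set<String>> EMPTY = doc -> false;

    /** Stands in for a term query (or, here, for the range via a marker term). */
    static Predicate<Set<String>> term(String t) {
        return doc -> doc.contains(t);
    }

    /** Models "+required -prohibited". */
    static Predicate<Set<String>> requireWithout(Predicate<Set<String>> required,
                                                 Predicate<Set<String>> prohibited) {
        return doc -> required.test(doc) && !prohibited.test(doc);
    }

    /** Models the MultiReader form: +author:aaa -range. */
    static boolean multiReaderForm(Set<String> doc) {
        return requireWithout(term("author:aaa"), term("pubdate:in-range")).test(doc);
    }

    /** Models the MultiSearcher form: (+author:aaa -()) (+author:aaa -range). */
    static boolean multiSearcherForm(Set<String> doc) {
        Predicate<Set<String>> first = requireWithout(term("author:aaa"), EMPTY);
        Predicate<Set<String>> second = requireWithout(term("author:aaa"), term("pubdate:in-range"));
        // Two optional clauses: a document matches if either one matches.
        return first.test(doc) || second.test(doc);
    }

    public static void main(String[] args) {
        // A document that the original query is meant to exclude.
        Set<String> doc = Set.of("author:aaa", "pubdate:in-range");
        System.out.println("MultiReader form matches:   " + multiReaderForm(doc));
        System.out.println("MultiSearcher form matches: " + multiSearcherForm(doc));
    }
}
```

In this model the first sub-clause, +author:aaa -(), accepts every document with author:aaa,
so the combined form matches the document even though the MultiReader form rejects it. Whether
this is exactly what happens inside Lucene still needs the real test case.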

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Wednesday, November 10, 2010 1:25 PM
> To: java-user@lucene.apache.org
> Cc: David Fertig
> Subject: Re: Search returning documents matching a NOT range
> 
> On Wed, Nov 10, 2010 at 7:00 AM, Robert Muir <rcmuir@gmail.com> wrote:
> > On Mon, Nov 8, 2010 at 6:45 AM, Ian Lea <ian.lea@gmail.com> wrote:
> >> This does seem extremely odd.  David sent me a copy of his index and
> >> I've played around with it and also written a self-contained RAM
> >> index program, below, that shows the same problem, namely that if the
> >> second index has 1000+ docs the one and only doc in the first index
> >> is incorrectly matched if the search is done with a MultiSearcher.
> >> In answer to Uwe's question, it works correctly if use a single
> >> IndexSearcher on top of a MultiReader.
> >
> 
> I played with your testcase, and it seems the rewrite() implementation is
> causing the strangeness you see.
> for your query: author:aaa -pubdate:[aaa TO bbb], here are the rewritten
> forms:
> 
> MultiReader case: +author:aaa -ConstantScore(pubdate:[aaa TO bbb])
> MultiSearcher case: (+author:aaa -()) (+author:aaa -ConstantScore(pubdate:[aaa TO bbb]))
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org




