From: "Uwe Schindler"
Subject: RE: Search returning documents matching a NOT range
Date: Wed, 10 Nov 2010 19:04:50 +0100

I know where the bug is...

The problem has nothing to do with MultiSearcher at all, it's just the
rewritten query. Because (as Robert said) MultiSearcher rewrites per index,
the rewritten query is different for each sub-index. The problems Robert
mentioned only affect scoring (which is different). As the range query is
constant-score, it should still return the same documents.

The problem here seems to be that in the first searcher the range query is
rewritten to an empty BooleanQuery(), which is fine for ranges that hit no
terms at all (this is also supported and works). The problem seems to be
that a MUST_NOT clause on an empty BooleanQuery produces this bug!

I will try to reproduce it with a simple test case using a single
IndexReader and a single IndexSearcher.

The reason you sometimes don't see the problem is that auto rewrite does
not always rewrite to a BooleanQuery.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Robert Muir [mailto:rcmuir@gmail.com]
> Sent: Wednesday, November 10, 2010 1:25 PM
> To: java-user@lucene.apache.org
> Cc: David Fertig
> Subject: Re: Search returning documents matching a NOT range
>
> On Wed, Nov 10, 2010 at 7:00 AM, Robert Muir wrote:
> > On Mon, Nov 8, 2010 at 6:45 AM, Ian Lea wrote:
> >> This does seem extremely odd. David sent me a copy of his index and
> >> I've played around with it and also written a self-contained RAM
> >> index program, below, that shows the same problem, namely that if the
> >> second index has 1000+ docs the one and only doc in the first index
> >> is incorrectly matched if the search is done with a MultiSearcher.
> >> In answer to Uwe's question, it works correctly when using a single
> >> IndexSearcher on top of a MultiReader.
> >
>
> I played with your testcase, and it seems the rewrite() implementation is
> causing the strangeness you see.
> For your query: author:aaa -pubdate:[aaa TO bbb], here are the rewritten
> forms:
>
> MultiReader case: +author:aaa -ConstantScore(pubdate:[aaa TO bbb])
> MultiSearcher case: (+author:aaa -()) (+author:aaa -ConstantScore(pubdate:[aaa TO bbb]))
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
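
To make the two rewritten forms quoted in this thread easier to compare, here is a toy sketch in plain Java. It is not Lucene code and does not claim to show where the bug actually lives. It only treats each clause as a boolean predicate over a document and assumes the two parenthesized sub-queries in the MultiSearcher form are optional clauses of one outer query, which is how that form reads. The names RewriteModel, Doc, AUTHOR_AAA, PUBDATE_RANGE and EMPTY are made up for illustration.

```java
import java.util.function.Predicate;

public class RewriteModel {
    // Hypothetical stand-in for an indexed document: only the two fields
    // used by the query in this thread.
    public static final class Doc {
        public final String author;
        public final String pubdate;
        public Doc(String author, String pubdate) {
            this.author = author;
            this.pubdate = pubdate;
        }
    }

    // author:aaa
    public static final Predicate<Doc> AUTHOR_AAA = d -> d.author.equals("aaa");

    // pubdate:[aaa TO bbb], inclusive on both ends
    public static final Predicate<Doc> PUBDATE_RANGE =
        d -> d.pubdate.compareTo("aaa") >= 0 && d.pubdate.compareTo("bbb") <= 0;

    // Assumption: an empty BooleanQuery "()" is modeled as matching nothing.
    public static final Predicate<Doc> EMPTY = d -> false;

    // MultiReader form: +author:aaa -ConstantScore(pubdate:[aaa TO bbb])
    public static final Predicate<Doc> MULTI_READER_FORM =
        AUTHOR_AAA.and(PUBDATE_RANGE.negate());

    // MultiSearcher form: (+author:aaa -()) (+author:aaa -ConstantScore(...))
    // Assumption: the two sub-queries are combined as a plain disjunction.
    public static final Predicate<Doc> MULTI_SEARCHER_FORM =
        AUTHOR_AAA.and(EMPTY.negate())
            .or(AUTHOR_AAA.and(PUBDATE_RANGE.negate()));

    public static void main(String[] args) {
        Doc inRange = new Doc("aaa", "abc"); // author matches, pubdate inside the range
        System.out.println("MultiReader form matches:   " + MULTI_READER_FORM.test(inRange));
        System.out.println("MultiSearcher form matches: " + MULTI_SEARCHER_FORM.test(inRange));
    }
}
```

Under these assumptions, a document with author aaa and a pubdate inside the range is rejected by the MultiReader form but accepted by the MultiSearcher form, because the "-()" clause excludes nothing. Whether real Lucene evaluates the combined query this way is exactly what a test case with a single IndexReader and IndexSearcher would need to confirm.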