Subject: Re: Does Lucene fail fast on boolean queries?
From: Joel Halbert
To: java-user@lucene.apache.org
Date: Thu, 21 May 2009 15:58:08 +0100

Thx.

We're not relying on the internal implementation, but I was wondering how efficient it is at doing a boolean AND query, i.e. does clause precedence affect the efficiency of the query? Is X && Y faster than Y && X if there are fewer hits for X? From how you describe it, it is equally efficient either way.

In particular, we are trying to work out whether a particular numerical RangeQuery that needs to be AND'd with a TermQuery is fastest as:

BooleanQuery(RangeQuery && TermQuery)

or as a TermQuery which then has its results filtered by processing each in turn.

-----Original Message-----
From: Michael McCandless
Reply-To: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Does Lucene fail fast on boolean queries?
Date: Thu, 21 May 2009 10:43:46 -0400

Well... scoring of AND queries currently is done doc-at-once.

So Lucene will first step to doc 1 for Name, then ask age to skip to doc >= 1, will see that both have doc=1 and collect it. The same thing happens for doc=2. Then, Lucene will ask for the next doc of Name, which returns "false" (end of docs) and the loop breaks.

OR, Lucene may drive the query in the opposite order (age and then Name), in which case it's the same through doc=2, but then Lucene asks age for the next doc, gets 5 back, then asks the Name iter to skip to doc >= 5, which returns false, and the loop breaks.
So in fact "doc=5" can be asked for by Lucene.

Also note that this is an internal implementation detail -- Lucene could easily change to do batch processing of AND'd queries in which case docs 5,10 could easily be iterated on. So I wouldn't "rely" on this in your app.

Mike

On Thu, May 21, 2009 at 10:34 AM, Joel Halbert wrote:
> Thx. so, just to clarify, in the example I gave below...
>
> Lucene will search for documents matching on Name and find doc 1 and doc
> 2.
> Then it will search age and find docs 1, 2 and then break. It will not
> go on to seek 5 and 10...?
>
> -----Original Message-----
> From: Michael McCandless
> Reply-To: java-user@lucene.apache.org
> To: java-user@lucene.apache.org
> Subject: Re: Does Lucene fail fast on boolean queries?
> Date: Thu, 21 May 2009 10:29:57 -0400
>
> Yes.
>
> As soon as Lucene sees that the Name docID iteration has ended, the
> search will break.
>
> Mike
>
> On Thu, May 21, 2009 at 8:44 AM, Joel Halbert wrote:
>> Hi,
>>
>> When Lucene performs a Boolean query, say:
>>
>> Field Name = Male
>> AND
>> Field Age = 30
>>
>> assuming the resultant docs for each portion of the query were:
>>
>> Matching docs for: Name = 1,2
>> Matching docs for: Age = 1,2,5,10
>>
>> Will Lucene stop searching for documents matching the Age term once it
>> has found documents 1 and 2 ?
>> i.e. since 5 and 10 will not be used will it stop searching at document
>> number 2 ?
>>
>> Thx,
>> Joel
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
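
For anyone following the thread, here is a minimal sketch of the doc-at-once AND loop as Mike describes it, run on the example docs from the original question (Name = 1,2 and Age = 1,2,5,10). It is a toy model only: ConjunctionSketch, DocIterator, intersect and the visited list are hypothetical names made up for illustration, not Lucene classes, and the real scorer may pick the driving clause differently or change in later releases.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of doc-at-once AND evaluation; these are not Lucene's classes. */
public class ConjunctionSketch {

    /** Hypothetical stand-in for a docID iterator over one clause's sorted matches. */
    public static final class DocIterator {
        private final int[] docs;  // sorted docIDs matching this clause
        private int pos = -1;      // index of the current doc, -1 before the first call
        public final List<Integer> visited = new ArrayList<>(); // docIDs landed on

        public DocIterator(int... docs) { this.docs = docs; }

        public int doc() { return docs[pos]; }

        /** Step to the next doc; returns false when the list is exhausted. */
        public boolean next() {
            pos++;
            return land();
        }

        /** Move forward to the first doc >= target; false when exhausted. */
        public boolean skipTo(int target) {
            if (pos < 0) pos = 0;
            while (pos < docs.length && docs[pos] < target) pos++;
            return land();
        }

        private boolean land() {
            if (pos >= docs.length) return false;
            visited.add(docs[pos]);
            return true;
        }
    }

    /** Collect docs present in both iterators, with 'lead' driving the loop. */
    public static List<Integer> intersect(DocIterator lead, DocIterator other) {
        List<Integer> hits = new ArrayList<>();
        if (!lead.next() || !other.skipTo(lead.doc())) return hits;
        while (true) {
            if (lead.doc() == other.doc()) {
                hits.add(lead.doc());                   // both agree: collect
                if (!lead.next()) break;                // lead ended: stop at once
                if (!other.skipTo(lead.doc())) break;
            } else if (lead.doc() < other.doc()) {
                if (!lead.skipTo(other.doc())) break;   // lead catches up
            } else {
                if (!other.skipTo(lead.doc())) break;   // other catches up
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Name drives: Age is never asked to move past doc 2.
        DocIterator name = new DocIterator(1, 2);
        DocIterator age = new DocIterator(1, 2, 5, 10);
        System.out.println(intersect(name, age) + " age visited " + age.visited);
        // prints: [1, 2] age visited [1, 2]

        // Age drives: Age steps to doc 5, then Name cannot skip to >= 5.
        name = new DocIterator(1, 2);
        age = new DocIterator(1, 2, 5, 10);
        System.out.println(intersect(age, name) + " age visited " + age.visited);
        // prints: [1, 2] age visited [1, 2, 5]
    }
}
```

In both orders the hits are the same and doc 10 is never touched; the only difference is whether doc 5 gets asked for, which matches the two cases described above.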