From: Matt Ronge
To: java-user@lucene.apache.org
Subject: Re: Pre-filtering for expensive query
Date: Sat, 30 Aug 2008 11:22:50 -0500
Message-Id: <2C396FF4-74E1-4E62-8408-36C4C9DA8DAA@theronge.com>
In-Reply-To: <200808301313.20821.paul.elschot@xs4all.nl>

On Aug 30, 2008, at 6:13 AM, Paul Elschot wrote:

> On Saturday 30 August 2008 03:34:01, Matt Ronge wrote:
>> Hi all,
>>
>> I am working on implementing a new Query, Weight and Scorer that is
>> expensive to run. I'd like to limit the number of documents I run
>> this query on by first building a candidate set of documents with a
>> boolean query. Once I have that candidate set, I was hoping I could
>> build a filter off of it, and issue that along with my expensive
>> query. However, after reading the code I see that filtering is done
>> during the search, and not before hand.
>
> Correct. I suppose you mean the filtering code in IndexSearcher?

Yes, that's exactly what I mean.

>
>> So my initial boolean query
>> won't help in limiting the number of documents scored by my expensive
>> query.
>
> The trick of filtering is the use of skipTo() on both the filter and
> the scorer to skip superfluous work as much as possible.
> So when you make your scorer implement skipTo() efficiently,
> filtering it should reduce the amount of scoring done.
>
> Implementing skipTo() efficiently is normally done by using
> TermScorer.skipTo() on the leafs of a scorer structure. So,
> in case you implement your own TermScorer, take a serious
> look at TermScorer.skipTo().
>
> Normally, score value computations are not the bottleneck,
> but accessing the index is, and this is where skipTo() does
> the real work. At the moment avoiding score value computations
> is a nice extra.

I was not aware of this. Where can I find the code that uses the
filter to determine what values to feed to skipTo()? (I'm trying to
get a better understanding of the Lucene source.)

>
>
>> Or should I just implement something myself in a custom scorer?
>
> In case you have a better way than skipTo(), or something
> to improve on this issue to allow a Filter as clause to BooleanQuery:
> https://issues.apache.org/jira/browse/LUCENE-1345
> let us know.

Thanks, if the skipTo() approach doesn't work, I'll take a look at this.

--
Matt
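
For reference, the skipTo() interplay Paul describes can be pictured as a leapfrog between the filter and the scorer: each side jumps to the other's current document, so the expensive scorer only lands on documents the filter allows. The following is a minimal sketch of that idea, not Lucene's actual code. DocIterator, ArrayIterator, intersect() and the landed counter are all hypothetical names made up for illustration, and the binary search merely stands in for whatever skipping structure a real index would use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of filter/scorer leapfrogging; these are NOT Lucene classes.
final class LeapfrogSketch {

    /** Hypothetical iterator over increasing doc ids. */
    interface DocIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        /** Moves to the first doc id >= target and returns it, or NO_MORE_DOCS. */
        int skipTo(int target);
    }

    /** Iterator backed by a sorted array of doc ids (assumed: sorted, no duplicates). */
    static final class ArrayIterator implements DocIterator {
        private final int[] docs;
        private int pos = 0;
        int landed = 0; // how many times skipTo() stopped on a real document

        ArrayIterator(int... sortedDocs) {
            this.docs = sortedDocs;
        }

        @Override
        public int skipTo(int target) {
            // Binary search over the remaining docs stands in for a skip list.
            int i = Arrays.binarySearch(docs, pos, docs.length, target);
            pos = (i >= 0) ? i : -i - 1;
            if (pos == docs.length) {
                return NO_MORE_DOCS;
            }
            landed++;
            return docs[pos];
        }
    }

    /** Collects the doc ids present in both iterators by leapfrogging with skipTo(). */
    static List<Integer> intersect(DocIterator filter, DocIterator scorer) {
        List<Integer> hits = new ArrayList<>();
        int f = filter.skipTo(0);
        while (f != DocIterator.NO_MORE_DOCS) {
            int s = scorer.skipTo(f);          // scorer jumps to the filter's doc
            if (s == DocIterator.NO_MORE_DOCS) {
                break;
            }
            if (s == f) {
                hits.add(f);                   // both agree: this doc would be scored
                f = filter.skipTo(f + 1);
            } else {
                f = filter.skipTo(s);          // filter catches up to the scorer
            }
        }
        return hits;
    }
}
```

With a filter over docs 3, 8, 20, 30 and a scorer over docs 1, 2, 3, 5, 9, 20, 25, the scorer in this sketch stops on only three of its seven documents. That is the sense in which an efficient skipTo() lets a filter cut down the work done by an expensive scorer.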