From: "Amit"
To: "Erick Erickson"
Subject: RE: Function writing using lucene
Date: Fri, 7 Jul 2006 09:59:45 +0530
Mailing-List: java-user@lucene.apache.org

Thanks, Erick, for the reply. It will help us.

Regards,
Amit

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Wednesday, July 05, 2006 6:32 PM
To: java-user@lucene.apache.org; amitk@techepoch.com
Subject: Re: Function writing using lucene

Amit:

You can make arbitrarily complex boolean clauses; see BooleanQuery. For that, you don't need a filter. You can add boolean clauses with MUST, SHOULD and MUST_NOT (roughly AND, OR, NOT).

Filters are for restricting queries that create (under the covers) a large BooleanQuery. You shouldn't think about filters unless there is a demonstrated need. You'll know you have one when you get a TooManyClauses exception. There is a discussion of this in the book Lucene In Action, which I highly recommend.

For instance, something like the following should get the documents that have both "lucene" and "apache" in the content field:

    // Query classes are in org.apache.lucene.search; Term is in org.apache.lucene.index.
    BooleanQuery bq = new BooleanQuery();
    bq.add(new TermQuery(new Term("content", "lucene")), BooleanClause.Occur.MUST);
    bq.add(new TermQuery(new Term("content", "apache")), BooleanClause.Occur.MUST);
    Hits hits = searcher.search(bq);   // searcher: a Searcher (e.g. IndexSearcher) opened on your index

For that matter, the QueryParser will do all this for you if you give it the right string. It's up to you whether you let QueryParser construct the clauses for you or construct the BooleanQuery yourself.

Note that you can sort however you want using the Searcher.search(Query, Sort) form of the search call. Sorting takes time, though.

I don't understand why you care about "filtering out document IDs". You submit a query and get a Hits object back. Then you fetch the Document that contains your data from the Hits object. Document IDs are an integral part of the document (and assigned by Lucene at index time). I have a hard time imagining why you'd even want to try to filter them out; I'm not sure the question makes sense in the Lucene context.
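As a rough illustration of how MUST, SHOULD and MUST_NOT clauses combine, here is a toy model in plain Java. It does not use Lucene at all: the names BooleanClauseSketch, Occur and matches are made up for this sketch, a document is reduced to the set of terms in one field, and scoring is ignored. It only mimics the matching rule, assuming no minimum on the number of SHOULD clauses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: mimics how MUST / SHOULD / MUST_NOT clauses decide whether
// a document matches. Not Lucene code; all names here are hypothetical.
final class BooleanClauseSketch {

    enum Occur { MUST, SHOULD, MUST_NOT }

    // docTerms: the terms of one field of one document.
    // clauses: term -> how that term must occur.
    static boolean matches(Set<String> docTerms, Map<String, Occur> clauses) {
        boolean hasMust = false;
        boolean hasShould = false;
        boolean anyShouldHit = false;
        for (Map.Entry<String, Occur> clause : clauses.entrySet()) {
            boolean present = docTerms.contains(clause.getKey());
            switch (clause.getValue()) {
                case MUST:
                    hasMust = true;
                    if (!present) return false;   // a required term is missing
                    break;
                case MUST_NOT:
                    if (present) return false;    // a prohibited term is present
                    break;
                case SHOULD:
                    hasShould = true;
                    anyShouldHit |= present;
                    break;
            }
        }
        // With at least one MUST clause, SHOULD clauses are optional.
        if (hasMust) return true;
        // Otherwise at least one SHOULD clause has to match; a query made
        // only of MUST_NOT clauses matches nothing.
        return hasShould && anyShouldHit;
    }

    public static void main(String[] args) {
        Map<String, Occur> clauses = new LinkedHashMap<>();
        clauses.put("lucene", Occur.MUST);
        clauses.put("apache", Occur.MUST);

        Set<String> doc1 = new HashSet<>(Arrays.asList("apache", "lucene", "search"));
        Set<String> doc2 = new HashSet<>(Arrays.asList("lucene", "index"));

        System.out.println(matches(doc1, clauses)); // true
        System.out.println(matches(doc2, clauses)); // false
    }
}
```

The two MUST clauses play the role of the "lucene" AND "apache" example: only a document containing both terms matches.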
The default return order is by relevance; doc ID has nothing to do with it.

You can certainly write a custom filter, but be really sure you need it first. In my experience, wildcard queries and prefix queries are prime candidates for filters.

If you haven't gotten a copy of Lucene In Action, I strongly recommend that you do. It explains a LOT about how Lucene works and how it should be used. Also, get a copy of Luke to examine your indexes and play around with the query syntax. It'll save you a LOT of time and effort.

Hope this helps
Erick
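
To illustrate why wildcard and prefix queries are the usual source of TooManyClauses, here is a toy model in plain Java. It assumes the behaviour described above, where such a query is expanded under the covers into one clause per matching indexed term, and that the number of clauses is capped (1024 is, to my knowledge, the default cap in BooleanQuery). Nothing here is Lucene code: PrefixExpansionSketch, expand and the exception used are made up for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: shows why a prefix query over a large term dictionary
// can blow past a clause limit. Not Lucene code; names are hypothetical.
final class PrefixExpansionSketch {

    // Expands a prefix into every indexed term that starts with it,
    // one "clause" per term, and fails once the limit would be exceeded.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet jumps to the first term >= prefix; matching terms are contiguous.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // we have walked past the prefix range
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("too many clauses: limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("apache", "lucene", "luke", "lunar", "search"));

        System.out.println(expand(terms, "lu", 1024)); // [lucene, luke, lunar]

        try {
            expand(terms, "lu", 2);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // too many clauses: limit is 2
        }
    }
}
```

On a real index a short prefix can match thousands of terms, which is when restricting the search with a filter instead of a giant boolean query starts to pay off.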