Date: Sun, 8 Feb 2004 10:35:32 -0800
From: Dror Matalon
To: Lucene Users List
Subject: Re: Search Refinement Approaches

Hi Ramy,

Maybe I'm misunderstanding the question, but wouldn't creating a query
that ANDs the original query and the new one do what
you want? So if the original query was

    foo bar

and the refinement is

    blah

create a new query that does:

    (foo bar) AND (blah)

Seems a lot easier, but maybe I'm missing something.

Regards,

Dror

On Sat, Feb 07, 2004 at 11:32:35PM +0100, Ramy Hardan wrote:
> Hi,
>
> Reviewing javadocs and previous posts, search refinement or 'search
> within search' is best done with a Filter. To fill the Filter's BitSet
> with the results of a search, a HitCollector is the obvious solution.
> Unfortunately when using HitCollector I have to implement all the
> functionality the Hits class usually provides myself.
>
> Is there an efficient way to do search refinement, preferably without
> losing the Hits class? I can think of the following approaches:
>
> - Don't use Hits: collect all scores and document numbers with a
>   HitCollector and sort them by score after the search. Retrieve the
>   needed documents from IndexReader via document number.
> - Use Hits: Briefly examining the source reveals this possibility:
>   subclass BitSet and override the boolean get(int bitIndex) method to
>   additionally set the bit at bitIndex in another BitSet. Use this
>   subclass in a Filter and initialize it with all ones (in the first
>   search). This way I can tell which documents are tested by the
>   IndexSearcher against the Filter by examining the second BitSet and
>   use it as a Filter for the refining search. Here's a sketch of this
>   for clarification:
>
> import java.util.BitSet;
>
> public class FilterBitSet extends BitSet {
>   // Records every bit index that is read via get() and found set.
>   private final BitSet bitsForRefiningFilter = new BitSet();
>
>   public boolean get( int bitIndex ) {
>     boolean result = super.get( bitIndex );
>     if (result) bitsForRefiningFilter.set( bitIndex );
>     return result;
>   }
> }
>
> Is this really possible? (might be more of a question for dev)
>
> Last question about document numbers:
> When and how exactly do they change? The javadoc states they change
> upon addition and deletion.
> May I assume that a particular document
> number is stable as long as it is not changed (deleted and added)
> although other documents are added/deleted and optimize() is NOT
> called? If yes, is this about to change in the foreseeable future?
>
> Thanks in advance
>
> Ramy
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org

-- 
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
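
For illustration, the ANDing suggestion in this reply could be sketched as a
small string helper. This is only a sketch under assumptions: QueryRefiner and
refine() are hypothetical names, not part of Lucene, and the helper assumes
both inputs are already valid query-parser strings (it does no escaping). It
relies only on the query-parser syntax of parentheses for grouping and the AND
operator.

```java
// Hypothetical helper, not part of Lucene: builds the query string for a
// "search within search" by ANDing the original query with a refinement.
final class QueryRefiner {
    private QueryRefiner() {}

    // Wraps each part in parentheses so its own operators stay grouped.
    // Assumption: both strings are already valid query-parser input.
    static String refine(String original, String refinement) {
        if (original == null || original.trim().isEmpty()) {
            throw new IllegalArgumentException("original query must not be empty");
        }
        if (refinement == null || refinement.trim().isEmpty()) {
            return original.trim(); // nothing to refine with
        }
        return "(" + original.trim() + ") AND (" + refinement.trim() + ")";
    }

    public static void main(String[] args) {
        String first = refine("foo bar", "blah");
        // A further refinement simply nests the previous result.
        String second = refine(first, "baz");
        System.out.println(first);
        System.out.println(second);
    }
}
```

The resulting string would then be parsed and searched like any other query,
so the Hits class stays available. Whether this is cheaper than a Filter
probably depends on the index and on how often the original results are
reused.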