From: Chris Hostetter
To: java-user@lucene.apache.org
Date: Fri, 9 Jun 2006 00:20:58 -0700 (PDT)
Subject: Re: return single document from duplicated documents in index

take a look at the HitCollector and Filter APIs ..
you can implement any logic you want in either of those classes to restrict what results you get -- and the FieldCache gives you an easy way to check what the value of a particular indexed field is. storing the mapping from each field value to its "best" matching doc is up to you.

: Date: Fri, 9 Jun 2006 14:27:54 +0800
: From: Alan Boo
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: return single document from duplicated documents in index
:
: g'day,
:
: i've two questions.
:
: let's say the following is my index with 2 fields: title and contents
:
:   title   contents
:   beer    beer is good
:   beer    beer is good
:   cat     sleepy
:   dog     what a cute one!
:   beer    beer is good
:
: if i do a search on beer, i'll get 3 result returns and three of them
: are the same, is there any elegant way to workaround so the hitlist
: contains only one document instead of 3? (and i want the index
: contains duplicate records for some reason)
:
: ____
:
: yet, another scenario,
:
:   title   contents            location          date
:   beer    beer is god         /usr/data/beer    111111111
:   beer    beer is good        /usr/data/beer    222222255
:   cat     sleepy              /usr/data/cat     222222224
:   dog     what a cute one!    /usr/data/dog     555555555
:   beer    beer is good        /usr/data/beer2   999999999
:
: i want to do a search on beer that will returns only the 2 result on
: beer on different location and it must be the latest. is there any way
: to do that?
:
: regards,
:
: alan
:
: ---------------------------------------------------------------------
: To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
: For additional commands, e-mail: java-user-help@lucene.apache.org

-Hoss
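
a rough sketch of the HitCollector + FieldCache idea from the reply, applied to the second scenario (keep only the latest doc per location). to stay self-contained it does not depend on Lucene: the class name LatestPerKeyCollector and its bestDocByKey() accessor are made up for illustration, and the two arrays are hand-built stand-ins for what FieldCache.DEFAULT.getStrings(reader, "location") and FieldCache.DEFAULT.getInts(reader, "date") would hand back (one value per doc id). in a real index the class would presumably extend org.apache.lucene.search.HitCollector and be passed to Searcher.search(Query, HitCollector); treat that wiring as an assumption, not tested code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative stand-in for a HitCollector subclass: collect(doc, score)
 * is called once for every document that matches the query.
 */
final class LatestPerKeyCollector {
    // Stand-in for FieldCache.DEFAULT.getStrings(reader, "location"): value per doc id.
    private final String[] keyByDoc;
    // Stand-in for FieldCache.DEFAULT.getInts(reader, "date"): value per doc id.
    private final int[] dateByDoc;
    // The "best" doc id seen so far for each distinct key.
    private final Map<String, Integer> bestDocByKey = new LinkedHashMap<String, Integer>();

    LatestPerKeyCollector(String[] keyByDoc, int[] dateByDoc) {
        this.keyByDoc = keyByDoc;
        this.dateByDoc = dateByDoc;
    }

    /** Same shape as HitCollector.collect(int doc, float score). */
    void collect(int doc, float score) {
        String key = keyByDoc[doc];
        Integer best = bestDocByKey.get(key);
        // Keep this doc if it is the first for its key, or newer than the current best.
        if (best == null || dateByDoc[doc] > dateByDoc[best.intValue()]) {
            bestDocByKey.put(key, Integer.valueOf(doc));
        }
    }

    Map<String, Integer> bestDocByKey() {
        return bestDocByKey;
    }

    public static void main(String[] args) {
        // Doc ids 0..4 mirror the rows of the second table in the question.
        String[] locations = {
            "/usr/data/beer", "/usr/data/beer", "/usr/data/cat",
            "/usr/data/dog", "/usr/data/beer2"
        };
        int[] dates = {111111111, 222222255, 222222224, 555555555, 999999999};

        LatestPerKeyCollector collector = new LatestPerKeyCollector(locations, dates);
        // A search on "beer" would match docs 0, 1 and 4.
        for (int doc : new int[] {0, 1, 4}) {
            collector.collect(doc, 1.0f);
        }
        System.out.println(collector.bestDocByKey());
        // prints {/usr/data/beer=1, /usr/data/beer2=4}
    }
}
```

for the first scenario the same collector works with the title (or contents) values as the key and any tie-break you like in place of the date comparison.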