lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From qaz zaq <fortq...@yahoo.com>
Subject Re: Duplicates removal in search results
Date Thu, 14 Dec 2006 23:17:53 GMT
Thanks Erick,
Using termdocs/termenum should work. One of my concerns is the performance: the search results
could reach 100K, so the performance may be impacted.  One of the alternative I am thinking
 is to collapse the data during indexing time, but I haven't decided to go that way.

----- Original Message ----
From: Erick Erickson <erickerickson@gmail.com>
To: java-user@lucene.apache.org
Sent: Thursday, December 14, 2006 5:49:01 PM
Subject: Re: Duplicates removal in search results


you need to search for all documents with the title you care about, decide
which one to keep and remove all the others.

You'll probably need a TermDocs/TermEnum to go through all the items in your
index to create the list of documents to remove.

Erick

On 12/14/06, qaz zaq <fortques@yahoo.com> wrote:
>
> How can i remove the duplicates records in the search results. i.e., I
> have multiple results with the same title in 'title' field, and I want to
> only 1 record per title, how can I achieve that? thanks!!
>
>
> ---------------------------------
> Everyone is raving about the all-new Yahoo! Mail beta.
>


 
____________________________________________________________________________________
Any questions? Get answers on any topic at www.Answers.yahoo.com.  Try it now.
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message