lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Johnny R. Ruiz III" <jo...@yahoo.com>
Subject Re: Index Dedupe
Date Tue, 02 Oct 2007 10:16:23 GMT
Hi Daniel, 

Tnx, but forgive my ignorance..  can u give me a sample code to do it :).   I have never used
termDocs() before. 

Tnx,
Johnny

----- Original Message ----
From: Daniel Noll <daniel@nuix.com>
To: java-user@lucene.apache.org
Sent: Tuesday, October 2, 2007 12:00:07 PM
Subject: Re: Index Dedupe

On Tuesday 02 October 2007 12:25:47 Johnny R. Ruiz III wrote:
> Hi,
>
> I can't seem to find a way to delete duplicate in lucene index.  I hve  a
> unique key so it seems to be straight forward.  But I can't find a simple
> way  to do it except for putting  each record in the index into HashMap. 
> Are there any method in lucene package that I could use?

I would use termDocs() to iterate all the terms in that field.  Then skip the 
first doc for each term and delete all subsequent ones.

Daniel

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org








       
____________________________________________________________________________________
Need a vacation? Get great deals
to amazing places on Yahoo! Travel.
http://travel.yahoo.com/
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message