lucene-java-user mailing list archives

From Michael McCandless
Subject Re: can IndexWriter.addIndexes de-dupe documents?
Date Mon, 22 Feb 2010 22:02:29 GMT
addIndexes doesn't make this possible.

Maybe add the indexes but then make a 2nd pass to dedup?
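
A minimal sketch of what such a second pass could compute, in plain Java rather than against the Lucene API. It assumes the id of every document can be read in document-number order, and that a higher document number means a newer document (added indexes are appended after the existing ones). The class and method names (DedupPass, findStaleDocs) are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupPass {

    /**
     * ids.get(n) is the unique-key value of document number n,
     * or null if that document has no id field.
     * Returns the document numbers that should be deleted so that
     * only the highest-numbered document survives for each id.
     */
    public static List<Integer> findStaleDocs(List<String> ids) {
        Map<String, Integer> newest = new HashMap<>(); // id -> latest doc number seen
        List<Integer> stale = new ArrayList<>();
        for (int docNum = 0; docNum < ids.size(); docNum++) {
            String id = ids.get(docNum);
            if (id == null) {
                continue; // documents without an id are left alone
            }
            // put() returns the previous doc number for this id, if any;
            // that older document is superseded by the current one.
            Integer previous = newest.put(id, docNum);
            if (previous != null) {
                stale.add(previous);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Docs 0..5 after a merge; ids "a" and "b" occur more than once.
        List<String> ids = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(findStaleDocs(ids)); // prints [0, 1, 2]
    }
}
```

With a real index, the returned document numbers would then be deleted through the reader (for example IndexReader.deleteDocument(int) in the Lucene versions current at the time of this thread), and the assumption that the newest copy has the highest document number should be checked against how the indexes were built.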


On Mon, Feb 22, 2010 at 4:26 PM, jchang wrote:
> When I call IndexWriter.addIndexes, is there anything I can do to make it
> filter out duplicates based on a certain field (or group of fields)? If I
> know that the id field of the document is unique, can I make addIndexes know
> that if it finds a new document with the same id, the new one is valid and
> the old one should be overwritten (or deleted and the new one added in its
> place)?
> I don't see anything like a unique constraint in the Field class; I know
> Lucene is not a SQL database, but I just wanted to check to make sure I'm
> not missing anything.
