lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: can IndexWriter.addIndexes de-dupe documents?
Date Mon, 22 Feb 2010 22:02:29 GMT
addIndexes can't do this; it has no option for de-duplicating documents.

Maybe add the indexes and then make a second pass to delete the duplicates?
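
A minimal sketch of what such a second pass could compute, in plain Java with no Lucene calls. It assumes every document carries a unique-key value (the id field from the question) and that the merged-in documents receive higher doc numbers than the ones already in the index, so the highest doc number for a given id is the newest copy. The class and method names (DedupPass, findStaleDocs) are hypothetical; they are not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Lucene class: models the second pass only.
final class DedupPass {

    // idsInDocOrder.get(n) is the id value of doc number n after the merge.
    // Returns the doc numbers of every copy except the last one per id,
    // in ascending order. Assumption: a higher doc number means a newer copy.
    static List<Integer> findStaleDocs(List<String> idsInDocOrder) {
        Map<String, Integer> newest = new HashMap<>();
        List<Integer> stale = new ArrayList<>();
        for (int doc = 0; doc < idsInDocOrder.size(); doc++) {
            // put() returns the doc number previously recorded for this id,
            // which is now known to be an older duplicate.
            Integer previous = newest.put(idsInDocOrder.get(doc), doc);
            if (previous != null) {
                stale.add(previous);
            }
        }
        Collections.sort(stale);
        return stale;
    }

    public static void main(String[] args) {
        // Docs 0 and 1 are older copies of ids "a" and "b".
        List<String> ids = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(findStaleDocs(ids)); // prints [0, 1]
    }
}
```

The doc numbers returned would then be deleted from the merged index. If the doc-number ordering assumption does not hold for your indexes, another field (a timestamp, say) would have to decide which copy wins.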

Mike

On Mon, Feb 22, 2010 at 4:26 PM, jchang <jchangkihatest@gmail.com> wrote:
>
> When I call IndexWriter.addIndexes, is there anything I can do to make it
> filter out duplicates based on a certain field (or group of fields)? If I
> know that the id field of the document is unique, can I make addIndexes know
> that if it finds a new document with the same id, the new one is valid and
> the old one should be overwritten (or deleted and the new one added in its
> place)?
>
> I don't see anything like a unique constraint in the Field class; I know
> Lucene is not a SQL database, but I just wanted to check to make sure I'm
> not missing anything.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
