lucene-java-user mailing list archives

From Antony Bowesman <...@teamware.com>
Subject Merging indexes - which is best option?
Date Fri, 05 Sep 2008 05:35:04 GMT
I am writing several temporary batches of documents to separate indices and 
periodically merging those batches into a set of master indices.  I'm using 
IndexWriter#addIndexesNoOptimize(), but the problem is that the master may 
already contain an entry for a given document, so I end up with a duplicate.

Duplicates are prevented in the temporary index because, before adding each 
Document, I call IndexWriter#deleteDocuments(Term) with my UID.

I have two choices:

a) Merge the indexes, then clean up any duplicates in the master (or vice 
versa).  IndexWriter#deleteDocuments(Term[]) would probably suit here, given 
the UIDs of all the incoming documents.

b) Iterate through the Documents in the temporary index and add them to the 
master one at a time.

Option b sounds worse, as it seems an IndexWriter's Analyzer cannot be null, 
and I guess there's a penalty in reassembling each Document from the reader.
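
To make option a concrete, here is a minimal sketch of the "vice versa" 
ordering: delete the incoming UIDs from the master first, then append the 
batch.  It is plain Java over in-memory lists, not Lucene code; the MergeSketch 
class, the Doc type and the mergeBatch method are hypothetical names used only 
for illustration.  In Lucene terms the delete step would stand in for 
IndexWriter#deleteDocuments(Term[]) and the append for the addIndexes call.  
The order matters because a delete by UID term matches every document carrying 
that UID, so running it after the merge would also remove the freshly merged 
copies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MergeSketch {

    /** Hypothetical stand-in for an indexed document: a unique UID plus some content. */
    static final class Doc {
        final String uid;
        final String body;

        Doc(String uid, String body) {
            this.uid = uid;
            this.body = body;
        }
    }

    /**
     * Models option a with the delete done first: drop every master entry
     * whose UID also appears in the batch, then append the whole batch.
     * Assumes the batch itself is already free of duplicate UIDs.
     */
    static List<Doc> mergeBatch(List<Doc> master, List<Doc> batch) {
        // Collect the UIDs of all incoming documents.
        Set<String> incoming = new HashSet<>();
        for (Doc d : batch) {
            incoming.add(d.uid);
        }

        // "Delete" step: keep only master entries that the batch does not replace.
        List<Doc> merged = new ArrayList<>();
        for (Doc d : master) {
            if (!incoming.contains(d.uid)) {
                merged.add(d);
            }
        }

        // "Merge" step: append the batch unchanged.
        merged.addAll(batch);
        return merged;
    }

    public static void main(String[] args) {
        List<Doc> master = Arrays.asList(new Doc("u1", "old one"), new Doc("u2", "old two"));
        List<Doc> batch = Arrays.asList(new Doc("u1", "new one"), new Doc("u3", "new three"));

        // u1 appears once, with the batch's content; u2 and u3 are kept as they are.
        for (Doc d : mergeBatch(master, batch)) {
            System.out.println(d.uid + "=" + d.body);
        }
    }
}
```

The sketch never rebuilds a document field by field, which is the cost option b 
would add; it only needs the set of incoming UIDs.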

Any views?
Antony