lucene-solr-user mailing list archives

From Andrew Clegg <andrew.cl...@gmail.com>
Subject Skipping duplicates in DataImportHandler based on uniqueKey
Date Sun, 02 May 2010 15:47:50 GMT

Hi,

Is there a way to get the DataImportHandler to skip already-seen records
rather than reindexing them?

The UpdateHandler has an <add overwrite="false" ... > capability which (as I
understand it) means that a document whose uniqueKey matches one already in
the index will be skipped instead of overwritten.

Can the DIH be made to behave this way?

If not, would it be an easy patch? I'm using the XPathEntityProcessor, by
the way.
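
To make the behaviour I'm after concrete, here is a minimal sketch in plain
Python. It is not Solr or DIH code: the function name, the dict standing in
for the index, and the "id" key field are all hypothetical, chosen only to
illustrate "skip if the uniqueKey has already been seen".

```python
from typing import Dict, Iterable, List


def import_skipping_duplicates(
    index: Dict[str, dict],
    records: Iterable[dict],
    unique_key: str = "id",  # hypothetical uniqueKey field name
) -> List[str]:
    """Add records to `index`, skipping any whose unique key is already present.

    Returns the list of keys that were skipped.
    """
    skipped = []
    for record in records:
        key = record[unique_key]
        if key in index:
            # Already indexed: leave the existing document untouched.
            skipped.append(key)
            continue
        index[key] = record
    return skipped
```

With an index that already holds a document keyed "a", importing records
keyed "a" and "b" would add only "b" and leave the stored "a" as it was.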

Thanks,

Andrew.
--
:: http://biotext.org.uk/ ::
