lucene-solr-user mailing list archives

From "Davis, Daniel (NIH/NLM) [C]" <>
Subject RE: Only indexing changed documents
Date Fri, 07 Aug 2015 15:30:58 GMT
Thanks. The key is that the signature field will not be the id field, and overwriteDupes will be false:

      <bool name="overwriteDupes">false</bool>
      <str name="signatureField">sig</str>
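
For context, a minimal sketch of how these two settings might sit inside a full update chain in solrconfig.xml. The chain name "dedupe", the field list "title,body", and the choice of Lookup3Signature are assumptions for illustration, not details from this thread; the field "sig" would also need to be declared in the schema. As far as I know, with overwriteDupes set to false the processor only computes and stores the signature, so discarding unchanged documents would still need additional logic.

```xml
<!-- Sketch only: chain name and field list are assumed, not taken from the thread -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- Store the computed signature in "sig", not in the unique key field -->
    <str name="signatureField">sig</str>
    <!-- Do not delete other documents that share the same signature -->
    <bool name="overwriteDupes">false</bool>
    <!-- Hypothetical fields to hash; replace with the fields that define "changed" -->
    <str name="fields">title,body</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```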

-----Original Message-----
From: Upayavira [] 
Sent: Friday, August 07, 2015 11:22 AM
Subject: Re: Only indexing changed documents

Use the dedupe update processor (SignatureUpdateProcessorFactory), which can compute a signature based upon the specified fields.
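
A processor chain only runs if an update handler is told to use it. One way to wire that up, sketched here under the assumption that the chain is named "dedupe" (a hypothetical name, not one given in this thread), is to set update.chain as a default on the handler:

```xml
<!-- Sketch only: "dedupe" is an assumed chain name -->
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">dedupe</str>
  </lst>
</requestHandler>
```

The chain can also be selected per request by passing update.chain as a request parameter, which leaves other update traffic unaffected.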


On Fri, Aug 7, 2015, at 03:56 PM, Davis, Daniel (NIH/NLM) [C] wrote:
> I have an application that knows enough to tell me that a document has
> been updated, but not which document has been updated. There aren't
> that many documents in this core/collection - just a couple of thousand. So
> far I've just been pumping them all to the update handler every week,
> but the business folk really want the database and the index to be
> synchronized when the back-end staff make an update. As is typical in
> indexing, updates are more frequent than searches (or at least are
> expected to be once things pick up - we may even reach a whopping 10k
> documents at some point :))
>
> Each document has an id I wish to use as the unique ID, but I also want
> to compute a signature. Is there some way I can use an
> updateRequestProcessorChain to throw away a document if its signature
> and document id match based on real-time get?
>
> My apologies if this is a duplicate of a prior question - solr-user is
> fairly high traffic.
>
> Dan Davis, Systems/Applications Architect (Contractor), Office of 
> Computer and Communications Systems, National Library of Medicine, NIH
