lucene-solr-user mailing list archives

From shamik <sham...@gmail.com>
Subject Re: Indexing question on individual field update
Date Tue, 11 Feb 2014 21:37:36 GMT
Eric,

  Thanks for your reply. I should have given better context. I currently
run a daily incremental crawl on this particular source and index the
documents. The incremental crawl looks for any change since the last crawl
date, based on the document publish date. But there's no way for me to know
if a document has been deleted. To handle that, I run a full crawl on the
weekend, which re-indexes the entire content. After the full index is done, I
call a purge script, which deletes any content that is more than 24 hours
old, based on the indextimestamp field.
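
For illustration, the purge step could be expressed as a Solr delete-by-query
using date math on indextimestamp. This is only a sketch: the function name
and defaults are hypothetical, and the body would still need to be posted to
the collection's update handler and followed by a commit.

```python
import json


def build_purge_request(field="indextimestamp", max_age="1DAY"):
    """Build the JSON body for a delete-by-query that removes stale documents.

    The range query matches every document whose timestamp is older than
    NOW minus max_age (Solr date math, e.g. 1DAY or 24HOURS).
    The field name and the default age are assumptions based on the message.
    """
    query = f"{field}:[* TO NOW-{max_age}]"
    return json.dumps({"delete": {"query": query}})


if __name__ == "__main__":
    # Prints the body that a purge script would send to the update handler.
    print(build_purge_request())
```

Any document re-indexed by the full crawl gets a fresh indextimestamp and so
falls outside the range; anything the crawl no longer sees is deleted.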

The issue with atomic updates is that they don't alter the indextimestamp
field. So even if I run a full crawl with atomic updates, the timestamp will
keep its old value. Unfortunately, I can't rely on another date field
coming from the source, as those are not consistent. That means I can't
remove stale content.
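
One possible workaround, not confirmed anywhere in this thread, would be to
set indextimestamp explicitly as part of each atomic update instead of relying
on it being refreshed automatically. The sketch below only builds the JSON
body; the helper name, the "id" unique key, and the example field are
assumptions, and atomic updates in general require the fields to be stored.

```python
import json
from datetime import datetime, timezone


def build_atomic_update(doc_id, changes, ts_field="indextimestamp", now=None):
    """Build a JSON atomic-update body that also refreshes the timestamp.

    doc_id   -- value of the unique key (assumed here to be the "id" field)
    changes  -- dict of field name -> new value, each applied with "set"
    ts_field -- timestamp field to refresh (name taken from the message)
    now      -- optional datetime, mainly to make the output reproducible
    """
    now = now or datetime.now(timezone.utc)
    doc = {"id": doc_id}
    for name, value in changes.items():
        doc[name] = {"set": value}
    # Solr date fields expect ISO-8601 in UTC with a trailing "Z".
    doc[ts_field] = {"set": now.strftime("%Y-%m-%dT%H:%M:%SZ")}
    return json.dumps([doc])


if __name__ == "__main__":
    # "title" is a hypothetical field used only for the example.
    print(build_atomic_update("doc-1", {"title": "Updated title"}))
```

With the timestamp refreshed on every update, a purge keyed on
indextimestamp would again only remove documents the full crawl did not touch.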

Let me know if I'm missing something here.

- Thanks,
Shamik





--
View this message in context: http://lucene.472066.n3.nabble.com/Indexing-question-on-individual-field-update-tp4116605p4116757.html
Sent from the Solr - User mailing list archive at Nabble.com.
