lucene-solr-user mailing list archives

From lboutros <boutr...@gmail.com>
Subject Re: Get the new terms of fields since last update
Date Fri, 05 Dec 2014 15:21:50 GMT
The Apache Solr community is so great!

An interesting problem with 3 interesting answers in less than 2 hours!

Thank you all, really.

Erik,

I'm already saving the billion terms each week. It's hard to diff 1
billion terms.
I'm already rebuilding the whole dictionaries each week in a custom
distributed terms query handler.
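
To make the diff problem above concrete: if two weekly snapshots are both stored in the same sorted order with no duplicates, they can be compared in a single streaming pass without loading either one into memory. The following is only a minimal sketch under those assumptions; the function name `diff_sorted_terms` and the "added"/"removed" labels are hypothetical and are not part of Solr or of the custom handler described here.

```python
from typing import Iterable, Iterator, Tuple


def diff_sorted_terms(old: Iterable[str], new: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Stream the difference between two sorted, duplicate-free term sequences.

    Assumption: both inputs are sorted with the same ordering that Python
    uses for str comparison (code point order).
    Yields ("removed", term) for terms only in `old`
    and ("added", term) for terms only in `new`.
    """
    old_it, new_it = iter(old), iter(new)
    o = next(old_it, None)
    n = next(new_it, None)

    # Walk both sequences in lockstep, always advancing the smaller side.
    while o is not None and n is not None:
        if o == n:
            # Term present in both snapshots: nothing to report.
            o = next(old_it, None)
            n = next(new_it, None)
        elif o < n:
            yield ("removed", o)
            o = next(old_it, None)
        else:
            yield ("added", n)
            n = next(new_it, None)

    # Whatever is left on one side exists only in that snapshot.
    while o is not None:
        yield ("removed", o)
        o = next(old_it, None)
    while n is not None:
        yield ("added", n)
        n = next(new_it, None)


if __name__ == "__main__":
    last_week = ["apple", "banana", "cherry"]
    this_week = ["apple", "blueberry", "cherry", "date"]
    for change, term in diff_sorted_terms(last_week, this_week):
        print(change, term)
```

The pass is linear in the total number of terms, but it still reads both full snapshots, so it does not by itself avoid rebuilding the new dictionary each week.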

I'm saving the result in MongoDB in order to scroll through it quickly by
term position in the dictionary.

It takes 3-4 hours each week. Now I would like to update the existing result
instead of rebuilding it, so that it runs faster.

Alex, I will check; this seems to be a good idea.
Is it possible to filter terms with payloads in index readers? I did not
see anything like that in my first investigation.
I suppose it would take some additional disk space.

Michael,

this is the easiest way to do it. You are right. But I'm not sure that
indexing twice and updating the dictionaries would be faster than the current
process. Still, it is worth doing some math ;)

Ludovic.





-----
Jouve
France.
