lucene-java-user mailing list archives

From "Michael D. Curtin" <m...@curtin.com>
Subject Re: Help with mass delete from large index
Date Wed, 15 Feb 2006 15:22:46 GMT
Chandramohan wrote:

>>perform such a cull again, you might make several distinct indexes (one per
>>day, per week, per whatever) during that reindexing so the next time will be
>>much easier.
> 
> How would you search and consolidate the results across multiple indexes?
> Hits from each index will have independent scoring.

Frankly, I ignore the scores in my application.  The data itself isn't English 
prose, so the TF/IDF calculations are a stretch, at best, as a measure of 
relevance.  I presort the documents into "relevance" order (a popularity 
metric) before indexing, then ask for the results in index order.
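
A minimal sketch of that presorting idea, in plain Java with no Lucene calls. 
The Doc class, its popularity field, and the list-scanning "search" are all 
hypothetical stand-ins for illustration, not the code from the application 
described above.  The point is only that if documents are added most-popular 
first, returning matches in insertion (index) order already yields them in 
"relevance" order, with no scoring involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PresortedIndexSketch {
    /** Hypothetical document: an id, some text, and a popularity metric. */
    static final class Doc {
        final String id;
        final String text;
        final int popularity;

        Doc(String id, String text, int popularity) {
            this.id = id;
            this.text = text;
            this.popularity = popularity;
        }
    }

    /** Returns a copy sorted most-popular first: the order to add docs to the index. */
    static List<Doc> presort(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingInt((Doc d) -> d.popularity).reversed());
        return sorted;
    }

    /** Stand-in for a search that returns hits in index order (assumption: a plain scan). */
    static List<String> searchInIndexOrder(List<Doc> index, String term) {
        List<String> hits = new ArrayList<>();
        for (Doc d : index) {
            if (d.text.contains(term)) {
                hits.add(d.id);
            }
        }
        return hits;
    }
}
```

Because the sort is done once, up front, the cost at query time is just the 
scan for matches; the ranking falls out of the document order.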

If that wouldn't work for your application, it seems to me that large-enough 
sub-indexes *would* produce comparable scores.  That is, if the sub-indexes 
were big enough, one could directly compare scores across them, so a simple 
merge would work.  If the total document corpus is small, then the need for 
sub-indexes isn't there anyhow.
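
The "simple merge" could look something like the sketch below, again in plain 
Java rather than against the Lucene API.  It assumes each sub-index hands back 
its hits already sorted by descending score, and that those scores are 
directly comparable (the large-sub-index assumption above).  The Hit class and 
the merge method are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class ScoreMergeSketch {
    /** Hypothetical hit: a document id plus the score its sub-index assigned. */
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Read position within one sub-index's (non-empty) result list. */
    private static final class Cursor {
        final List<Hit> hits;
        int pos;

        Cursor(List<Hit> hits) {
            this.hits = hits;
        }

        Hit current() {
            return hits.get(pos);
        }
    }

    /** K-way merge of per-index hit lists, each sorted by descending score. */
    static List<Hit> merge(List<List<Hit>> perIndexHits, int topN) {
        // Order cursors so the one whose current hit scores highest is polled first.
        Comparator<Cursor> byScoreDesc =
            (a, b) -> Float.compare(b.current().score, a.current().score);
        PriorityQueue<Cursor> queue = new PriorityQueue<Cursor>(byScoreDesc);
        for (List<Hit> hits : perIndexHits) {
            if (!hits.isEmpty()) {
                queue.add(new Cursor(hits));
            }
        }

        List<Hit> merged = new ArrayList<>();
        while (!queue.isEmpty() && merged.size() < topN) {
            Cursor best = queue.poll();
            merged.add(best.current());
            best.pos++;
            if (best.pos < best.hits.size()) {
                queue.add(best); // re-insert with its next hit as the new key
            }
        }
        return merged;
    }
}
```

If the scores turn out not to be comparable for a given corpus, the same merge 
works with any other key the sub-indexes agree on, such as the popularity 
metric mentioned earlier.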

--MDC
