lucene-java-user mailing list archives

From atawfik <contact.txl...@gmail.com>
Subject RE: How to properly correlate relevance in a search across multiple collections
Date Tue, 09 Sep 2014 06:42:11 GMT
Hi David,

It seems that MultiSearcher is deprecated in favor of MultiReader. Have a
look here: https://issues.apache.org/jira/browse/LUCENE-2756

Regarding the meta search approach, you can normalize the raw scores of the
documents. There are many ways to do that; just search for "score
normalization in meta search". The key here is the nature of your
collections. If they contain the same type of documents, then you can fuse
the result lists with different aggregation methods. If raw scores are the
issue, you can normalize them, or use a rank-based method such as the sum of
reciprocal ranks, Borda count, or even a simple count. If the documents are
not of the same type, then you can try round robin.
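
To make two of these options concrete, here is a minimal sketch in plain Java
(no Lucene classes involved). The class name MetaSearchFusion, the method
names, and the use of string document IDs shared across collections are all
assumptions made for illustration. The first method applies min-max
normalization to one collection's raw scores; the second fuses several ranked
lists by sum of reciprocal ranks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of Lucene.
final class MetaSearchFusion {

    /** Min-max normalization: rescales one collection's raw scores into [0, 1]. */
    static Map<String, Double> minMaxNormalize(Map<String, Double> rawScores) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double s : rawScores.values()) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        Map<String, Double> normalized = new HashMap<>();
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            // Assumption: if all scores are equal, every document gets 1.0.
            double value = (max > min) ? (e.getValue() - min) / (max - min) : 1.0;
            normalized.put(e.getKey(), value);
        }
        return normalized;
    }

    /** Sum of reciprocal ranks: each list adds 1/rank for every document it returns. */
    static List<String> fuseByReciprocalRank(List<List<String>> rankedLists) {
        // LinkedHashMap keeps first-seen order, so ties are broken deterministically.
        Map<String, Double> fused = new LinkedHashMap<>();
        for (List<String> ranking : rankedLists) {
            for (int i = 0; i < ranking.size(); i++) {
                fused.merge(ranking.get(i), 1.0 / (i + 1), Double::sum);
            }
        }
        List<String> ids = new ArrayList<>(fused.keySet());
        // Stable sort, highest fused score first.
        ids.sort(Comparator.comparingDouble((String id) -> fused.get(id)).reversed());
        return ids;
    }
}
```

Note that if the collections share no documents, the reciprocal rank sums of
documents at the same rank tie, and the merged list degenerates into
something close to round robin.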

My concern is not combining the search results, but rather keeping the
relevant documents at the top of the merged result.

I have a master's degree in information retrieval, where I studied meta
search and distributed search for almost three years. However, the simple
workarounds suggested above will probably do the job.




