lucene-java-user mailing list archives

From Toke Eskildsen ...@statsbiblioteket.dk>
Subject Re: Federated relevance ranking
Date Mon, 06 Jun 2011 09:05:33 GMT
On Thu, 2011-06-02 at 21:51 +0200, Clint Gilbert wrote:
> We're also considering a home-grown scheme involving normalizing the
> denominators of all the index components in all our indices, based on
> the sums of counts obtained from all the indices.  This feels like
> re-inventing the wheel, and it's not clear to me yet that the low-level
> manipulation of indices that we'd need to do is even possible.

We're currently experimenting with this approach, albeit only for two
searchers. Since we have very little control over the secondary searcher
(just a basic search API), we're really hacking: we perform a query
rewrite based on term statistics. This only works for basic term queries
(no wildcards, ranges etc.), but fortunately our search logs show that
these are by far the most common.

The math is not too bad: extract occurrence counts for the terms from
each searcher, sum them, calculate the difference between the summed and
the local counts when sending a request to a specific searcher, and set a
term boost in the textual query so that the standard ranking formula in
Lucene will yield the desired score.
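
As a rough illustration of the steps above, here is a minimal sketch in
Python. It is not our actual code. It assumes the classic Lucene idf
formula, 1 + ln(numDocs / (docFreq + 1)), assumes that idf enters the
score squared, and ignores queryNorm and coord entirely. The dictionary
layout and the function names (merge_stats, term_boost, rewrite_query)
are made up for the example, as are the counts.

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene TF-IDF inverse document frequency (assumed here).
    return 1.0 + math.log(num_docs / (doc_freq + 1))

def merge_stats(searchers):
    # Sum document counts and per-term document frequencies over all searchers.
    merged = {"num_docs": 0, "doc_freq": {}}
    for stats in searchers:
        merged["num_docs"] += stats["num_docs"]
        for term, df in stats["doc_freq"].items():
            merged["doc_freq"][term] = merged["doc_freq"].get(term, 0) + df
    return merged

def term_boost(term, local, merged):
    # Boost that turns the searcher's local idf^2 into the merged idf^2.
    local_idf = idf(local["doc_freq"].get(term, 0), local["num_docs"])
    global_idf = idf(merged["doc_freq"].get(term, 0), merged["num_docs"])
    return (global_idf / local_idf) ** 2

def rewrite_query(terms, local, merged):
    # Emit a plain term query using Lucene's term^boost query syntax.
    return " ".join(f"{t}^{term_boost(t, local, merged):.4f}" for t in terms)

# Hypothetical statistics for two searchers.
primary = {"num_docs": 1000, "doc_freq": {"lucene": 10, "search": 400}}
secondary = {"num_docs": 9000, "doc_freq": {"lucene": 900, "search": 500}}
merged = merge_stats([primary, secondary])

# Query text to send to the primary searcher.
print(rewrite_query(["lucene", "search"], primary, merged))
```

A term that is rare locally but common overall gets a boost below 1, and
the reverse gets a boost above 1. Note that for a single-term query the
boost is likely to be cancelled by query normalization, so the effect
shows mainly in the relative weights of multi-term queries.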

