lucene-java-user mailing list archives

From Paul Elschot
Subject Re: Lopsided scores for each term in BooleanQuery
Date Mon, 18 Sep 2006 22:28:14 GMT
On Monday 18 September 2006 23:08, Andy Liu wrote:
> For multi-word queries, I would like to reward documents that contain a more
> even distribution of each word and penalize documents that have a skewed
> distribution.  For example, if my search query is:
> +content:fast +content:car
> I would prefer a document that contains each word an equal number of times
> over a document that contains the word "fast" 100 times and the word "car" 1
> time.  In other words, I would like to compare the scores of each
> BooleanQuery term and adjust the score according to the distribution.
> Can somebody point me in the right direction as to how I would implement
> this?

It's already there in the term frequency factor of the score, which is the
square root of the within-document term frequency:

(sqrt(1) + sqrt(1)) > (sqrt(0) + sqrt(2))
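
To make the inequality concrete, here is a minimal sketch that only sums the
square roots of per-term frequencies. It is not Lucene code: the class and
method names are hypothetical, and it ignores everything else that goes into
a real score (idf, length norms, coord, boosts).

```java
// Hypothetical illustration only: sums sqrt(freq) per query term,
// ignoring idf, norms, coord and boosts.
class TfBalanceSketch {

    // Sum of sqrt(freq) over the within-document frequencies of the query terms.
    static double sqrtTfSum(int... freqs) {
        double sum = 0.0;
        for (int freq : freqs) {
            sum += Math.sqrt(freq);
        }
        return sum;
    }

    public static void main(String[] args) {
        // The case from the message: two occurrences in total.
        System.out.println("1 + 1 occurrences: " + sqrtTfSum(1, 1));
        System.out.println("0 + 2 occurrences: " + sqrtTfSum(0, 2));

        // The case from the question: 101 occurrences in total.
        System.out.println("50 + 51 occurrences: " + sqrtTfSum(50, 51));
        System.out.println("100 + 1 occurrences: " + sqrtTfSum(100, 1));
    }
}
```

Because the square root grows more slowly as the frequency rises, a fixed
total number of occurrences contributes more when it is spread evenly over
the terms than when it is concentrated in one of them.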

Paul Elschot
