lucene-solr-user mailing list archives

From <jimi.hulleg...@svensktnaringsliv.se>
Subject RE: Why is multiplicative boost prefered over additive?
Date Fri, 18 Mar 2016 14:57:29 GMT
On Friday, March 18, 2016 2:19 PM, apache@elyograg.org wrote:
> 
> The "max score" of a particular query can vary widely, and only has
> meaning within the context of that query.
> One query on an index might produce a max score of 0.944, so *every*
> document has a score less than one, while another query *on the same
> index* (that might even have some of the same result documents) might
> produce a max score of 12.7, so the top docs have a score *much* higher
> than one.
>
> If your additive boost is 5, this represents a relative boost of over
> 500 percent for the top docs of the first query I talked about above,
> but less than 50% for the top docs of the second.
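
The arithmetic behind the quoted example can be checked with a short Python sketch. The function name `relative_boost_percent` is only illustrative; the two max scores and the boost of 5 are the figures from the quote.

```python
def relative_boost_percent(max_score, additive_boost):
    """Express an additive boost as a percentage of the top raw score."""
    return 100.0 * additive_boost / max_score


# Max scores of the two hypothetical queries from the quoted message.
for max_score in (0.944, 12.7):
    pct = relative_boost_percent(max_score, 5)
    print(f"max score {max_score}: a boost of 5 is {pct:.1f}% of the top score")
```

The same constant dominates the first query's ranking and is only a modest nudge in the second, which is the core objection to additive boosts.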

Thanks Shawn. I think I understand. I guess I was stuck in the mindset of having all original
scores within a defined interval. 

Although I still don't fully understand why Solr can't normalize the score so that it is
always between, say, 0.0 and 100.0. Surely Solr knows what the maximum "raw score" is.

Sure, I have read the page "Scores As Percentages", but the main argument there against a
normalized score seems to be that it still doesn't make different queries truly "comparable",
but that's not what I'm after anyway. I would only use the normalized score in my own boost
calculation, nothing else.
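
As a rough illustration of the normalization described above, the following Python sketch rescales the raw scores of a single result set so that the top document gets 100.0. It assumes the raw scores are already available on the client side; `normalize_scores` is a hypothetical helper, not a Solr feature.

```python
def normalize_scores(raw_scores, upper=100.0):
    """Rescale one query's raw scores so the highest becomes `upper`."""
    top = max(raw_scores, default=0.0)
    if top <= 0:
        # No results, or nothing scored above zero: nothing to scale against.
        return [0.0 for _ in raw_scores]
    return [upper * (score / top) for score in raw_scores]


# Raw scores from one hypothetical query whose max score is 0.944.
print(normalize_scores([0.944, 0.472, 0.236]))
```

The normalized values are still only meaningful within that one query, but an additive boost applied to them would have the same relative weight from query to query.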

But, anyway... Since the score(1+boost...) suggestion from Upayavira solves the problem with
weights, I guess I will start using multiplicative boosts now. :)
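
One possible reading of the score(1+boost...) idea, sketched in Python. The exact expression is not spelled out in this message, so the weighted-sum form, the factor names, and the weights below are all assumptions made for illustration; each factor is assumed to lie between 0 and 1.

```python
def boosted_score(score, factors, weights):
    """Multiply the raw score by (1 + weighted sum of boost factors)."""
    weighted = sum(weights[name] * value for name, value in factors.items())
    return score * (1.0 + weighted)


# Hypothetical weights and per-document factor values.
weights = {"recency": 0.5, "popularity": 0.3}
factors = {"recency": 1.0, "popularity": 0.5}

# The relative lift is the same whether the raw score is small or large.
for raw in (0.944, 12.7):
    print(raw, boosted_score(raw, factors, weights))
```

Because the boost scales with the raw score, the weights keep the same relative meaning regardless of the query's max score.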

But it would be nice to see how other people handle weighted boosts. In general, I find
it a bit hard to find concrete examples of queries that combine multiple boost factors
(like date recency, popularity, document type, etc.). Most documentation seems to focus on
*one* factor only, like "this is how you sort/score based on popularity" or "this is how you
get more recent documents first".

/Jimi