lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yonik Seeley <ysee...@gmail.com>
Subject term scoring (idf) question
Date Thu, 15 Sep 2005 20:27:18 GMT
I'm trying to figure out why idf is multiplied twice into the score of a 
term query.
It sort of makes sense if you have just one term... the original weight is 
idf*boost, and
the normalization factor is 1/(idf*boost), so you multiply in the idf again 
if you want the final score to contain an idf factor (instead of being 
normalized to 1.0).

But when you have a boolean query of two term queries, the amount of score 
each term contributes will vary with the square of their idfs, not their 
idfs. Is this what's intended? If so, the scoring formula one normally sees 
(tf*idf*boost*lengthNorm), doesn't seem to convey this fact.

-Yonik
Now hiring -- http://tinyurl.com/7m67g

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message