lucene-java-user mailing list archives

From Doug Cutting <cutt...@lucene.com>
Subject Re: Lucene's Ranking Function
Date Wed, 11 Sep 2002 21:21:24 GMT
Clemens Marschner wrote:
>  score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
> * coord_q_d
> 
> One last thing I wondered about: Is idf_t really going into that equation
> twice?

Yes.  I think that's normal with tf/idf vector-space ranking methods: idf_t
enters once through the query term's weight and once through the document
term's weight.
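
As a rough illustration of where the two idf_t factors come from, here is a
minimal sketch of a single term's contribution to the sum, assuming plain
tf*idf weights on both the query side and the document side.  The class and
method names are made up for this example and are not Lucene code.

```java
// Hypothetical illustration only; none of these names exist in Lucene.
class IdfTwiceSketch {

    // Weight of term t in the query vector: tf_q * idf_t / norm_q
    static double queryWeight(double tfQ, double idf, double normQ) {
        return tfQ * idf / normQ;
    }

    // Weight of term t in the document vector: tf_d * idf_t / norm_d_t
    static double docWeight(double tfD, double idf, double normD) {
        return tfD * idf / normD;
    }

    // One term of sum_t(...): the product of both weights, times boost_t.
    // idf appears in both factors, so it is effectively squared.
    static double termScore(double tfQ, double idf, double normQ,
                            double tfD, double normD, double boost) {
        return queryWeight(tfQ, idf, normQ) * docWeight(tfD, idf, normD) * boost;
    }

    public static void main(String[] args) {
        double base = termScore(1, 2, 1, 3, 1, 1);
        double doubledIdf = termScore(1, 4, 1, 3, 1, 1);
        // Doubling idf quadruples the term's contribution.
        System.out.println(doubledIdf / base);
    }
}
```

The coord_q_d factor from the quoted formula is left out here because it
applies to the whole sum rather than to an individual term.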

> From what I see, idf_t/norm_q is completely left out, isn't it?

No.  It is computed once at the beginning of query processing.  See, for 
example, TermQuery.sumOfSquaredWeights() and TermQuery.normalize().  The 
former is called by the search code to compute norm_q and the latter is 
passed norm_q once it has been computed so that the clause's scores may 
be normalized.
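
The two passes can be sketched roughly as follows.  This is not the Lucene
source: the Clause class and the prepare() method are hypothetical, tf_q is
taken to be 1, boosts are ignored, and norm_q is assumed to be the square
root of the summed squared idf weights.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the two-pass scheme; not the Lucene source.
class QueryNormSketch {

    static class Clause {
        final double idf;
        double weight; // normalized query weight, set in pass 2

        Clause(double idf) {
            this.idf = idf;
        }

        // Pass 1: this clause's contribution to the squared query norm.
        double sumOfSquaredWeights() {
            return idf * idf;
        }

        // Pass 2: receive norm_q and store idf_t / norm_q for later scoring.
        void normalize(double normQ) {
            weight = idf / normQ;
        }
    }

    // Runs once at the beginning of query processing, before any document
    // is scored.  Returns norm_q.
    static double prepare(List<Clause> clauses) {
        double sum = 0.0;
        for (Clause c : clauses) {
            sum += c.sumOfSquaredWeights();
        }
        double normQ = Math.sqrt(sum);
        for (Clause c : clauses) {
            c.normalize(normQ);
        }
        return normQ;
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause(3.0));
        clauses.add(new Clause(4.0));
        System.out.println("norm_q = " + prepare(clauses));
        for (Clause c : clauses) {
            System.out.println("idf_t / norm_q = " + c.weight);
        }
    }
}
```

Because the normalized weight is stored on each clause up front, the
per-document scoring loop only has to multiply it by the document-side
factors.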

> tf_q is applied although it is never calculated - if a term occurs more
> twice in the query (very unlikely, though) the whole sum is calculated
> twice. And for each term, the equation tf_d * idf_t / norm_d_t * boost_d *
> boost_f * boost_t is calculated.

You're right, tf_q is not in fact calculated.

Hope this helps.

Doug



