lucene-java-user mailing list archives

From "Matthew W. Bilotti" <mbilo...@csail.mit.edu>
Subject Explanation of Scoring
Date Mon, 10 May 2004 13:38:28 GMT

Hello,

I'm using Lucene 1.4 RC2, and I'm having trouble understanding how the 
scoring relates to document rank.  The work I am doing is going to depend 
very much on knowing exactly how the scoring algorithm works and printing 
out the score components so that I can analyze contributions of 
individual terms by hand.  If anyone can help me do this, I would be truly 
grateful.

Particularly, I would like help understanding the following strange 
phenomenon:

I executed a query and the first document returned had a score of 0.592.
The explanation string read "0.0 = match required".  Can anyone tell me 
what this means?

The next 39 documents retrieved had steadily decreasing scores, all with 
the same explanation string.

The 40th document retrieved, though, had a score of 1.0 and the 
explanation string:

0.0 = fieldWeight(contents:invented in 0), product of:
  0.0 = tf(termFreq(contents:invented)=0)
  6.507968 = idf(docFreq=4189)
  0.0390625 = fieldNorm(field=contents, doc=0)
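
For reference, the explanation above reads as a plain product of its three 
listed factors. A minimal sketch of that arithmetic follows, using only the 
numbers printed in the explanation. The class and method names are 
hypothetical and are not part of the Lucene API, and the sketch does not 
reproduce how Lucene derives tf, idf, or fieldNorm internally.

```java
public class FieldWeightSketch {

    // Hypothetical helper: multiplies the three factors that the
    // explanation string lists under "product of:".
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        // Values copied from the explanation string in this message.
        float tf = 0.0f;              // tf(termFreq(contents:invented)=0)
        float idf = 6.507968f;        // idf(docFreq=4189)
        float fieldNorm = 0.0390625f; // fieldNorm(field=contents, doc=0)

        // With tf = 0 the whole product is 0, matching the first line
        // of the explanation.
        System.out.println(fieldWeight(tf, idf, fieldNorm) + " = fieldWeight");
    }
}
```

Because tf is 0, the product is 0 regardless of the other two factors, 
which is why the explanation total cannot account for the reported score 
of 1.0.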

If anyone can help me decipher the explanation strings, or help me to
understand why a document with score 1.0 is ranked lower than a document
with score 0.592, or show me how to print out the individual components of 
the score as I am retrieving the documents, please reply.  My work very 
much depends on your help.

Regards,

Matthew Bilotti



