lucene-dev mailing list archives

From "Matthew W. Bilotti"
Subject Help with scoring, coordination factor?
Date Fri, 30 Apr 2004 22:15:59 GMT

> In my case it works perfectly. As we generate multilingual and semantic
> expansions of the original words of a query, the coordination factor was
> giving lower score to words with a lot of semantic or morphologic 
> variants.

For me, this has not worked.  I have defined a WordQuery class and used it 
to define my disjunctions, but I am finding that the documents I am 
interested in are still suffering rank penalties.
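For reference, the rank penalty in question can be sketched in isolation. This assumes the stock DefaultSimilarity, where coord is the number of matching clauses divided by the total number of clauses; the class name and the clause counts below are hypothetical, chosen only to show how expanding a query with variants can shrink the factor.

```java
// Sketch only: mirrors the formula coord = overlap / maxOverlap
// (assumed here to be what DefaultSimilarity uses).
// The class name and the sample counts are hypothetical.
class CoordSketch {

    // overlap    = number of query clauses the document matches
    // maxOverlap = total number of clauses in the query
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // A 4-term query of which the document matches 2 terms.
        System.out.println("original query: " + coord(2, 4));

        // The same query with every variant added as its own clause
        // (12 clauses in total); the document still matches only 2.
        System.out.println("expanded query: " + coord(2, 12));
    }
}
```

Grouping the variants of one word into a single disjunction clause is meant to keep maxOverlap at the number of original words, so the second case would not arise.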

I wanted to try to understand how the scoring was working internally, so 
for each document in my Hits, I printed the score and an Explanation,
when querying on the original forms of each word only (no WordQueries).

The first document returned had a score of 0.592 and an explanation of 
"0.0 = match required".  Can anyone tell me what this means?  The next 39 
documents retrieved have the same explanation, and steadily decreasing 
scores, which makes sense.  The 40th document retrieved, though, has a 
score of 1.0 and the explanation:

0.0 = fieldWeight(contents:invented in 0), product of:
  0.0 = tf(termFreq(contents:invented)=0)
  6.507968 = idf(docFreq=4189)
  0.0390625 = fieldNorm(field=contents, doc=0)

Can anyone help me understand why a document with score 1.0 is retrieved 
directly after a document with score 0.211?  I don't understand the 
explanation.  Why is the term frequency of "invented" 0?  It should be 3; 
I checked the document.  I tried to delve into the code to find out how to 
print all of the components of the score to the screen (especially coord, 
which I am interested in), but I couldn't figure out how to do it.
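To make the arithmetic in the explanation concrete, here is a minimal sketch of the fieldWeight product, assuming DefaultSimilarity's tf(freq) = sqrt(freq). The idf and fieldNorm values are copied from the explanation above; the class and method names are hypothetical.

```java
// Sketch only: recomputes fieldWeight = tf * idf * fieldNorm
// under the assumption that tf(freq) = sqrt(freq).
// Class and method names are hypothetical.
class FieldWeightSketch {

    // Assumed term-frequency factor: square root of the raw count.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Product of the three components listed in the explanation.
    static float fieldWeight(float freq, float idf, float fieldNorm) {
        return tf(freq) * idf * fieldNorm;
    }

    public static void main(String[] args) {
        float idf = 6.507968f;        // from the explanation
        float fieldNorm = 0.0390625f; // from the explanation

        // termFreq = 0, as reported: the whole product collapses to 0.
        System.out.println("termFreq=0: " + fieldWeight(0f, idf, fieldNorm));

        // termFreq = 3, as counted by hand in the document.
        System.out.println("termFreq=3: " + fieldWeight(3f, idf, fieldNorm));
    }
}
```

With a term frequency of 3 the product comes to roughly 0.44 instead of 0.0, so the zero in the explanation is entirely due to the reported termFreq of 0.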

Any help or hints you can give me would be truly appreciated.

~ Matthew

matthew w. bilotti
computer science and artificial intelligence laboratory
massachusetts institute of technology
