lucene-dev mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject question about TermQuery
Date Sun, 07 Oct 2001 21:18:54 GMT
I'm looking through the TermQuery code (and generally trying to 
understand exactly how searching works) and I found some code that 
looks suspicious to me. It is very likely that I just don't understand 
what's going on, but there is a chance that this is a bug, so I wanted 
to ask Doug and others for clarification / review.

In TermQuery.normalize(float norm), weight is multiplied first by the 
normalization factor (the argument) and then by the idf that was stored 
in the TermQuery earlier. I can't say for sure that this is wrong, but 
it looks suspect for two reasons. First, idf is already factored into 
weight in sumOfSquaredWeights(). Second, if normalize() is called more 
than once, idf is multiplied into weight again on every call. The 
comment in normalize() also doesn't really make sense to me, and the 
way the code is written makes me think this was caused by a CVS merge 
conflict, and that only the line "weight *= norm" should be in that 
method. Am I right?

======================================================
  final float sumOfSquaredWeights(Searcher searcher) throws IOException {
    idf = Similarity.idf(term, searcher);
    weight = idf * boost;
    return weight * weight;              // square term weights
  }

  final void normalize(float norm) {
    weight *= norm;                  // normalize for query
    weight *= idf;                  // factor from document
  }
======================================================
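
To make the second concern concrete, here is a minimal standalone sketch 
that copies only the arithmetic of the two methods above. It is not the 
real TermQuery: the class name TermWeightSketch and the assumedIdf 
parameter (standing in for Similarity.idf(term, searcher)) are 
hypothetical, and the idf and norm values are made up for illustration.

```java
// Hypothetical stand-in for TermQuery; only the weight arithmetic is copied.
public class TermWeightSketch {
    float idf;
    float boost = 1.0f;
    float weight;

    // assumedIdf replaces the Similarity.idf(term, searcher) lookup (assumption).
    float sumOfSquaredWeights(float assumedIdf) {
        idf = assumedIdf;
        weight = idf * boost;
        return weight * weight;          // square term weights
    }

    // Same two statements as the normalize() quoted above.
    void normalize(float norm) {
        weight *= norm;
        weight *= idf;
    }

    public static void main(String[] args) {
        TermWeightSketch q = new TermWeightSketch();
        q.sumOfSquaredWeights(3.0f);     // weight = 3.0
        q.normalize(0.5f);
        System.out.println(q.weight);    // 3.0 * 0.5 * 3.0 = 4.5
        q.normalize(0.5f);
        System.out.println(q.weight);    // 4.5 * 0.5 * 3.0 = 6.75
    }
}
```

With these made-up values, each normalize() call multiplies weight by 
norm * idf, so a second call does not produce the same result as the 
first. Whether normalize() is ever actually called more than once on the 
same query is something the sketch does not show.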
