Message-ID: <3BC0C6BE.6020705@earthlink.net>
Date: Sun, 07 Oct 2001 15:18:54 -0600
From: Dmitry Serebrennikov
To: lucene-dev@jakarta.apache.org
Subject: question about TermQuery

I'm looking through the TermQuery code (and generally trying to understand exactly how the searching works), and I found some code that looks suspicious to me. It is very likely that I just don't understand what's going on, but there is a chance that this is a bug, so I wanted to ask for clarification / review from Doug and others.

In TermQuery.normalize(float norm), weight is multiplied first by the normalization factor (the argument) and then by the idf that was stored in the TermQuery earlier. Although I can't say for sure that this is wrong, it does look suspect. First, idf is already factored into weight in the sumOfSquaredWeights() method. Second, if normalize is called multiple times, idf will be multiplied into weight over and over...

Plus, the comment in normalize doesn't really make sense, and the way the code is written makes me think that this is a problem caused by a CVS merge conflict, and that only the line "weight *= norm" should be in that method. Am I right?
======================================================
final float sumOfSquaredWeights(Searcher searcher) throws IOException {
    idf = Similarity.idf(term, searcher);
    weight = idf * boost;
    return weight * weight;    // square term weights
}

final void normalize(float norm) {
    weight *= norm;    // normalize for query
    weight *= idf;     // factor from document
}
======================================================
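
To make the second concern concrete, here is a minimal standalone sketch of the arithmetic. It is not the Lucene class: TermWeightSketch is a hypothetical stand-in, idf and boost are passed in directly instead of coming from Similarity.idf(term, searcher), and the values 3.0 and 0.5 are made up. It also assumes normalize() could be called more than once on the same query, which I have not verified actually happens.

```java
// Hypothetical stand-in that copies only the weighting arithmetic quoted above.
class TermWeightSketch {
    float idf;     // assumed to be given; the real code calls Similarity.idf(term, searcher)
    float boost;
    float weight;

    TermWeightSketch(float idf, float boost) {
        this.idf = idf;
        this.boost = boost;
    }

    float sumOfSquaredWeights() {
        weight = idf * boost;       // idf is factored into weight here
        return weight * weight;     // square term weights
    }

    void normalize(float norm) {
        weight *= norm;             // normalize for query
        weight *= idf;              // idf is multiplied in again on every call
    }

    public static void main(String[] args) {
        TermWeightSketch q = new TermWeightSketch(3.0f, 1.0f);
        q.sumOfSquaredWeights();        // weight = 3.0
        q.normalize(0.5f);              // weight = 3.0 * 0.5 * 3.0
        System.out.println(q.weight);   // prints 4.5
        q.normalize(0.5f);              // weight = 4.5 * 0.5 * 3.0
        System.out.println(q.weight);   // prints 6.75
    }
}
```

With only "weight *= norm" in normalize, the same two calls would give 1.5 and then 0.75, so each extra call would shrink the weight by norm instead of also scaling it by idf.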