Message-ID: <865c77680609261944h335583acy6b6900ecb898e8eb@mail.gmail.com>
Date: Wed, 27 Sep 2006 08:14:48 +0530
From: Mek
To: java-user@lucene.apache.org
Subject: Re: Very high fieldNorm for a field resulting in bad results
References: <865c77680609250327j265039aeo43b8d771e405a5dc@mail.gmail.com>

Thanks a lot, Chris, for the detailed and patient response.

> The value of the field norm for any field named "A" is typically the
> lengthNorm of the field, times the document boost, times the field boost
> for *each* Field instance added to the document with the name "A".
> (lengthNorm is by default 1/sqrt(num of terms))

That explains the very high value for the fieldNorm: the boost effectively became boost_value^(number of values in the field).

A couple more questions:

1. Can I do away with index-time boosting for fields and tweak query-time boosting for them instead? I understand that document-level boosting is very useful while indexing. But for fields, both the index-time boost and the query-time boost are multipliers in the score, so would it be safe to say that I can replace the index-time boost with query-time boosting? This would give me a lot of freedom to test different values without re-indexing, which takes me about 6 hours.

2. 
When searching through the archive, I read a post of yours saying that it is possible to give exact matches a much higher weight by indexing START and END tokens, from http://www.nabble.com/What-are-norms--tf1919250.html#a5335856 :

"it is possible to score exact matches on (tokenized) fields very high without using lengthNorm by indexing START and END tokens for the field as well, and then including them in your sloppy phrase queries -- the "tighter" match will score highest."

Could you please elaborate on this?

Thanks a ton for the response,
mekin
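
As a rough illustration of the fieldNorm arithmetic quoted above, the following minimal Java sketch multiplies the default lengthNorm (1/sqrt(number of terms)) by the document boost and by the field boost of each Field instance. It is not Lucene code: the class FieldNormSketch and its methods are hypothetical names used only for this example, and it ignores anything Lucene does when storing the norm in the index.

```java
/**
 * Back-of-the-envelope model of the quoted fieldNorm formula.
 * NOT Lucene code: the class and method names are hypothetical.
 */
final class FieldNormSketch {

    /** Default length normalization as quoted: 1 / sqrt(number of terms). */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /**
     * lengthNorm * document boost * the field boost of *each* Field
     * instance added under the same field name.
     */
    static double fieldNorm(int numTerms, double docBoost, double[] fieldBoosts) {
        double norm = lengthNorm(numTerms) * docBoost;
        for (double boost : fieldBoosts) {
            norm *= boost; // one factor per Field instance, so boosts compound
        }
        return norm;
    }

    public static void main(String[] args) {
        // Assumed scenario: four single-term values, each added with boost 2.0.
        double[] boosts = {2.0, 2.0, 2.0, 2.0};
        System.out.println(fieldNorm(4, 1.0, boosts)); // prints 8.0
    }
}
```

In this assumed scenario the boosts contribute 2^4 = 16 while the lengthNorm contributes only 0.5, which matches the boost_value^(number of values) growth described in the message.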