Date: Fri, 3 Mar 2006 10:36:08 +0800
From: "Eugene Ezekiel"
To: java-user@lucene.apache.org
Subject: Re: Help interpreting explanation

Thanks, Yonik, for the reply. I have a couple more questions:

1) Why does the explanation print so many times?
2) Since my query is made up of multiple terms, how do I know which term "x" is referring to?

On 3/3/06, Yonik Seeley wrote:
>
> I think Lucene in Action does a good job of it.
> There is also a formula given in the javadoc for DefaultSimilarity
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> See my comments below (inline)
>
> On 3/2/06, Eugene wrote:
> > Hi All,
> >
> > I'm not sure how to interpret the result of the toString method of
> > Explanation. I'm trying to see the values of each component of the
> > Default Similarity formula for a particular query and a doc. Given
> > below is a sample of my Explanation output. Many thanks if anyone could
> > help explain some of the values or direct me to a place that does so.
> >
> > Explanation = 0.683103 = product of:
> >   1.7077575 = sum of:
> >     0.184242 = weight(Contents:x in 78), product of:
> >       0.13565542 = queryWeight(Contents:x), product of:
> the queryWeight is query-specific... it will have the same value
> for all documents matching the query.
> >         2.509232 = idf(docFreq=85)
> inverse document frequency... term "x" appears in 85 documents.
> >         0.054062527 = queryNorm
> queryNorm is a normalization factor... 1/sqrt(sum of all query weights
> squared)
>
> If you had a boost, it would also be multiplied into the queryWeight
> at this point.
> >       1.3581617 = fieldWeight(Contents:x in 78), product of:
> fieldWeight components are document specific.
> >         1.7320508 = tf(termFreq(Contents:x)=3)
> "x" appears 3 times in the field for this document
> >         2.509232 = idf(docFreq=85)
> same as the previous idf factor - 85 documents contain "x"
> >         0.3125 = fieldNorm(field=Contents, doc=78)
> the norm is calculated at index time... it's the length normalization
> factor (1/sqrt(num tokens in this field)) multiplied by any boost on the
> field or document.
>
> >     0.184242 = weight(Contents:x in 78), product of:
> >       0.13565542 = queryWeight(Contents:x), product of:
> >         2.509232 = idf(docFreq=85)
> >         0.054062527 = queryNorm
> >       1.3581617 = fieldWeight(Contents:x in 78), product of:
> >         1.7320508 = tf(termFreq(Contents:x)=3)
> >         2.509232 = idf(docFreq=85)
> >         0.3125 = fieldNorm(field=Contents, doc=78)
> >     0.26218253 = weight(Contents:y in 78), product of:
> >       0.16182467 = queryWeight(Contents:y), product of:
> >         2.9932873 = idf(docFreq=52)
> >         0.054062527 = queryNorm
> >       1.6201642 = fieldWeight(Contents:y in 78), product of:
> >         1.7320508 = tf(termFreq(Contents:y)=3)
> >         2.9932873 = idf(docFreq=52)
> >         0.3125 = fieldNorm(field=Contents, doc=78)
>
> -Yonik

--
Regards,
Eugene
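
For anyone reading the explanation output above, the "product of" lines can be checked by hand. The following is only a rough sketch in plain Java with no Lucene dependency. It assumes tf = sqrt(termFreq), which matches the 1.7320508 printed for termFreq=3, and takes the idf, queryNorm and fieldNorm values straight from the output rather than deriving them. The class and method names are made up for illustration and are not part of the Lucene API.

```java
// Hypothetical helper, not part of Lucene: re-computes the per-term
// weights from the factors printed in the explanation output.
class ExplainCheck {

    // queryWeight = idf * queryNorm (a query boost, if any, would also multiply in here)
    static double queryWeight(double idf, double queryNorm) {
        return idf * queryNorm;
    }

    // fieldWeight = tf * idf * fieldNorm, assuming tf = sqrt(termFreq)
    static double fieldWeight(int termFreq, double idf, double fieldNorm) {
        return Math.sqrt(termFreq) * idf * fieldNorm;
    }

    // weight(field:term in doc) = queryWeight * fieldWeight
    static double termWeight(int termFreq, double idf, double queryNorm, double fieldNorm) {
        return queryWeight(idf, queryNorm) * fieldWeight(termFreq, idf, fieldNorm);
    }

    public static void main(String[] args) {
        double queryNorm = 0.054062527; // copied from the output
        double fieldNorm = 0.3125;      // copied from the output (doc 78, field Contents)

        // Contents:x has idf 2.509232, Contents:y has idf 2.9932873; both have termFreq 3
        System.out.println("weight(Contents:x) ~ " + termWeight(3, 2.509232, queryNorm, fieldNorm));
        System.out.println("weight(Contents:y) ~ " + termWeight(3, 2.9932873, queryNorm, fieldNorm));
    }
}
```

The two printed values should agree with the 0.184242 and 0.26218253 lines to within rounding. The top-level 0.683103 equals 1.7077575 times 0.4; that second factor is not visible in the quoted output, so reading it as a coord factor is only a guess.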