lucene-java-user mailing list archives

From "Eugene Ezekiel" <echot...@gmail.com>
Subject Re: Help interpreting explanation
Date Fri, 03 Mar 2006 02:36:08 GMT
Thanks, Yonik, for the reply. I have just a couple more questions:

1) Why does the explanation print so many times?

2) Since my query is made up of multiple terms, how do I know which term "x"
is referring to?




On 3/3/06, Yonik Seeley <yseeley@gmail.com> wrote:
>
> I think Lucene in Action does a good job of it.
> There is also a formula given in the javadoc for DefaultSimilarity
>
> http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
>
> See my comments below (inline)
>
> On 3/2/06, Eugene <echothis@gmail.com> wrote:
> > Hi All,
> >
> > I'm not sure how to interpret the result of the toString method of
> > Explanation.  I'm trying to see the values of each component of the
> > Default Similarity formula for a particular query and a doc.  Given
> > below is a sample of my Explanation output. Many thanks if anyone could
> > help explain some of the values or direct me to a place that does so.
> >
> > Explanation = 0.683103 = product of:
> >    1.7077575 = sum of:
> >      0.184242 = weight(Contents:x in 78), product of:
> >        0.13565542 = queryWeight(Contents:x), product of:
> the queryWeight is query-specific... it will have the same value
> for all documents matching the query.
> >          2.509232 = idf(docFreq=85)
> inverse document frequency... term "x" appears in 85 documents.
> >          0.054062527 = queryNorm
> queryNorm is a normalization factor... 1/sqrt(sum of the squared query
> weights)
>
> If you had a boost, it would also be multiplied into the queryWeight
> at this point.
> >        1.3581617 = fieldWeight(Contents:x in 78), product of:
> fieldWeight components are document specific.
> >          1.7320508 = tf(termFreq(Contents:x)=3)
> "x" appears 3 times in the field for this document
> >          2.509232 = idf(docFreq=85)
> same as the previous idf factor - 85 documents contain "x"
> >          0.3125 = fieldNorm(field=Contents, doc=78)
> the norm is calculated at index time... it's the length normalization
> factor (1/sqrt(num tokens in this field)) multiplied by any boost on the
> field or document.
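
As a rough check on the node above, the factors can be multiplied back together by hand. The sketch below is plain Java with no Lucene dependency; the class and method names are made up for illustration and are not part of the Lucene API. The input values are copied from the Explanation output in this thread.

```java
// Hypothetical helper, not a Lucene class: recomputes one
// weight(field:term in doc) node from its leaf factors.
final class ExplainArithmetic {

    // queryWeight = idf * queryNorm (a query boost, if any, would also be multiplied in here)
    static double queryWeight(double idf, double queryNorm) {
        return idf * queryNorm;
    }

    // fieldWeight = tf * idf * fieldNorm
    static double fieldWeight(double tf, double idf, double fieldNorm) {
        return tf * idf * fieldNorm;
    }

    // weight = queryWeight * fieldWeight
    static double termWeight(double tf, double idf, double queryNorm, double fieldNorm) {
        return queryWeight(idf, queryNorm) * fieldWeight(tf, idf, fieldNorm);
    }

    public static void main(String[] args) {
        double queryNorm = 0.054062527; // same for every term in the query
        double fieldNorm = 0.3125;      // stored at index time for doc 78
        double tf = 1.7320508;          // tf for termFreq=3

        // Contents:x, idf(docFreq=85)
        System.out.println("x: " + termWeight(tf, 2.509232, queryNorm, fieldNorm));
        // Contents:y, idf(docFreq=52)
        System.out.println("y: " + termWeight(tf, 2.9932873, queryNorm, fieldNorm));
    }
}
```

The printed values should agree with the 0.184242 and 0.26218253 lines in the output to within float rounding, since Lucene itself computes these in single precision.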
>
> >      0.184242 = weight(Contents:x in 78), product of:
> >        0.13565542 = queryWeight(Contents:x), product of:
> >          2.509232 = idf(docFreq=85)
> >          0.054062527 = queryNorm
> >        1.3581617 = fieldWeight(Contents:x in 78), product of:
> >          1.7320508 = tf(termFreq(Contents:x)=3)
> >          2.509232 = idf(docFreq=85)
> >          0.3125 = fieldNorm(field=Contents, doc=78)
> >      0.26218253 = weight(Contents:y in 78), product of:
> >        0.16182467 = queryWeight(Contents:y), product of:
> >          2.9932873 = idf(docFreq=52)
> >          0.054062527 = queryNorm
> >        1.6201642 = fieldWeight(Contents:y in 78), product of:
> >          1.7320508 = tf(termFreq(Contents:y)=3)
> >          2.9932873 = idf(docFreq=52)
> >          0.3125 = fieldNorm(field=Contents, doc=78)
>
>
> -Yonik
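
The tf and idf leaves can be reproduced as well. This is a minimal sketch, assuming the DefaultSimilarity formulas of that release: tf = sqrt(termFreq) and idf = ln(numDocs / (docFreq + 1)) + 1. The index size is not stated anywhere in this thread; 389 documents is only a value inferred from the two idf figures, so treat it as an assumption. The class name is hypothetical.

```java
// Hypothetical helper, not a Lucene class: the tf and idf formulas
// as DefaultSimilarity is assumed to implement them.
final class ClassicTfIdf {

    // Square root of the raw term frequency in the field.
    static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // Natural log of numDocs / (docFreq + 1), plus one.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        int assumedNumDocs = 389; // inferred from the output, not given in the thread

        System.out.println("tf(termFreq=3)  = " + tf(3));
        System.out.println("idf(docFreq=85) = " + idf(85, assumedNumDocs));
        System.out.println("idf(docFreq=52) = " + idf(52, assumedNumDocs));
    }
}
```

If the assumption holds, the three printed values line up with the 1.7320508, 2.509232 and 2.9932873 leaves in the output. The fieldNorm cannot be recomputed this way, because the norm is stored in a single byte and loses precision when encoded.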
>


--
Regards,
Eugene
