# lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: Search Score percentage, Should not be relative to the highest score
Date Mon, 03 Jan 2011 16:34:23 GMT
```
So, can we say that something that gives you the "how many query terms matched"
info would satisfy your requirement?

Query: term1 term2

Doc1: term1 term2   => n=2 => 100%
Doc2: term1 term2 term3 term4 => n=2 => 100%
Doc3: term1 term1 term3   => n=1 => 50%
Doc4: term2 term3 term4   => n=1 => 50%

If yes, Explanation will give you that info in its coord part. For example, coord(1/3) means
one query term matched out of a total of 3 query terms.

Here is an example Explanation:

0.013397463 = (MATCH) product of:
  0.040192388 = (MATCH) sum of:
    0.040192388 = (MATCH) weight(pagetext:para in 34930), product of:
      0.46250778 = queryWeight(pagetext:para), product of:
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.14552994 = queryNorm
      0.086901 = (MATCH) fieldWeight(pagetext:para in 34930), product of:
        1.0 = tf(termFreq(pagetext:para)=1)
        3.1780937 = idf(docFreq=5546, maxDocs=48977)
        0.02734375 = fieldNorm(field=pagetext, doc=34930)
  0.33333334 = coord(1/3)
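
As a rough check of how the numbers above fit together (assuming the default
Similarity, where coord is simply overlap / maxOverlap):

  coord      = 1 / 3                     = 0.33333334
  score      = 0.040192388 * 0.33333334 ~= 0.013397463
  percentage = 100 * coord              ~= 33%

So the coord line on its own gives a "matched terms" percentage that does not
depend on the highest-scoring document.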

> Subject: Re: Search Score percentage, Should not be relative to the highest score
> To: java-user@lucene.apache.org
> Date: Monday, January 3, 2011, 3:09 PM
>
> Consider the following.
>
> Query: term1 term2
> Doc1: term1 term2
> Doc2: term1 term2 term3 term4
> Doc3: term1 term1 term3
> Doc4: term3 term4
>
> For the above documents, Doc1 and Doc2 will be exact matches (as they
> contain all the terms in the search query). Doc3 is a partial match, as it
> contains term1 only (we neglect the term frequency, treating tf as always 1).
>
>
> The score percentage (calculated by Lucene in Hits.java, line 133) will be
>
> Doc1: 100%
> Doc2: 100%
> Doc3:  80%
>
> This is not a problem at all; the problem occurs when there is no exact
> matching document, as in the following:
>
> Query: term1 term2
> Doc1: term1 term3
> Doc2: term2  term3 term4
> Doc3: term1 term1 term3
> Doc4: term3 term4
>
>
> The score will be calculated as
>
> Doc1: 100%
> Doc2: 100%
> Doc3:  50%
>
> You can see that Doc1 and Doc2 got 100% even though they are not exact
> matches. But as they got the highest score, Lucene considers them a 100%
> match.
>
> This is my problem
>
> All I need is to make the percentage correct in the second case, so it
> will be something like
>
> Doc1: 50%
> Doc2: 50%
> Doc3:  30%
>
> I hope I made myself clear.
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Search-Score-percentage-Should-not-be-relative-to-the-highest-score-tp2183420p2184613.html
> Sent from the Lucene - Java Users mailing list archive at
> Nabble.com.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org