lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2936) score and explain don't match
Date Wed, 23 Feb 2011 03:06:38 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998136#comment-12998136 ]

Robert Muir commented on LUCENE-2936:
-------------------------------------

Koji: the issue is the document boost of zero.

Because of this, the explanation does not indicate a match by default (see Explanation.java):
{noformat}
  /**
   * Indicates whether or not this Explanation models a good match.
   *
   * <p>
   * By default, an Explanation represents a "match" if the value is positive.
   * </p>
   * @see #getValue
   */
  public boolean isMatch() {
    return (0.0f < getValue());
  }
{noformat}
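
To make the effect concrete, here is a minimal standalone sketch of that rule. SimpleExplanation
is a hypothetical stand-in written for this illustration, not Lucene's Explanation class, and the
"(MATCH)" / "(NON-MATCH)" labels are only meant to resemble the explain output. The tf, idf and
norm figures are taken from the explain output in the issue; the zero norm is the assumed
boost-of-zero case.

```java
// Hypothetical stand-in for illustration only; not Lucene's Explanation class.
final class SimpleExplanation {
    private final float value;
    private final String description;

    SimpleExplanation(float value, String description) {
        this.value = value;
        this.description = description;
    }

    float getValue() {
        return value;
    }

    // Same rule as the default quoted above: a match only if the value is strictly positive.
    boolean isMatch() {
        return 0.0f < getValue();
    }

    @Override
    public String toString() {
        return value + " = " + (isMatch() ? "(MATCH) " : "(NON-MATCH) ") + description;
    }

    public static void main(String[] args) {
        float tfIdf = 1.0f * 0.61370564f; // tf and idf from the explain output in the issue

        // With the norm from the issue (0.625) the value is positive, so it counts as a match.
        System.out.println(new SimpleExplanation(tfIdf * 0.625f, "fieldWeight, fieldNorm=0.625"));

        // With a norm of zero (assumed boost-of-zero case) the value is 0.0,
        // so isMatch() is false even though the document was actually hit.
        System.out.println(new SimpleExplanation(tfIdf * 0.0f, "fieldWeight, fieldNorm=0.0"));
    }
}
```

So a document that the query really does hit can still be reported as a non-match, simply
because its computed value is not strictly greater than zero.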

Separately, we should decide what to do about norm values of zero. In my opinion, norm values
of zero should not necessarily decode to a floating point value of zero (we might want to
adjust our norm decoder by default to not do this). 

Otherwise, in addition to your problem, the search degrades into a pure boolean ranking model
(as TF and IDF are completely zeroed out).
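
A rough sketch of that degradation, assuming a simplified field weight of tf * idf * fieldNorm
(the same three factors shown under fieldWeight in the explain output). ZeroNormSketch and its
method are hypothetical names for this illustration, and the tf values are made up:

```java
// Illustration only: simplified field weight, not Lucene's scorer.
final class ZeroNormSketch {

    // tf * idf * fieldNorm, mirroring the three factors listed under fieldWeight in explain().
    static float fieldWeight(float tf, float idf, float fieldNorm) {
        return tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        float idf = 0.61370564f; // idf from the explain output in the issue

        // Two hypothetical documents with different term frequencies.
        float weakWithNorm = fieldWeight(1.0f, idf, 0.625f);
        float strongWithNorm = fieldWeight(3.0f, idf, 0.625f);
        System.out.println("norm 0.625: " + weakWithNorm + " vs " + strongWithNorm);

        // With a norm of zero both collapse to the same value, so tf and idf
        // no longer influence the ordering of the matching documents.
        float weakZeroNorm = fieldWeight(1.0f, idf, 0.0f);
        float strongZeroNorm = fieldWeight(3.0f, idf, 0.0f);
        System.out.println("norm 0.0:   " + weakZeroNorm + " vs " + strongZeroNorm);
    }
}
```

With a non-zero norm the document with the higher tf ranks first; with a zero norm the two are
indistinguishable, which is the "match or no match" behaviour described above.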

This is very unlikely with the default relevance ranking (unless you use a boost of zero
or similar), but it is possible if, for example, you use a different SmallFloat quantization.
I raised this question on LUCENE-1360: what should we do if someone uses that "short field"
quantization on a large document?

So in my opinion, we should consider adjusting the NORM_TABLE in Similarity so that if the
norm happens to be zero, it does not decode to a float of zero. This will have no impact on
performance, as it's a statically calculated table.
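
As a sketch of what such an adjustment could look like (not an actual patch): the table below
is filled once in a static initializer, and only the entry for byte 0 is changed. The decode
method is modeled on the 3-bit-mantissa, 5-bit-exponent byte encoding used for norms; treat its
constants as an assumption. Replacing entry 0 with the smallest positive decoded value is just
one hypothetical choice, and NormTableSketch is a made-up name.

```java
// Illustration only: a standalone norm table, not Lucene's Similarity class.
final class NormTableSketch {

    // Byte-to-float decode with 3 mantissa bits and a zero exponent of 15
    // (assumed to match the default norm encoding). Byte 0 decodes to 0.0f.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Computed once at class-load time, so lookups cost the same as before.
    static final float[] NORM_TABLE = new float[256];

    static {
        for (int i = 0; i < 256; i++) {
            NORM_TABLE[i] = decode((byte) i);
        }
        // Hypothetical adjustment: a zero norm byte decodes to the smallest
        // positive value in the table instead of 0.0f.
        NORM_TABLE[0] = NORM_TABLE[1];
    }

    static float decodeNorm(byte b) {
        return NORM_TABLE[b & 0xff];
    }

    public static void main(String[] args) {
        System.out.println("raw decode of byte 0:   " + decode((byte) 0));
        System.out.println("table lookup of byte 0: " + decodeNorm((byte) 0));
        System.out.println("table lookup of byte 124: " + decodeNorm((byte) 124));
    }
}
```

Only one table entry differs from the plain decode, so every non-zero norm byte keeps its
existing value and the lookup itself is unchanged.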


> score and explain don't match
> -----------------------------
>
>                 Key: LUCENE-2936
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2936
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.9.4, 3.0.3, 3.1, 4.0
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>         Attachments: TestScore.java
>
>
> I've faced this problem recently. I'll attach a program to reproduce the problem soon.
> The program outputs the following:
> {noformat}
> ** score = 0.10003257
> ** explain
> 0.050016284 = (MATCH) product of:
>   0.15004885 = (MATCH) sum of:
>     0.15004885 = weight(f1:"note book" in 0), product of:
>       0.3911943 = queryWeight(f1:"note book"), product of:
>         0.61370564 = idf(f1: note=1 book=1)
>         0.6374299 = queryNorm
>       0.38356602 = fieldWeight(f1:"note book" in 0), product of:
>         1.0 = tf(phraseFreq=1.0)
>         0.61370564 = idf(f1: note=1 book=1)
>         0.625 = fieldNorm(field=f1, doc=0)
>   0.33333334 = coord(1/3)
> {noformat}


