lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-954) Toggle score normalization in Hits
Date Sun, 17 Feb 2008 13:38:34 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12569698#action_12569698 ]

Yonik Seeley commented on LUCENE-954:
-------------------------------------

Normalization is applied only to the queryWeight part of the score (the part that is the
same for all documents), not to the fieldWeight.  Both idf and the norms can be > 1.
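
To make the point above concrete, here is a minimal sketch of a single-term score split
into its queryWeight and fieldWeight parts. It is a standalone model, not Lucene code:
the class and method names are hypothetical, the tf and idf formulas are assumed to
follow the classic default similarity (square root of the term frequency, and
1 + ln(numDocs / (docFreq + 1))), and the statistics in main are made up.

```java
class ScoreSketch {
    // Assumed classic formulas; a real Similarity implementation may differ.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Query-side part: the same for every document. For a single-term query
    // the query norm 1/sqrt(sum of squared weights) scales this part to 1.
    static double queryWeight(double idf, double boost) {
        double raw = idf * boost;
        double queryNorm = 1.0 / Math.sqrt(raw * raw);
        return raw * queryNorm;
    }

    // Document-side part: no normalization is applied here, so idf and the
    // field norm can push it above 1.
    static double fieldWeight(int freq, double idf, double fieldNorm) {
        return tf(freq) * idf * fieldNorm;
    }

    static double score(int freq, int docFreq, int numDocs, double boost, double fieldNorm) {
        double idf = idf(docFreq, numDocs);
        return queryWeight(idf, boost) * fieldWeight(freq, idf, fieldNorm);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: a rare term occurring 4 times in one document.
        double s = score(4, 1, 1000, 1.0, 1.0);
        System.out.println("score = " + s + ", greater than 1: " + (s > 1.0));
    }
}
```

Even though the query-side factor is normalized, the final score in this model exceeds 1
for a rare term, because the document-side factor is left untouched.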

> Toggle score normalization in Hits
> ----------------------------------
>
>                 Key: LUCENE-954
>                 URL: https://issues.apache.org/jira/browse/LUCENE-954
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.2
>         Environment: any
>            Reporter: Christian Kohlschütter
>         Attachments: hits-scoreNorm.patch
>
>
> The current implementation of the "Hits" class sometimes performs score normalization.
> In particular, whenever the top-ranked score is bigger than 1.0, it is normalized to
> a maximum of 1.0.
> In this case, Hits may return different score results than TopDocs-based methods.
> In my scenario (a federated search system), Hits delivered just plain wrong results.
> I was merging results from several sources, all having homogeneous statistics (similar
> to MultiSearcher, but over the Internet using HTTP/XML-based protocols).
> Sometimes, some of the sources had a top score greater than 1, so I ended up with garbled
> results.
> I suggest adding a switch to enable or disable this score normalization at runtime.
> My patch (attached) has an additional performance benefit, since score normalization now
> occurs only when Hits#score() is called, not when creating the Hits result list. Whenever
> scores are not required, you save one multiplication per retrieved hit (i.e., at least 100
> multiplications with the current implementation of Hits).
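
The behavior described in the issue can be sketched as follows. This is only an
illustration of the idea (a runtime switch plus normalization deferred until a score is
requested), not the attached patch and not the real Hits class: the class name, the
setNormalize method, and the sample scores are all hypothetical.

```java
class LazyHitsSketch {
    private final float[] rawScores;   // raw scores, highest first
    private final float scoreNorm;     // 1/maxScore if maxScore > 1, else 1
    private boolean normalize = true;  // hypothetical runtime switch

    LazyHitsSketch(float[] rawScoresDescending) {
        this.rawScores = rawScoresDescending.clone();
        float max = rawScores.length > 0 ? rawScores[0] : 0.0f;
        // Only the factor is computed up front; no per-hit work happens here.
        this.scoreNorm = max > 1.0f ? 1.0f / max : 1.0f;
    }

    void setNormalize(boolean normalize) {
        this.normalize = normalize;
    }

    // The multiplication happens only when a score is actually requested.
    float score(int i) {
        return normalize ? rawScores[i] * scoreNorm : rawScores[i];
    }

    public static void main(String[] args) {
        LazyHitsSketch hits = new LazyHitsSketch(new float[] {4.0f, 2.0f, 1.0f});
        System.out.println("normalized top score: " + hits.score(0));
        hits.setNormalize(false);
        System.out.println("raw top score: " + hits.score(0));
    }
}
```

With the switch off, every source in a federated setup reports raw scores, so results can
be merged on a common scale as long as the sources share homogeneous statistics.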

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

