lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject Re: About Hit Scoring
Date Sun, 31 Oct 2004 17:01:38 GMT
Chuck Williams wrote:
> Addendum:  I forgot probably the most important point.  The current
> normalization in Hits changes the final score so that it is not the
> distance to the query-orthogonal hyperplane.  This normalization renders
> the final score ambiguous, and more confused.  It's ambiguous since the
> normalization may or may not be applied (depending on the fairly
> arbitrary condition of whether or not the top raw score is greater than
> 1.0).  In cases where the normalization is applied, then a result's
> final score is the ratio of its distance from the query-orthogonal
> hyperplane to the largest distance of any result, which doesn't seem
> particularly meaningful to me.  At least there is no absolute
> interpretation for this score, in the sense that a specific number
> indicates a specific relevance, which is what I'm looking for.

Yes, I omitted this from my analysis, along with the additional coord
term in the score formula.

Instead of the current normalization in Hits, one could apply a
squeezing function to the scores. A monotonically increasing function
would guarantee that scores are always between 0 and 1 while keeping
the order of the original raw scores, and since it would be
independent of the highest raw score, a given raw score would always
map to the same final score.
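
To make the idea concrete, here is a minimal sketch in Java. The class
ScoreSqueeze, both method names, and the choice of x / (1 + x) as the
squeezing function are assumptions made for illustration only; none of
this is existing Lucene code, and any other strictly increasing map
from [0, inf) into [0, 1) would serve equally well. normalizeByTop only
mimics the conditional normalization as described in the quoted text
above (rescale by the top score, and only when it exceeds 1.0).

```java
// Illustrative sketch only: hypothetical names, not part of Lucene.
class ScoreSqueeze {

    // Hypothetical squeezing function x / (1 + x).
    // Assumes raw >= 0. Strictly increasing, result always in [0, 1),
    // and it depends only on the score itself, not on other hits.
    static float squeeze(float raw) {
        return raw / (1.0f + raw);
    }

    // Mimics the normalization described in the thread: scores are
    // divided by the top score, but only if that top score is > 1.0.
    // Assumes the array is sorted in descending order.
    static float[] normalizeByTop(float[] rawSortedDesc) {
        float[] out = rawSortedDesc.clone();
        if (out.length > 0 && out[0] > 1.0f) {
            float norm = 1.0f / out[0];
            for (int i = 0; i < out.length; i++) {
                out[i] *= norm;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] raw = {3.0f, 1.0f, 0.5f};
        float[] normalized = normalizeByTop(raw);
        for (int i = 0; i < raw.length; i++) {
            System.out.println("raw=" + raw[i]
                    + " normalizedByTop=" + normalized[i]
                    + " squeezed=" + squeeze(raw[i]));
        }
    }
}
```

With a function like this, a raw score of 1.0 maps to 0.5 no matter
what else the query matched, whereas under the top-score normalization
the same raw score ends up with different final values depending on
the best hit.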

Christoph


