lucene-java-user mailing list archives

From Dwaipayan Roy <dwaipayan....@gmail.com>
Subject Re: Explain Scoring function in LMJelinekMercerSimilarity Class
Date Wed, 21 Dec 2016 07:53:09 GMT
I am still waiting for an explanation of my query below. Thank you very much.

On Tue, Dec 20, 2016 at 10:51 PM, Dwaipayan Roy <dwaipayan.roy@gmail.com>
wrote:

> Hello,
>
> Can anyone help me understand the scoring function in the
> LMJelinekMercerSimilarity class?
>
> The scoring function in LMJelinekMercerSimilarity is shown below:
> --------------------------------------------------------
> float score = stats.getTotalBoost() *
>     (float) Math.log(1 + ((1 - lambda) * freq / docLen) /
>         (lambda * ((LMStats) stats).getCollectionProbability()));
> --------------------------------------------------------
>
> Can anyone help explain the equation? I understand the document-level
> part of the score, i.e. (1 - lambda) * freq / docLen.
>
> I assume getCollectionProbability() returns col_freq(t) / col_size. Is
> that right?
>
> Also the boosting part is not clear to me (stats.getTotalBoost()).
>
> I want to reproduce the result of the scoring using LM-JM. Hence I want
> the details.
>
> Thanks.
> Dwaipayan Roy
>
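
For anyone trying to reproduce the numbers, here is a minimal standalone
sketch of the quoted expression. It is not Lucene code: the class
LmJmSketch and its method names are hypothetical, and the collection
probability is taken as the plain ratio col_freq(t) / col_size assumed in
the question. Lucene's own collection model and its stored document length
may differ slightly from these raw values, so exact agreement with Lucene's
scores is not guaranteed. The sketch also checks one reading of the
formula: log(1 + x) is the log of the Jelinek-Mercer smoothed probability
minus log(lambda * p(t|C)), a term that does not depend on the document.

```java
public class LmJmSketch {

    /** Assumed collection model: col_freq(t) / col_size (the asker's guess, not confirmed). */
    static double collectionProbability(long collectionFreq, long collectionSize) {
        return (double) collectionFreq / collectionSize;
    }

    /** Same arithmetic as the quoted expression, in double precision. */
    static double score(double boost, double lambda, double freq,
                        double docLen, double collectionProb) {
        return boost * Math.log(
                1 + ((1 - lambda) * freq / docLen) / (lambda * collectionProb));
    }

    /** Log of the Jelinek-Mercer smoothed term probability for one document. */
    static double smoothedLogProb(double lambda, double freq,
                                  double docLen, double collectionProb) {
        return Math.log((1 - lambda) * freq / docLen + lambda * collectionProb);
    }

    public static void main(String[] args) {
        // Illustrative numbers only; none of them come from a real index.
        double lambda = 0.7;
        double p = collectionProbability(50, 50_000);

        double s = score(1.0, lambda, 3, 100, p);
        double viaSmoothing = smoothedLogProb(lambda, 3, 100, p) - Math.log(lambda * p);

        System.out.println("score              = " + s);
        System.out.println("smoothed - log(lp) = " + viaSmoothing);
        // A term that does not occur in the document contributes zero.
        System.out.println("score at freq = 0  = " + score(1.0, lambda, 0, 100, p));
    }
}
```

In this sketch the boost is just a multiplier on the log term, so a boost
of 1.0 leaves the per-term score unchanged.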



-- 
Dwaipayan Roy.
