lucene-java-user mailing list archives

From Will Martin <wmartin...@gmail.com>
Subject Re: Explain Scoring function in LMJelinekMercerSimilarity Class
Date Tue, 20 Dec 2016 20:27:19 GMT
https://doi.org/10.3115/981574.981579



On 12/20/2016 12:21 PM, Dwaipayan Roy wrote:
> Hello,
>
> Can anyone help me understand the scoring function in the
> LMJelinekMercerSimilarity class?
>
> The scoring function in LMJelinekMercerSimilarity is shown below:
> --------------------------------------------------------
> float score = stats.getTotalBoost() *
>     (float) Math.log(1 + ((1 - lambda) * freq / docLen) /
>         (lambda * ((LMStats) stats).getCollectionProbability()));
> --------------------------------------------------------
>
> Can anyone help explain the equation? I understand the document-level
> component, i.e. (1 - lambda) * freq / docLen.
>
> I hope getCollectionProbability() returns col_freq(t) / col_size. Am I
> right?
>
> Also the boosting part is not clear to me (stats.getTotalBoost()).
>
> I want to reproduce the result of the scoring using LM-JM. Hence I want the
> details.
>
> Thanks.
> Dwaipayan Roy
>
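
For reproducing the score outside Lucene, it may help to read the quoted expression as the rank-equivalent form of Jelinek-Mercer smoothing: p(t|d) = (1 - lambda) * freq / docLen + lambda * p(t|C). Dividing by the document-independent factor lambda * p(t|C) and taking the log gives log(1 + ((1 - lambda) * freq / docLen) / (lambda * p(t|C))), which is zero when the term does not occur in the document. The following is only a minimal sketch of that arithmetic in plain Java. The class and method names are hypothetical and not part of Lucene, and it assumes the collection probability is col_freq(t) / col_size as the question supposes; the collection model and the document-length encoding Lucene actually uses may differ, so exact agreement with Lucene's scores should be checked against the source.

```java
public class JelinekMercerSketch {

    // Hypothetical helper (not a Lucene API): mirrors the expression quoted above.
    // boost                 - stands in for stats.getTotalBoost()
    // collectionProbability - assumed here to be col_freq(t) / col_size
    static float score(float boost, float lambda, float freq,
                       float docLen, float collectionProbability) {
        return boost * (float) Math.log(
                1 + ((1 - lambda) * freq / docLen) / (lambda * collectionProbability));
    }

    public static void main(String[] args) {
        // Illustrative numbers only, not taken from any real index.
        float lambda = 0.7f;
        float colFreq = 50f;        // occurrences of the term in the collection
        float colSize = 1_000_000f; // total tokens in the collection
        float collectionProbability = colFreq / colSize;

        float s = score(1.0f, lambda, 3f, 100f, collectionProbability);
        System.out.println("LM-JM score for one term: " + s);
    }
}
```

With this form, a query's score is the sum of the per-term values over the query terms that match the document, and the boost simply scales each term's contribution linearly.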



