lucene-dev mailing list archives

From "David Mark Nemeskey (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3220) Implement various ranking models as Similarities
Date Sat, 25 Jun 2011 17:55:47 GMT


David Mark Nemeskey updated LUCENE-3220:

    Attachment: LUCENE-3220.patch

Implementation of the DFR framework added. Lots of nocommits, though. Some things to think about:
 * Lots of (float) conversions. Maybe the inner API (BasicModel, etc.) could use doubles?
In my experience, double is faster anyway, at least on 64-bit architectures.
 * I am not overly happy about the naming scheme, i.e. BasicModelBE, etc. Maybe we should
do it the same way as in Terrier, with a basicmodel package and class names like BE?
 * A regular SimilarityProvider implementation won't play well with DFRSimilarity if
the user wants to use several different setups. Actually, this is a problem for all similarities
that have parameters (e.g. BM25 with b and k).
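
To make the first and third points concrete, here is a rough, self-contained sketch. It does not come from the patch and does not use the Lucene API: DfrSketch, BasicModelSketch, ParamSimilarity, PerFieldProvider and the parameter c are all hypothetical names, and the scoring formula is a toy. It only illustrates two ideas: the inner components compute in double and narrow to float once at the outer boundary, and a per-field lookup lets several parameterized setups coexist.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the patch or from Lucene.
public class DfrSketch {

    /** Inner component that works entirely in double, so no (float) casts are needed inside. */
    interface BasicModelSketch {
        double score(double tfn, double docFreq, double numDocs);
    }

    /** A toy idf-style model, present only to make the sketch runnable. */
    static final BasicModelSketch TOY_MODEL =
        (tfn, docFreq, numDocs) -> tfn * Math.log((numDocs + 1.0) / (docFreq + 0.5));

    /** A similarity with one tunable parameter c (illustrative). */
    static final class ParamSimilarity {
        private final BasicModelSketch model;
        private final double c;

        ParamSimilarity(BasicModelSketch model, double c) {
            this.model = model;
            this.c = c;
        }

        /** The outer API returns float; this is the only place a narrowing cast happens. */
        float score(int tf, int docLen, double avgDocLen, long docFreq, long numDocs) {
            // Length-normalized term frequency, computed in double (log base 2).
            double tfn = tf * Math.log(1.0 + c * avgDocLen / docLen) / Math.log(2.0);
            return (float) model.score(tfn, docFreq, numDocs);
        }
    }

    /** Per-field lookup, so that differently parameterized setups can be used side by side. */
    static final class PerFieldProvider {
        private final Map<String, ParamSimilarity> byField = new HashMap<>();
        private final ParamSimilarity fallback;

        PerFieldProvider(ParamSimilarity fallback) {
            this.fallback = fallback;
        }

        PerFieldProvider set(String field, ParamSimilarity sim) {
            byField.put(field, sim);
            return this;
        }

        ParamSimilarity get(String field) {
            return byField.getOrDefault(field, fallback);
        }
    }

    public static void main(String[] args) {
        PerFieldProvider provider = new PerFieldProvider(new ParamSimilarity(TOY_MODEL, 1.0))
            .set("title", new ParamSimilarity(TOY_MODEL, 7.0));
        System.out.println(provider.get("title").score(3, 10, 20.0, 5, 1000));
        System.out.println(provider.get("body").score(3, 10, 20.0, 5, 1000));
    }
}
```

Whether the real provider should be keyed by field name, or configured some other way, is exactly the open question above; the map is just the simplest thing that shows the problem.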

Also, I think we need that NormConverter we talked about earlier on IRC, so that the Similarities
can run on any index.
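
What the NormConverter would look like is not spelled out here. One possible reading, and this is only an assumption, is a helper that turns norms written by the default similarity (roughly 1/sqrt(number of terms), ignoring boosts and the lossy one-byte encoding) back into an approximate document length that the new models can use. NormConverterSketch and its method names below are hypothetical:

```java
// Hypothetical sketch: assumes norm = 1/sqrt(length), with no boost and no byte quantization.
public class NormConverterSketch {

    /** The norm a default-style similarity would produce for a field with this many terms. */
    static float lengthToNorm(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        return (float) (1.0 / Math.sqrt(length));
    }

    /** Inverts lengthToNorm to recover an approximate field length from a stored norm. */
    static double normToLength(float norm) {
        if (!(norm > 0f)) {
            throw new IllegalArgumentException("norm must be positive");
        }
        double n = norm;
        return 1.0 / (n * n);
    }

    public static void main(String[] args) {
        float norm = lengthToNorm(100);
        System.out.println(normToLength(norm));
    }
}
```

A real converter would also have to deal with index-time boosts folded into the norm and with the precision lost in the byte encoding, neither of which is handled here.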

> Implement various ranking models as Similarities
> ------------------------------------------------
>                 Key: LUCENE-3220
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: core/search
>    Affects Versions: flexscoring branch
>            Reporter: David Mark Nemeskey
>            Assignee: David Mark Nemeskey
>              Labels: gsoc
>         Attachments: LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch, LUCENE-3220.patch,
>                      LUCENE-3220.patch, LUCENE-3220.patch
>   Original Estimate: 336h
>  Remaining Estimate: 336h
> With [LUCENE-3174|] done, we can finally
> work on implementing the standard ranking models. Currently DFR, BM25 and LM are on the menu.
>  * {{EasyStats}}: contains all statistics that might be relevant for a ranking algorithm
>  * {{EasySimilarity}}: the ancestor of all the other similarities. Hides the DocScorers
> and as much implementation detail as possible
>  * _BM25_: the current "mock" implementation might be OK
>  * _LM_
>  * _DFR_
> Done:

This message is automatically generated by JIRA.