lucene-dev mailing list archives

From "beatriz ramos" <beatriz.ramos.mor...@gmail.com>
Subject implementing our own Scorer (BM25)
Date Fri, 20 Oct 2006 08:03:35 GMT
Hello,
I'm trying to implement my own scoring algorithm with Lucene, but I don't get
any results.

The Lucene documentation explains how to implement new scoring by modifying the
Query, Weight and Scorer classes. I have tried this, but it doesn't work.

This is the BM25 scoring formula:

         log((N-f+0.5)/(f+0.5)) · (k1 + 1) · c  /  (c+k1·( (1-b)+b·l/L))
where

               N = total number of documents
               f = document frequency (number of documents that contain the
term)
               c = term frequency in a document
               l = length of the document
               L = average document length
               k1, b = constants
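
As a minimal sketch, the formula above can be written in plain Java without any Lucene classes. The class name Bm25Formula, the method name score and the parameter names are hypothetical and chosen only for illustration. The sketch assumes that "log" means the natural logarithm, and the constants used in main (k1 = 1.2, b = 0.75) are commonly quoted defaults, not values taken from this message.

```java
// Hypothetical helper, not part of Lucene: evaluates the BM25 term score
// exactly as written in the formula above.
final class Bm25Formula {
    private Bm25Formula() {}

    /**
     * @param n    total number of documents (N)
     * @param f    number of documents that contain the term (f)
     * @param c    term frequency in the document (c)
     * @param l    length of the document (l)
     * @param avgL average document length (L)
     * @param k1   BM25 constant k1
     * @param b    BM25 constant b
     */
    static double score(long n, long f, double c, double l, double avgL,
                        double k1, double b) {
        // log((N - f + 0.5) / (f + 0.5)); natural logarithm is an assumption
        double idf = Math.log((n - f + 0.5) / (f + 0.5));
        // (1 - b) + b * l / L: document length normalization
        double norm = (1.0 - b) + b * l / avgL;
        return idf * (k1 + 1.0) * c / (c + k1 * norm);
    }

    public static void main(String[] args) {
        // Example values are made up for illustration only.
        System.out.println(score(100, 10, 3.0, 120.0, 100.0, 1.2, 0.75));
    }
}
```

With l equal to L the normalization term is 1, so for c = 1 and k1 = 1 the score reduces to the logarithmic factor alone, which is a quick sanity check of the implementation.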

I think f corresponds to the idf in the default Lucene scoring formula and c
to the tf.
I implement the BM25 formula in the score method of the BM25Scorer class (my
own Scorer class, which extends the Scorer class):

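                import org.apache.lucene.search.Scorer;
                import org.apache.lucene.search.Similarity;

                public class BM25Scorer extends Scorer {
                        public BM25Scorer(Similarity similarity) {
                               super(similarity);
                        }
                        // Abstract Scorer methods, including score() with
                        // the BM25 formula, are omitted here.
                }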


The problem is that I would have to implement my own Similarity class with
some specific abstract methods such as queryNorm(float sumOfSquaredWeights),
but I don't know how to calculate sumOfSquaredWeights from the parameters of
the BM25 formula.

Do I only have to change the Query, Weight and Scorer classes, or do I also
need to create my own Similarity class?

Thanks
