lucene-java-user mailing list archives

From "beatriz ramos" <beatriz.ramos.mor...@gmail.com>
Subject Re: implementing our own Scorer (BM25)
Date Thu, 19 Oct 2006 15:35:36 GMT
Sorry, I didn't want to write a very long email.

This is the BM25 scoring formula:

         log((N-f+0.5)/(f+0.5)) · (k1 + 1) · c  /  (c+k1·( (1-b)+b·l/L))
where

               N = total number of documents
               f = document frequency (number of documents which contain the
term)
               c = term frequency in a document
               l = length of document
               L = average document length
               k1, b = constants
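
As a minimal sketch, the formula above can be written in plain Java without any Lucene classes. The class and method names are only for illustration, and the values 1.2 and 0.75 used for k1 and b in main are commonly quoted defaults, assumed here rather than taken from this message.

```java
public class Bm25Sketch {

    /** The log part of the formula: log((N - f + 0.5) / (f + 0.5)). */
    public static double idfPart(long n, long f) {
        return Math.log((n - f + 0.5) / (f + 0.5));
    }

    /**
     * Score of one term in one document.
     * n = total number of documents, f = documents containing the term,
     * c = term frequency in the document, l = document length,
     * avgL = average document length, k1 and b = constants.
     */
    public static double score(long n, long f, double c, double l,
                               double avgL, double k1, double b) {
        // Length normalisation: (1 - b) + b * l / L
        double lengthNorm = (1 - b) + b * l / avgL;
        return idfPart(n, f) * (k1 + 1) * c / (c + k1 * lengthNorm);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1000 documents, term found in 10 of them,
        // occurring 3 times in a document of average length.
        double s = score(1000, 10, 3, 100, 100, 1.2, 0.75);
        System.out.println("BM25 term score: " + s);
    }
}
```

For a query with several terms, the per-term scores would be summed for each document.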

I think f is the counterpart of idf in the default Lucene scoring formula
(the log term is computed from it), and c is the same as tf.
I implemented the BM25 formula in the score method of my BM25Scorer class (my
own class that extends Scorer):

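                import org.apache.lucene.search.Scorer;
                import org.apache.lucene.search.Similarity;

                public class BM25Scorer extends Scorer {
                        public BM25Scorer(Similarity similarity) {
                                super(similarity);
                        }
                        // Scorer's abstract methods (score() etc.) still go here.
                }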


The problem is that I would have to implement my own Similarity class with
some specific abstract methods like queryNorm(float sumOfSquaredWeights), but
I don't know how to calculate sumOfSquaredWeights from the parameters of the
BM25 formula.
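
For context, here is a rough sketch of what the default scoring appears to do with this value, in plain Java without Lucene classes. It assumes that each query term contributes a weight of idf times boost and that the default queryNorm is 1 / sqrt(sumOfSquaredWeights). The class and method names are hypothetical, and the numbers in main are made up.

```java
public class QueryNormSketch {

    /** Sum of squared per-term query weights, each weight assumed to be idf * boost. */
    public static double sumOfSquaredWeights(double[] idfs, double[] boosts) {
        double sum = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double weight = idfs[i] * boosts[i];
            sum += weight * weight;
        }
        return sum;
    }

    /** Default-style normalisation factor: 1 / sqrt(sumOfSquaredWeights). */
    public static double queryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    public static void main(String[] args) {
        // Two query terms with made-up idf and boost values.
        double[] idfs = {2.0, 1.0};
        double[] boosts = {1.0, 2.0};
        double sum = sumOfSquaredWeights(idfs, boosts);
        System.out.println("sumOfSquaredWeights = " + sum);
        System.out.println("queryNorm = " + queryNorm(sum));
    }
}
```

If that is right, the factor is the same for every document matched by a given query, so it would not change the ranking within one query. A BM25 Similarity could then probably return a constant such as 1.0f from queryNorm, but that is an assumption to check against the Lucene version in use.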

Do I only have to change the Query, Weight and Scorer classes, or do I need
to create my own Similarity class?

Thanks







On 19/10/06, Grant Ingersoll <gsingers@apache.org> wrote:
>
> Please provide more information about what you have done so far.
>
> On Oct 19, 2006, at 9:10 AM, beatriz ramos wrote:
>
> > Hello,
> > I'm trying to implement my own scoring algorithm with Lucene but I
> > don't get
> > any results.
> >
> > Lucene documentation explains how to implement new scoring,
> > modifying Query,
> > Weight and Scorer classes. I have tried this but it doesn't work.
> >
> > Do you have any idea?
> > I need some example to understand the process and modifications
> >
> > Thanks
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
