lucene-java-user mailing list archives

From "J.Zhu" <J....@open.ac.uk>
Subject RE: implementing our own Scorer (BM25)
Date Thu, 19 Oct 2006 15:54:23 GMT
Hi, we are having a discussion in java-dev@lucene.apache.org about
implementing probabilistic language modelling approaches such as BM25 in
Lucene. Hope you can join us there.

Jianhan 

-----Original Message-----
From: beatriz ramos [mailto:beatriz.ramos.moreno@gmail.com] 
Sent: 19 October 2006 16:36
To: java-user@lucene.apache.org
Cc: gsingers@apache.org
Subject: Re: implementing our own Scorer (BM25)

Excuse me, I don't want to write a very long email.

This is the BM25 scoring formula:

         log((N-f+0.5)/(f+0.5)) * (k1 + 1) * c  /  (c+k1*( (1-b)+b*l/L))
where

               N = total number of documents
               f = document frequency (number of documents that
                   contain the term)
               c = term frequency in a document
               l = length of the document
               L = average document length
               k1, b = constants
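
For illustration, the formula above can be written as a minimal plain-Java sketch with no Lucene classes involved. The class name Bm25Sketch and the values k1 = 1.2 and b = 0.75 are assumptions (they are commonly used defaults; the message only calls them constants).

```java
/** Per-term BM25 score, following the formula quoted above. */
final class Bm25Sketch {
    // Assumed values; the original message only says "constants".
    static final double K1 = 1.2;
    static final double B = 0.75;

    /**
     * @param n      N, total number of documents
     * @param df     f, number of documents that contain the term
     * @param tf     c, term frequency in the document
     * @param len    l, length of the document
     * @param avgLen L, average document length
     */
    static double score(long n, long df, double tf, double len, double avgLen) {
        double idf = Math.log((n - df + 0.5) / (df + 0.5));
        double lengthNorm = (1 - B) + B * len / avgLen;
        return idf * (K1 + 1) * tf / (tf + K1 * lengthNorm);
    }

    public static void main(String[] args) {
        // A term found in 10 of 1000 documents, occurring 3 times
        // in a document of exactly average length.
        System.out.println(score(1000, 10, 3, 100, 100));
    }
}
```

Note that the logarithm becomes negative when a term occurs in more than half of the documents (f > N/2), so very common terms can lower a document's score with this form of the formula.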

I think f corresponds to the document frequency from which idf is
computed in the default Lucene scoring formula, and c corresponds to tf.
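
To make that mapping concrete, here is a small plain-Java comparison. It assumes the default formulas used by Lucene's DefaultSimilarity, idf = 1 + ln(numDocs / (docFreq + 1)) and tf = sqrt(freq), and re-implements them by hand rather than calling Lucene. The class name IdfComparison is made up for this sketch. In both formulas the raw inputs are the counts f and c; idf and tf are functions of those counts.

```java
/** Hand-written copies of the default Lucene factors next to the BM25 idf. */
final class IdfComparison {
    /** Assumed DefaultSimilarity idf: 1 + ln(numDocs / (docFreq + 1)). */
    static double luceneIdf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    /** Assumed DefaultSimilarity tf: square root of the raw frequency. */
    static double luceneTf(double freq) {
        return Math.sqrt(freq);
    }

    /** idf part of the BM25 formula: ln((N - f + 0.5) / (f + 0.5)). */
    static double bm25Idf(int docFreq, int numDocs) {
        return Math.log((numDocs - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        int[] docFreqs = {1, 10, 100, 500, 900};
        for (int df : docFreqs) {
            // Both values shrink as the term gets more common,
            // but only the BM25 variant can drop below zero.
            System.out.println(df + "\t" + luceneIdf(df, numDocs)
                    + "\t" + bm25Idf(df, numDocs));
        }
    }
}
```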
I implemented the BM25 formula in the score() method of BM25Scorer (my
own class, which extends Scorer):

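                import org.apache.lucene.search.Scorer;
                import org.apache.lucene.search.Similarity;

                public class BM25Scorer extends Scorer {
                        public BM25Scorer(Similarity similarity) {
                                super(similarity);
                        }
                        // Scorer's abstract methods (next(), doc(), score(),
                        // skipTo(int), explain(int)) are omitted here.
                }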


The problem is that I would have to implement my own Similarity class,
with specific abstract methods such as queryNorm(float
sumOfSquaredWeights), but I don't know how to calculate
sumOfSquaredWeights from the parameters of the BM25 formula.

Do I only have to change the Query, Weight and Scorer classes, or do I
need to create my own Similarity class as well?
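
As background on the sumOfSquaredWeights question, the sketch below models the normalisation step in plain Java. It assumes the default behaviour: each term weight contributes (idf * boost)^2 through Weight.sumOfSquaredWeights(), and DefaultSimilarity.queryNorm returns 1 / sqrt(sumOfSquaredWeights). The class QueryNormSketch and its methods are hypothetical stand-ins, not Lucene API. Because the resulting factor is the same for every document matched by a query, it does not change the ranking, so one possible choice for a BM25 Similarity is a queryNorm that simply returns 1.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Plain-Java model of query normalisation; not Lucene code. */
final class QueryNormSketch {
    /** Sum over query terms of (idf * boost)^2 (assumed default behaviour). */
    static double sumOfSquaredWeights(double[] idfs, double[] boosts) {
        double sum = 0.0;
        for (int i = 0; i < idfs.length; i++) {
            double weight = idfs[i] * boosts[i];
            sum += weight * weight;
        }
        return sum;
    }

    /** Assumed DefaultSimilarity behaviour: 1 / sqrt(sumOfSquaredWeights). */
    static double defaultQueryNorm(double sumOfSquaredWeights) {
        return 1.0 / Math.sqrt(sumOfSquaredWeights);
    }

    /** Document indices ordered by descending (rawScore * norm). */
    static Integer[] rank(double[] rawScores, double norm) {
        Integer[] order = new Integer[rawScores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> -rawScores[i] * norm));
        return order;
    }

    public static void main(String[] args) {
        double[] idfs = {3.0, 4.0};      // made-up idf values for two query terms
        double[] boosts = {1.0, 1.0};
        double norm = defaultQueryNorm(sumOfSquaredWeights(idfs, boosts));
        double[] rawScores = {0.7, 2.1, 1.3};  // made-up per-document scores

        // Both lines print the same ordering of document indices.
        System.out.println(Arrays.toString(rank(rawScores, norm)));
        System.out.println(Arrays.toString(rank(rawScores, 1.0)));
    }
}
```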

Thanks







On 19/10/06, Grant Ingersoll <gsingers@apache.org> wrote:
>
> Please provide more information about what you have done so far.
>
> On Oct 19, 2006, at 9:10 AM, beatriz ramos wrote:
>
> > Hello,
> > I'm trying to implement my own scoring algorithm with Lucene but I 
> > don't get any results.
> >
> > The Lucene documentation explains how to implement new scoring by 
> > modifying the Query, Weight and Scorer classes. I have tried this, 
> > but it doesn't work.
> >
> > Do you have any idea?
> > I need an example to understand the process and the modifications.
> >
> > Thanks
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


