lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Niraj Alok" <ni...@emacmillan.com>
Subject Re: score and frequency
Date Sat, 05 Jun 2004 05:13:55 GMT
I have set the searcher.setSimilarity  as well as also tried setting the
coord factor to 1.

The problem as given by an example is : Lets say I have titles to be
displayed depending upon the search.
E.g if i have "ice hockey" as the search item and if it is default
similarity, my results are :

ice hockey0.99999994
ice hockey0.75
ice hockey0.75
winter Olympics: hockey, ice, medallists0.17402513
ice age0.073680125
National Hockey League0.020266924
Cracking the Ice Age0.018420031
ground-ice0.011512519
ice hockey: British Sekonda Superleague Play-Off Championship:
finals0.0069075115
 (the numbers indicating the score).


But if i set the similarity as my overridden one, the results become:
ice hockey0.99999994
ice hockey0.75
ice hockey0.75
ice age0.22104037
winter Olympics: hockey, ice, medallists0.17402513
National Hockey League0.060800765
Cracking the Ice Age0.055260092
ground-ice0.034537554
ice hockey: British Sekonda Superleague Play-Off Championship:
finals0.020722535


I want all the titles which have both "ice" and "hockey" to come above the
rest (to have higher scores)
Meaning i would wish the results to appear like:

ice hockey
ice hockey
ice hockey
winter Olympics: hockey, ice, medallists
ice hockey: British Sekonda Superleague Play-Off Championship: finals
ice age
National Hockey League
Cracking the Ice Age
ground-ice

My overriden similarity class contains just this method:
public float coord(int overlap, int maxOverlap) {

return 1.0f;

}





I feel it is the weight factor which is producing indesirable results. Any
help in this regard would be highly appreciated.

Regards,
Niraj

----- Original Message -----
From: "Brisbart Franck" <Franck.Brisbart@kelkoo.net>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, June 04, 2004 8:46 PM
Subject: Re: score and frequency


> Hi,
>
> Be careful to set the default similarity
> 'Similarity.setDefault(similarity)' before creating your search instance
> (IndexSearcher).
> If you change the default similarity after, you'll still use the old one.
> You'd better use the 'searcher.setSimilarity' method on your searcher.
>
> Franck
>
>
> Phil brunet wrote:
> > Hi to all.
> >
> > Maybe the term frequency is not the only parameter you need to override
> > to "customize" the score attributed by Lucene.
> >
> > Maybe you should consider the normalisation factor, the idf and the
> > coord factor ?
> >
> > Philippe
> >
> >> From: "Niraj Alok" <niraj@emacmillan.com>
> >> Reply-To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> >> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> >> Subject: Re: score and frequency
> >> Date: Fri, 4 Jun 2004 15:13:32 +0530
> >>
> >> Hi Erik,
> >>
> >> Thanks for the suggestion.
> >>
> >> I tried this:
> >> public class RelevanceSimilarity extends DefaultSimilarity
> >>
> >> {
> >>
> >> public float tf(float freq) {
> >>
> >> System.out.println("discounting frequency");
> >>
> >> return (float)1;
> >>
> >> }
> >>
> >> }
> >>
> >>
> >>
> >> and in my query class, I used :
> >>
> >> Similarity.setDefault(similarity);
> >>
> >> Hits hits = is.search(query);
> >>
> >> for(i = 0; i < hits.length(); i ++)
> >>
> >>   result = result + hits.score(i);
> >>
> >>
> >>
> >> However, this is still not giving me the expected result. Do I need to
do
> >> something else?
> >>
> >>
> >> Regards,
> >> Niraj
> >>
> >> ----- Original Message -----
> >> From: "Erik Hatcher" <erik@ehatchersolutions.com>
> >> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> >> Sent: Friday, June 04, 2004 1:55 PM
> >> Subject: Re: score and frequency
> >>
> >>
> >> > On Jun 4, 2004, at 2:52 AM, Niraj Alok wrote:
> >> > > Hi,
> >> > >
> >> > > I am having some problems with the score of lucene.
> >> > > I am trying to get the results displayed according to hits.score
and
> >> > > it is giving the results correctly.
> >> > > However I do not want the frequency factor to be used for the
> >> > > computation of the score.
> >> > >
> >> > > Is it possible to get the score which does not have the frequency
> >> > > factor in it ?
> >> >
> >> > Have a look at the javadocs for Similarity.  DefaultSimilarity is
used
> >> > unless otherwise specified.  You could subclass that and override
this:
> >> >
> >> >    public float tf(float freq) {
> >> >      return (float)Math.sqrt(freq);
> >> >    }
> >> >
> >> > and return 1.0.  This might give you the effect you want.
> >> >
> >> > Erik
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >> >
> >>
> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> >> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >>
> >
> > _________________________________________________________________
> > Bloquez les fenĂȘtres pop-up, c'est gratuit ! http://toolbar.msn.fr
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
>
>
> --
> Franck Brisbart
> R&D
> http://www.kelkoo.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message