lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Radha Sreedharan <radh...@gmail.com>
Subject Re: Modifying score based on tf and slop
Date Sun, 05 Jul 2009 17:02:56 GMT
Hi all, I really need the soln for this quite urgently. I have looked around
quite a bit - I do know how to override the tf value in my custom similarity
class. But since tf is tied up with the span, ie the SpanScorer ties the tf
with the span, making tf return 1, leads to the other problem of slop not
affecting my score.

Eg: Query : Spannear( AB, BC, CD)

I) First consider my best possible case:

Doc : "AB BC CD"
Result  : 3  - Maximum possible score


II ) Using default implementation of tf in Similarity class:

Case 1 -  Doc : "AB BC BC CD"
Result :  4  - Actual score
% match :  ( actual score / max possible score) =  ( 4/3) > 100% - This is
Wrong as I dont want score to be affected by no of times BC occurs

Case 2 -  Doc : "AB BC xx yy xx yy CD"
Result :  2  - Actual score
% match :  ( actual score / max possible score) =  ( 2/3) < 100% - This is
Correct as I want score to be affected by slop distance among AB, BC, CD

III) Using Custom implementation of tf in Similarity class where tf always
returns 1:


Case 1 -  Doc : "AB BC BC CD"
Result :  3  - Actual score
% match :  ( actual score / max possible score) =  ( 3/3) = 100% - This is
correct as I dont want score to be affected by no of times BC occurs

Case 2 -  Doc : "AB BC xx yy xx yy CD"
Result :  3  - Actual score
% match :  ( actual score / max possible score) =  ( 3/3) = 100% - This is
wrong as I want score to be affected by slop distance among AB, BC, CD

Basically I want a way where in both my Case 1 and Case 2 give me the
expected result

On Tue, Jun 30, 2009 at 4:22 PM, Rads2029 <radha84@gmail.com> wrote:

>
> Restarting this thread.
> I did try out the soln mentioned by Simon below, however that did not work.
> As changing the tf implementation to return 1, adversely affected by span
> scoring . ie, the slop distance does not affect score if i make tf as 1.
>
> I had found a work around in some other way, but that has a hole. I really
> have no way other than to find a solution for this. :(
>
> To summarize, how to make sure tf does not affect the scoring but the span
> should still affect the scoring?
>
>
>
>
> Simon Willnauer wrote:
> >
> > Hey,
> > If I get you right you wanna make tf not affecting the score at all.
> > if so why don't you just return 1.0f by overriding similarity?
> > If you just wanna do that for the query you are using you could
> > override Query#getSimilarity and return a delegate to the actual
> > similarity.
> >
> > Hope that helps.
> >
> > simon
> >
> > On Wed, May 6, 2009 at 7:44 PM, Radha Sreedharan <radha84@gmail.com>
> > wrote:
> >> Hi all,
> >>
> >> All I have is a query running on a document with a single field which
> >> has some search value. This is all which will be present.
> >> No more documents / fields.
> >>
> >> I have the following specific requirements
> >>
> >> 1) Length of document should not affect score - Implemented as per
> >> lucene documentation using concept of Fair Similairty by making
> >> lengthnorm as 1
> >>
> >> 2) The no of times a term in the query  occurs in the search field
> >> should not affect the score
> >>
> >> 3) I am using the spannearquery. Hence the slop should affect the score.
> >>
> >>
> >> I implemented 2) by changing the tf to return 1 if freq >0 .
> >>
> >> But this adversely affects  3) as the slop value is factored into the
> >> tf ( as per what I can see in the span scorer)
> >>
> >>
> >> How can I ensure the frequency of a certain term does not affect the
> >> score while at the same ensuring that the slop does affect it ?
> >>
> >>
> >> Thanks,
> >> Radha
> >>
> >
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Modifying-score-based-on-tf-and-slop-tp23412168p24269846.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message