lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Modifying score based on tf and slop
Date Thu, 07 May 2009 03:32:57 GMT
Hey,
On Thu, May 7, 2009 at 3:51 AM, Radha Sreedharan <radha84@gmail.com> wrote:
> Hi,
>
> I made tf return 1.0f, but the issue with that is that the slop
> factor is now ignored.
>
> So whether the two terms in the span near query are far apart or close
> together, the score returned is the same.
>
> I want the number of times the term occurs to be ignored, but not the slop.
So, do you mean you want to have the raw accumulated sloppy frequency
multiplied by the weight? Is that what you want, or do you want to use
the default tf() implementation in SpanScorer?

Simon
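
To make the question concrete, here is a minimal, self-contained Java sketch
of how a span scorer combines slop and tf. It does not use any Lucene classes.
It assumes the Lucene 2.4 behaviour, where the scorer sums sloppyFreq(matchLength)
over all spans in a document and passes the sum to tf(), with the default
formulas sloppyFreq = 1 / (distance + 1) and tf = sqrt(freq). The class name
SloppyTfSketch, the Tf interface and the sample match lengths are hypothetical.

```java
/** Stand-alone model of span scoring; not Lucene code. */
public class SloppyTfSketch {

    /** Hypothetical stand-in for Similarity#tf(float). */
    interface Tf {
        float tf(float freq);
    }

    /** Assumed default: closer matches contribute a larger value. */
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    /** Assumed default tf: square root of the accumulated frequency. */
    static final Tf DEFAULT_TF = freq -> (float) Math.sqrt(freq);

    /** The override from the thread: 1 whenever the document matches. */
    static final Tf CONSTANT_TF = freq -> freq > 0 ? 1.0f : 0.0f;

    /** Identity tf: the raw accumulated sloppy frequency. */
    static final Tf RAW_TF = freq -> freq;

    /** Sums sloppyFreq over all span matches, then applies tf and the weight. */
    static float score(int[] matchLengths, Tf tf, float weight) {
        float freq = 0.0f;
        for (int length : matchLengths) {
            freq += sloppyFreq(length);
        }
        return tf.tf(freq) * weight;
    }

    public static void main(String[] args) {
        int[] near = {3}; // one match spanning 3 positions
        int[] far = {7};  // one match spanning 7 positions

        System.out.println("default  near=" + score(near, DEFAULT_TF, 1.0f)
                + " far=" + score(far, DEFAULT_TF, 1.0f));
        System.out.println("constant near=" + score(near, CONSTANT_TF, 1.0f)
                + " far=" + score(far, CONSTANT_TF, 1.0f));
        System.out.println("raw      near=" + score(near, RAW_TF, 1.0f)
                + " far=" + score(far, RAW_TF, 1.0f));
    }
}
```

In this model the constant tf gives the near and the far match the same score,
because the slop only reaches the score through the value handed to tf(). The
identity tf keeps the slop, but the sum still grows when the span matches more
than once in the document.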
>
>
> Radha
>
> On Thu, May 7, 2009 at 12:43 AM, Simon Willnauer
> <simon.willnauer@googlemail.com> wrote:
>> Hey,
>> If I get you right, you want tf to not affect the score at all.
>> If so, why don't you just return 1.0f from tf by overriding Similarity?
>> If you only want to do that for the query you are using, you could
>> override Query#getSimilarity and return a delegate to the actual
>> similarity.
>>
>> Hope that helps.
>>
>> simon
>>
>> On Wed, May 6, 2009 at 7:44 PM, Radha Sreedharan <radha84@gmail.com> wrote:
>>> Hi all,
>>>
>>> All I have is a query running on a document with a single field which
>>> has some search value. That is all that will be present;
>>> there are no other documents or fields.
>>>
>>> I have the following specific requirements:
>>>
>>> 1) The length of the document should not affect the score. Implemented
>>> as per the Lucene documentation, using the Fair Similarity concept, by
>>> making lengthNorm return 1.
>>>
>>> 2) The number of times a query term occurs in the search field
>>> should not affect the score.
>>>
>>> 3) I am using SpanNearQuery, so the slop should affect the score.
>>>
>>>
>>> I implemented 2) by changing tf to return 1 if freq > 0.
>>>
>>> But this adversely affects 3), as the slop value is factored into the
>>> tf (as far as I can see in the span scorer).
>>>
>>>
>>> How can I ensure that the frequency of a term does not affect the
>>> score, while at the same time ensuring that the slop does affect it?
>>>
>>>
>>> Thanks,
>>> Radha
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>>
>
