lucene-solr-user mailing list archives

From Francisco Sanmartin <francis...@olx.com>
Subject Re: threshold of result rankings
Date Fri, 30 May 2008 15:04:47 GMT
I've done that already. All you need to do is to create your custom 
request handler.

Among other things, my handler does the following:

It receives a threshold factor, such as 0.85. The first document 
returned is assumed to be the "best" matching document. Then document 
number #30 (configurable), or the last document if fewer than 30 are 
returned, is taken as the "worst" document.

factor = 0.85 (for example)
bestScore = 1000 (for example)
worstScore = 500 (for example, the score of document #30)

Then the handler applies the function:
threshold = bestScore * factor + worstScore * (1 - factor)

In the example case the threshold = 925. This means that the documents 
whose score is above 925 lie at least 85% of the way from the "worst" 
score to the "best" score, so they are close to the first document 
returned.
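
As a minimal sketch of the formula above in plain Java (not the actual 
handler code; the class and method names are hypothetical and no Solr 
API is used), the threshold is a linear interpolation between the two 
reference scores:

```java
public final class ScoreThreshold {

    /**
     * Hypothetical helper: interpolates between the "worst" and "best"
     * reference scores. A factor of 1.0 yields bestScore, a factor of
     * 0.0 yields worstScore.
     */
    static double threshold(double bestScore, double worstScore, double factor) {
        return bestScore * factor + worstScore * (1.0 - factor);
    }

    public static void main(String[] args) {
        // Values from the example: best = 1000, worst = 500, factor = 0.85.
        // The result is approximately 925 (subject to floating-point rounding).
        System.out.println(threshold(1000.0, 500.0, 0.85));
    }
}
```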

So we obtain the threshold from the scores of the documents returned. 
Why 30? Because statistically there is not much difference between 30 
and 50 or 100 (this may depend on the number of documents you want to 
return; in my case it is the best 3 or 4).

Once I have the threshold, all I need to do is check whether the score 
of the next document to be included in the result set is above the 
threshold.
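
The whole procedure could look roughly like the following sketch. It 
is an illustration under assumptions, not the real handler: it works 
on a plain array of scores assumed to be sorted in descending order 
(as a relevance-ranked result list would be), the names 
ThresholdFilter, keepAboveThreshold and referenceRank are made up, and 
it keeps scores equal to the threshold as well as those above it.

```java
import java.util.ArrayList;
import java.util.List;

public final class ThresholdFilter {

    /**
     * Hypothetical helper. scores must be sorted in descending order.
     * referenceRank is the 1-based position of the "worst" reference
     * document (30 in the post); if fewer documents exist, the last
     * one is used instead.
     */
    static List<Double> keepAboveThreshold(double[] scores, double factor, int referenceRank) {
        List<Double> kept = new ArrayList<>();
        if (scores.length == 0) {
            return kept;
        }
        double best = scores[0];
        double worst = scores[Math.min(referenceRank, scores.length) - 1];
        double threshold = best * factor + worst * (1.0 - factor);

        for (double score : scores) {
            if (score < threshold) {
                break; // descending order: every later score is lower still
            }
            kept.add(score); // inclusive comparison is an assumption
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] scores = {1000.0, 950.0, 930.0, 900.0, 500.0};
        // Only 5 documents, so the last one (500) is the "worst" reference.
        System.out.println(keepAboveThreshold(scores, 0.85, 30));
    }
}
```

With these sample scores the threshold comes out at roughly 925, so 
the first three documents are kept and the rest are dropped.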

If you need any further help, don't hesitate to ask for it.

Pako



Umar Shah wrote:
> Hi,
>
> is there some way of limiting the results  above some fixed threshold?
>
> thanks in anticipation
> -umar
>
>   

