lucene-solr-user mailing list archives

From Jeff Wartes <jwar...@whitepages.com>
Subject Re: Automate search results filtering based on scoring
Date Wed, 05 Mar 2014 19:09:49 GMT

It's worth mentioning that scores should not be considered comparable
across queries, so equating "confidence" and "score" is a tricky
proposition.
That is, the maxScore for the search "field1:foo" may be 10.0, and the
maxScore for "field1:bar" may be 1.0, but that doesn't mean the top result
for "foo" is ten times better than the top result for "bar". It might just
be that "foo" and "bar" have very different frequencies in your data.
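
To make the frequency effect concrete, here is a minimal sketch of the
inverse document frequency term used by Lucene's classic TF-IDF scoring,
idf = 1 + ln(numDocs / (docFreq + 1)). The index size and the document
frequencies for "foo" and "bar" are made-up numbers for illustration, and
the class is plain Java arithmetic, not a call into Lucene.

```java
public class IdfDemo {
    // Classic TF-IDF inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000;   // hypothetical index size
        long fooDocFreq = 10;       // hypothetical: "foo" is rare
        long barDocFreq = 500_000;  // hypothetical: "bar" is common

        // The rare term gets a much larger idf, and therefore a larger score,
        // even if both top documents match their query equally well.
        System.out.printf("idf(foo) = %.2f%n", idf(fooDocFreq, numDocs));
        System.out.printf("idf(bar) = %.2f%n", idf(barDocFreq, numDocs));
    }
}
```

The gap between the two values comes only from how often each term occurs
in the index, which is why a higher maxScore says little about match
quality.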

You can work around this with a custom Similarity class. You'd do this by
removing most of Solr/Lucene's term and document frequency scoring
intelligence, so that scoring is almost entirely a factor of which fields
matched with which boosts.
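
As a rough model of what such a flattened scoring scheme produces, the
sketch below scores a document purely by summing the boosts of the fields
that matched. It is plain Java and does not use the Lucene API. In a real
setup, the equivalent would likely be a Similarity subclass whose tf and
idf methods return constants. The field names and boost values here are
hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FlatScoreDemo {
    // Score = sum of the boosts of the matched fields; no tf or idf involved.
    static double score(Map<String, Double> fieldBoosts, Set<String> matchedFields) {
        double total = 0.0;
        for (String field : matchedFields) {
            total += fieldBoosts.getOrDefault(field, 0.0);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new HashMap<>();
        boosts.put("last_name", 3.0);   // hypothetical field and boost
        boosts.put("first_name", 1.0);  // hypothetical field and boost

        Set<String> both = new HashSet<>(Arrays.asList("last_name", "first_name"));
        Set<String> firstOnly = new HashSet<>(Arrays.asList("first_name"));

        // The same combination of matched fields yields the same score for
        // any query term, so scores become comparable across queries.
        System.out.println("both fields:     " + score(boosts, both));
        System.out.println("first_name only: " + score(boosts, firstOnly));
    }
}
```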

You could also truncate a given result set based on a percentage of the
maxScore for that result set, but even within a single result set, you
shouldn't assume that 1/3rd the score means 1/3rd the match quality.
You'll need to find numbers that work for you.
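
A minimal client-side sketch of that truncation, assuming you have already
pulled the scores out of the response (for example by requesting the score
pseudo-field) and that they arrive sorted from highest to lowest, which is
Solr's default ordering. The class name, method name, and the 0.5 fraction
are hypothetical. The right fraction is something you'd have to tune
against your own data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScoreCutoff {
    // Keep only the scores that are at least `fraction` of the top score.
    // Assumes `scores` is sorted in descending order.
    static List<Float> truncate(List<Float> scores, float fraction) {
        List<Float> kept = new ArrayList<>();
        if (scores.isEmpty()) {
            return kept;
        }
        float cutoff = scores.get(0) * fraction;  // first entry is the maxScore
        for (float s : scores) {
            if (s < cutoff) {
                break;  // everything after this is lower still
            }
            kept.add(s);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Float> scores = Arrays.asList(10.0f, 8.0f, 4.0f, 1.0f);
        // With a 0.5 fraction the cutoff is 5.0, so only 10.0 and 8.0 survive.
        System.out.println(truncate(scores, 0.5f));
    }
}
```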



On 3/3/14, 7:50 PM, "Susheel Kumar" <susheel.kumar@thedigitalgroup.net>
wrote:

>Hi,
>
>We are looking to automate searches (name searches) & filter out the
>results based on some scoring confidence. Any suggestions on what
>different approaches we can use to pick only top closer matches and
>filter out rest of the results.
>
>
>Thanks,
>Susheel
>

