lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Comparing apples & oranges?
Date Sat, 05 Nov 2011 16:09:11 GMT
What about function queries? They can take field values and use them
as part of the score calculation.
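
As a rough illustration of what such a request could look like, here is a minimal Python sketch that builds dismax parameters with an additive boost function (`bf`). The field names `document` and `views`, the `tie` of 0.1, and the boosts are taken from the explain output quoted below; the `/select` path and the helper name `build_params` are assumptions made only for this example.

```python
from urllib.parse import urlencode


def build_params(user_query: str, bf_boost: float = 50.0) -> dict:
    """Build dismax parameters that add a view-count function to the score.

    Hypothetical helper: the field names and boosts mirror the quoted
    explain output and are not a recommended configuration.
    """
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": "document^4.0",   # keyword field and its boost
        "tie": "0.1",           # "max plus 0.1 times others"
        # map(views,0,0,1) replaces a view count of 0 with 1 so that
        # log() is never applied to zero; ^N sets the function's boost.
        "bf": f"log(map(views,0,0,1))^{bf_boost:g}",
        "fl": "*,score",
        "debugQuery": "true",   # ask Solr to return the score explanation
    }


params = build_params("water")
# The request path below is an assumption about the handler in use.
print("/select?" + urlencode(params))
```

With `bf` the function value is added to the keyword score. If a multiplicative boost fits better, the boost query parser (`{!boost b=...}`) wraps a query and multiplies its score by a function instead.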

Best
Erick

On Fri, Nov 4, 2011 at 6:28 AM, Martin Koch <mak@issuu.com> wrote:
> Hi List
>
> I have a Solr index where I want to include numerical fields in my ranking
> function as well as keyword relevance. For example, each document has a
> document view count, and I'd like to increase the relevancy of documents
> that are read often, and penalize documents with a very low view count. I'm
> aware that this could be achieved with a filter as well, but ignore that
> for this question :) since this will be extended to other numerical fields.
>
> The keyword scoring works just fine and I can include the view count as a
> factor in the scoring, but I would like to somehow express that the view
> count accounts for e.g. 25% of the total score. This could be achieved by
> mapping the view count into some predetermined fixed range and then
> performing suitable arithmetic to scale to the score of the query. The
> score of the term query is normalized to queryNorm, so I'd like somehow to
> express that the view count score should be normalized to the queryNorm.
>
> If I look at the explain of how the score below is computed, the 17.4 is
> the part of the score that comes from term relevancy. Searching for another
> (set of) terms yields a different queryNorm, so I can't see how I can
> a priori pick a scaling function (I've used log for this example) and boost
> factor that will give me control over the final contribution of the view count
> to the score.
>
> 19.14161 = (MATCH) sum of:
>  17.403849 = (MATCH) max plus 0.1 times others of:
>    16.747877 = (MATCH) weight(document:water^4.0 in 1076362), product of:
>      0.22298127 = queryWeight(document:water^4.0), product of:
>        4.0 = boost
>        2.939238 = idf(docFreq=527730, maxDocs=3669552)
>        0.018965907 = queryNorm
>      75.108894 = (MATCH) fieldWeight(document:water in 1076362), product of:
>        25.553865 = tf(termFreq(document:water)=653)
>        2.939238 = idf(docFreq=527730, maxDocs=3669552)
>        1.0 = fieldNorm(field=document, doc=1076362)
> [snip]
>  1.7377597 = (MATCH) FunctionQuery(log(map(int(views),0.0,0.0,1.0))), product of:
>    1.8325089 = log(map(int(views)=68,min=0.0,max=0.0,target=1.0))
>    50.0 = boost
>    0.018965907 = queryNorm
>
> Thanks in advance for your help,
> /Martin
>
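
To make the arithmetic in the explain output above concrete, the following Python sketch recomputes the visible numbers and the share of the total score that comes from the view count. All constants are copied from the explain output; the helper `boost_for_share` is hypothetical and only shows what boost would give a chosen share for this one document and query. It treats the keyword score as fixed, which is exactly what does not hold across queries, so this is a sketch of the problem rather than a solution.

```python
import math

# Values copied from the explain output.
query_norm = 0.018965907
idf = 2.939238
tf = 25.553865
field_norm = 1.0
term_boost = 4.0
func_boost = 50.0
views = 68
term_total = 17.403849  # whole keyword clause, including the snipped parts

# Keyword part: weight = queryWeight * fieldWeight.
query_weight = term_boost * idf * query_norm
field_weight = tf * idf * field_norm
term_weight = query_weight * field_weight

# Function part: log() here is base 10, since log10(68) matches 1.8325089.
func_value = math.log10(views)
func_score = func_value * func_boost * query_norm

total = term_total + func_score
share = func_score / total  # fraction of the score due to the view count


def boost_for_share(target_share, term_score, func_value, query_norm):
    """Hypothetical helper: boost that makes the function part a given
    share of the total, for one known keyword score.

    From V / (T + V) = s it follows that V = T * s / (1 - s).
    """
    wanted = term_score * target_share / (1.0 - target_share)
    return wanted / (func_value * query_norm)


print(f"term weight    {term_weight:.4f}")
print(f"function score {func_score:.4f}")
print(f"total          {total:.4f}")
print(f"view share     {share:.1%}")
print(f"boost for 25%  {boost_for_share(0.25, term_total, func_value, query_norm):.1f}")
```

With a boost of 50 the view count contributes roughly 9% of this document's score. Note that queryNorm is a factor in both visible components, so it cancels out of their ratio; what varies between queries and documents is the tf and idf part of the keyword score, which is why a single fixed boost cannot pin the share at 25%.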
