From: Erick Erickson
To: solr-user@lucene.apache.org
Date: Sat, 5 Nov 2011 12:09:11 -0400
Subject: Re: Comparing apples & oranges?

What about Function Queries? They can essentially take field values and
use them as part of the score calculations.

Best
Erick

On Fri, Nov 4, 2011 at 6:28 AM, Martin Koch wrote:
> Hi List
>
> I have a solr index where I want to include numerical fields in my ranking
> function as well as keyword relevance. For example, each document has a
> document view count, and I'd like to increase the relevancy of documents
> that are read often, and penalize documents with a very low view count. I'm
> aware that this could be achieved with a filter as well, but ignore that
> for this question :) since this will be extended to other numerical fields.
>
> The keyword scoring works just fine and I can include the view count as a
> factor in the scoring, but I would like to somehow express that the view
> count accounts for e.g. 25% of the total score. This could be achieved by
> mapping the view count into some predetermined fixed range and then
> performing suitable arithmetic to scale to the score of the query. The
> score of the term query is normalized to queryNorm, so I'd like somehow to
> express that the view count score should be normalized to the queryNorm.
>
> If I look at the explain of how the score below is computed, the 17.4 is
> the part of the score that comes from term relevancy. Searching for another
> (set of) terms yields a different queryNorm, so I can't see how I can
> a-priori pick a scaling function (I've used log for this example) and boost
> factor that will give control of the final contribution of the view count
> to the score.
>
> 19.14161 = (MATCH) sum of:
>   17.403849 = (MATCH) max plus 0.1 times others of:
>     16.747877 = (MATCH) weight(document:water^4.0 in 1076362), product of:
>       0.22298127 = queryWeight(document:water^4.0), product of:
>         4.0 = boost
>         2.939238 = idf(docFreq=527730, maxDocs=3669552)
>         0.018965907 = queryNorm
>       75.108894 = (MATCH) fieldWeight(document:water in 1076362), product of:
>         25.553865 = tf(termFreq(document:water)=653)
>         2.939238 = idf(docFreq=527730, maxDocs=3669552)
>         1.0 = fieldNorm(field=document, doc=1076362)
> [snip]
>   1.7377597 = (MATCH) FunctionQuery(log(map(int(views),0.0,0.0,1.0))), product of:
>     1.8325089 = log(map(int(views)=68,min=0.0,max=0.0,target=1.0))
>     50.0 = boost
>     0.018965907 = queryNorm
>
> Thanks in advance for your help,
> /Martin
>
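
For anyone following the arithmetic in the explain output above, here is a
rough Python sketch that recomputes the two visible components from the
quoted numbers. It is not Solr code. The helper names (solr_map, term_score,
view_score) are made up for illustration, and it assumes that tf is the
square root of the term frequency and that log is base 10, which is what the
quoted figures are consistent with.

```python
import math

# Values copied from the explain output quoted above.
QUERY_NORM = 0.018965907
TOTAL_SCORE = 19.14161


def solr_map(x, lo, hi, target):
    """Hypothetical stand-in for map(x,min,max,target): values inside
    [lo, hi] become target, everything else passes through unchanged."""
    return target if lo <= x <= hi else x


def term_score(boost, idf, term_freq, field_norm, query_norm=QUERY_NORM):
    """weight(...) = queryWeight * fieldWeight, as in the explain tree."""
    query_weight = boost * idf * query_norm
    tf = math.sqrt(term_freq)  # assumption: tf = sqrt(termFreq)
    field_weight = tf * idf * field_norm
    return query_weight * field_weight


def view_score(views, boost, query_norm=QUERY_NORM):
    """FunctionQuery(log(map(int(views),0,0,1))) * boost * queryNorm."""
    mapped = solr_map(views, 0.0, 0.0, 1.0)  # turns 0 views into 1, avoiding log(0)
    return math.log10(mapped) * boost * query_norm  # assumption: base-10 log


term_part = term_score(boost=4.0, idf=2.939238, term_freq=653, field_norm=1.0)
view_part = view_score(views=68, boost=50.0)
view_share = view_part / TOTAL_SCORE

print(f"term weight      : {term_part:.4f}")
print(f"view count part  : {view_part:.4f}")
print(f"view count share : {view_share:.1%}")
```

With these numbers the view count supplies roughly 9% of the total for this
document and query. Since the term part varies with tf and idf from one query
and document to the next, a fixed boost of 50 does not pin that share to any
particular percentage, which is the difficulty described in the question.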