From: "Doron Cohen"
To: java-user@lucene.apache.org
Date: Mon, 2 Jun 2008 08:24:08 +0300
Subject: Re: How to add PageRank score with lucene's relevant score in sorting

Hi Jarvis,

> I have a problem that how to "combine" two score to sort the search
> result documents.
> for example I have 10 million pages in lucene index, and i know their
> pagerank scores. i give a query to it, every docs returned have a
> lucene-score, mark it as R (relevant score), and i also have its
> pagerank score, mark it as P, what i need is i want to sort the search
> result base on the value "P+R". You know if i store the pagerank score in
> index and get it every search time, then compute P+R, then sort it, this
> way is too slow. in my system, when the search hits 500000 result, the
> sort may cost about 20s.

Check CustomScoreQuery in
http://lucene.apache.org/java/2_3_2/api/core/org/apache/lucene/search/function/package-summary.html

Probably something like this:
- implement ValueSource on top of the pagerank values,
- create a valueSourceQuery on top of it,
- create a customScoreQuery on top of the original query and the
  valueSourceQuery.
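
To make the shape of those steps concrete, here is a rough, dependency-free sketch in plain Java. It does not use the Lucene classes: PageRankValues, CombinedScorer, SumScorer and Hit are hypothetical stand-ins for the value source, the combining query and one scored document. The sketch assumes the pagerank values fit in an in-memory array indexed by document id, and that only the k best hits are needed, so it keeps a small heap instead of sorting every hit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical stand-in for a value source: one pagerank value per document id. */
final class PageRankValues {
    private final float[] byDocId;
    PageRankValues(float[] byDocId) { this.byDocId = byDocId.clone(); }
    float floatVal(int doc) { return byDocId[doc]; }
}

/** One scored document (hypothetical helper type). */
final class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

/** Hypothetical stand-in for the combining step; multiplies the two scores by default. */
class CombinedScorer {
    private final PageRankValues values;
    CombinedScorer(PageRankValues values) { this.values = values; }

    /** Override this to change how relevance (R) and pagerank (P) are combined. */
    float combine(int doc, float relevance, float pageRank) {
        return relevance * pageRank;
    }

    /** Returns the ids of the k best documents, best first, without sorting all hits. */
    final List<Integer> topK(int[] docs, float[] relevance, int k) {
        // Min-heap on the combined score: the weakest of the current best k sits on top.
        PriorityQueue<Hit> heap = new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));
        for (int i = 0; i < docs.length; i++) {
            float score = combine(docs[i], relevance[i], values.floatVal(docs[i]));
            heap.add(new Hit(docs[i], score));
            if (heap.size() > k) {
                heap.poll(); // drop the weakest hit so the heap never exceeds k entries
            }
        }
        List<Integer> best = new ArrayList<>();
        while (!heap.isEmpty()) {
            best.add(heap.poll().doc); // comes out weakest first
        }
        Collections.reverse(best);     // best first
        return best;
    }
}

/** Override that gives the P+R ordering asked for in the question. */
final class SumScorer extends CombinedScorer {
    SumScorer(PageRankValues values) { super(values); }
    @Override
    float combine(int doc, float relevance, float pageRank) {
        return relevance + pageRank;
    }
}

final class PageRankSortDemo {
    public static void main(String[] args) {
        PageRankValues pageRank = new PageRankValues(new float[] {0.9f, 0.3f, 0.5f, 0.3f});
        int[] docs = {0, 1, 2, 3};
        float[] relevance = {0.2f, 0.95f, 0.7f, 0.1f};
        System.out.println(new CombinedScorer(pageRank).topK(docs, relevance, 2)); // prints [2, 1]
        System.out.println(new SumScorer(pageRank).topK(docs, relevance, 2));      // prints [1, 2]
    }
}
```

With these made-up numbers the product and the sum rank the top two documents in a different order, so it matters which combination you pick.
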
Note that by default, customScoreQuery multiplies the scores, but you can
override this.

Doron