Date: Thu, 23 Apr 2009 23:08:13 +0200
From: Marcus Herou
To: java-user@lucene.apache.org
Reply-To: java-user@lucene.apache.org
Subject: Re: exponential boosts
Message-ID: <7e536b1f0904231408g77a4416gd9f2d1fb6ff3e091@mail.gmail.com>
In-Reply-To: <7e536b1f0904231407v1083b060obb8a6173eb06227a@mail.gmail.com>
References: <49DFA47C.3010906@stanford.edu> <49DFE0B8.4080208@stanford.edu> <49E23210.5000209@stanford.edu> <7e536b1f0904231338u49b32551rbe9c2f89f4f58a7b@mail.gmail.com> <7e536b1f0904231407v1083b060obb8a6173eb06227a@mail.gmail.com>

But perhaps one could use a FieldCache somehow?

/M

On Thu, Apr 23, 2009 at 11:07 PM, Marcus Herou wrote:

> Yes, I have considered it for 30 minutes :)
>
> How does one apply that in the real world?
>
> If the only thing I get access to is the actual docId, would it not be
> really expensive to fetch the Document itself from the index and then use
> some field in it as a lookup key into some optimized external structure?
>
> Example, pseudocode:
>
> public float customScore(int doc, float subQueryScore, float valSrcScore)
> {
>     Document document = indexSearcher.doc(doc);
>     float score = MyOptimalHashStructure.getScore(document.get("someId"));
>     return score * subQueryScore;
> }
>
> This would not scale well, right? I mean, gathering scores through 100M docs
> would take some time, I guess? Or even 1M docs...
>
> Please push me in the right direction.
>
> Cheers
>
> //Marcus
>
> On Thu, Apr 23, 2009 at 10:58 PM, Doron Cohen wrote:
>
>> > I think we are doing similar things; at least I am trying to implement
>> > document boosting with PageRank. I am having issues with how to apply
>> > the scoring of specific docs without actually reindexing them. I feel
>> > something should be done at query time which looks at external data,
>> > but I do not know how to implement that. Do you?
>>
>> Have you considered CustomScoreQuery in o.a.l.search.function? It should
>> allow incorporating external scores.
>>
>> Doron
>
> --
> Marcus Herou CTO and co-founder Tailsweep AB
> +46702561312
> marcus.herou@tailsweep.com
> http://www.tailsweep.com/
> http://blogg.tailsweep.com/

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/
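
One way the FieldCache idea in this message could play out is to resolve the external scores into a float array indexed by docId once, up front, so that scoring a hit becomes an array read instead of a stored-document fetch. The following is only a minimal sketch under stated assumptions, not Lucene's API and not a tested integration: `ExternalBoosts`, `idByDoc`, `externalScores` and `defaultBoost` are hypothetical names, and the `idByDoc` array is assumed to hold the "someId" value of every document in docId order. In Lucene 2.x such an array could presumably be obtained from something like `FieldCache.DEFAULT.getStrings(reader, "someId")`, but that wiring is left out here.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: resolve external scores to a docId-indexed array once,
 * so per-hit scoring is an array read rather than a document load.
 * All names here are hypothetical; no Lucene classes are used.
 */
final class ExternalBoosts {
    private final float[] boostByDoc;

    /**
     * @param idByDoc        external id of each document, indexed by docId
     *                       (assumed to come from a field cache; null if absent)
     * @param externalScores external id -> score, e.g. a PageRank value
     * @param defaultBoost   boost used for documents without an external score
     */
    ExternalBoosts(String[] idByDoc, Map<String, Float> externalScores, float defaultBoost) {
        boostByDoc = new float[idByDoc.length];
        for (int doc = 0; doc < idByDoc.length; doc++) {
            String id = idByDoc[doc];
            Float score = (id == null) ? null : externalScores.get(id);
            boostByDoc[doc] = (score == null) ? defaultBoost : score;
        }
    }

    /**
     * Same parameter list as the customScore method quoted in the message.
     * valSrcScore is ignored in this sketch.
     */
    float customScore(int doc, float subQueryScore, float valSrcScore) {
        return subQueryScore * boostByDoc[doc];
    }
}
```

The map lookups happen once per document when the array is built, not once per hit per query, which is the cost the quoted pseudocode worries about. The array would have to be rebuilt whenever the index reader is reopened or the external scores change, since docIds are only stable for a given reader.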