From: mark harwood
Subject: Re: Scaling
To: java-user@lucene.apache.org
Date: Fri, 18 Jul 2008 16:51:48 +0000 (GMT)
Message-ID: <899463.34200.qm@web26007.mail.ukl.yahoo.com>

>>I have no clue how large the impact could be

I did some benchmarking of a scoring scheme based on local idf vs. one with visibility of a global idf.
Using randomized allocation of documents to shards and sufficient volumes of content in each index, the local idf policy produced identical top results to the global idf policy for the vast majority of searches.

Cheers
Mark



----- Original Message ----
From: Karl Wettin
To: java-user@lucene.apache.org
Sent: Friday, 18 July, 2008 2:33:29 PM
Subject: Re: Scaling


On 18 Jul 2008, at 09:49, Eric Bowman wrote:

> One thing I have trouble understanding is how scoring works in this
> case.  Does Lucene really "just work", or are there special things
> we have to do to make sure that the scores are coherent so we can
> actually decide which was the best match?  What kind of constraints
> are there when breaking up the index into parts to make sure scoring
> remains coherent?


AFAIK the score would suffer from splitting up the index, as tf/idf
then only represents a part of the index, i.e. two identical documents
in two indices would end up with different scores because the index
metadata is different. I have no clue how large the impact could be, nor
whether there are good and bad ways to split an index.

One solution I can think of is to share the complete index across all nodes
but restrict the results from each node to a subset of the index using
a filter.
This should produce the right score but will probably be a
bit slower than splitting the index.

Perhaps it would be possible to split the index for searching but use
an alternative source for scoring.


          karl
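
To make the local-vs-global idf comparison discussed above concrete, here is a minimal sketch in Python. It is not the benchmark Mark ran and it does not call Lucene. The function names (`idf`, `doc_freqs`, `score`, `shard_randomly`, `top_k_global`, `top_k_local`) are hypothetical, the idf formula is assumed to be the classic `1 + ln(numDocs / (docFreq + 1))`, and the score is a simplified `sqrt(tf) * idf^2` sum that ignores length norms, coord and query normalization. The corpus in the demo is synthetic and uniformly random, so the printed agreement rate says nothing about real collections.

```python
import math
import random
from collections import Counter


def idf(doc_freq, num_docs):
    # Assumed classic formula: 1 + ln(numDocs / (docFreq + 1)).
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def doc_freqs(docs):
    # Number of documents containing each term, for one index or shard.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    return df


def score(query, doc, df, num_docs):
    # Simplified tf-idf score: sum of sqrt(tf) * idf^2 over matching terms.
    tf = Counter(doc)
    return sum(math.sqrt(tf[t]) * idf(df[t], num_docs) ** 2
               for t in query if tf[t])


def shard_randomly(docs, num_shards, seed=0):
    # Randomized allocation of documents to shards; keeps global doc ids.
    rng = random.Random(seed)
    shards = [[] for _ in range(num_shards)]
    for doc_id, doc in enumerate(docs):
        shards[rng.randrange(num_shards)].append((doc_id, doc))
    return shards


def top_k_global(query, docs, k):
    # Score every document with statistics from the whole collection.
    df = doc_freqs(docs)
    scored = [(score(query, d, df, len(docs)), i) for i, d in enumerate(docs)]
    scored = [(s, i) for s, i in scored if s > 0]
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [i for _, i in scored[:k]]


def top_k_local(query, shards, k):
    # Score each document with its own shard's statistics, then merge.
    merged = []
    for shard in shards:
        shard_docs = [d for _, d in shard]
        df = doc_freqs(shard_docs)
        for doc_id, doc in shard:
            s = score(query, doc, df, len(shard_docs))
            if s > 0:
                merged.append((s, doc_id))
    merged.sort(key=lambda p: (-p[0], p[1]))
    return [i for _, i in merged[:k]]


if __name__ == "__main__":
    rng = random.Random(42)
    vocab = [f"w{i}" for i in range(200)]
    docs = [[rng.choice(vocab) for _ in range(20)] for _ in range(1000)]
    shards = shard_randomly(docs, num_shards=4, seed=1)
    queries = [[rng.choice(vocab), rng.choice(vocab)] for _ in range(20)]
    same = sum(top_k_global(q, docs, 1) == top_k_local(q, shards, 1)
               for q in queries)
    print(f"identical top hit for {same}/{len(queries)} queries")
```

Karl's point about identical documents scoring differently shows up directly: `score(["lucene"], ["lucene"], df, 2)` differs depending on whether the other document in that two-document shard also contains "lucene". Random allocation matters because it keeps each shard's document frequencies roughly proportional to the global ones; sharding by topic or by date would be expected to skew them more.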