Date: Fri, 18 Feb 2005 11:01:44 -0500
From: Chris D
To: lucene-user@jakarta.apache.org
Subject: Scalability of Lucene indexes
Message-ID: <34cc3b0a05021808012afba734@mail.gmail.com>

Hi all,

I have a question about scaling Lucene across a cluster, and about good ways of breaking up the work.

We have a very large index, and searches sometimes take more time than they're allowed. What we have been doing is this: during indexing we index into 256 separate indexes (depending on the md5sum), then distribute the indexes to the search machines. So a machine that holds 128 indexes has to do 128 searches for each query. I gave ParallelMultiSearcher a try, and it was significantly slower than simply iterating through the indexes one at a time.

Our new plan is to somehow have only one index per search machine and a larger main index stored on the master.

What I'm interested to know is which of these would be better:

1. Keeping one extremely large index on the master, then splitting it into several smaller indexes (if this is possible), one per search machine.
2. Keeping several smaller indexes and merging them into one index on each search machine.

I would also be interested to know how others have divided up search work across a cluster.

Thanks,
Chris
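
For readers who want the md5-based layout described above in concrete terms, here is a minimal sketch. It is an illustration under assumptions, not necessarily the code in use: the message does not say what is hashed or how the digest maps to an index, so the sketch assumes a string document key and uses the first digest byte as the index number (0 to 255). The class name `ShardRouter`, the methods `shardFor` and `machineFor`, and the round-robin assignment of indexes to machines are all hypothetical. Only the JDK is used; no Lucene API is involved.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: routes a document to one of 256 indexes by its MD5 digest.
class ShardRouter {
    static final int NUM_SHARDS = 256;

    // Assumption: the document is identified by a string key, and the first
    // byte of its MD5 digest selects the index. One byte has exactly 256 values.
    static int shardFor(String docKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(docKey.getBytes(StandardCharsets.UTF_8));
            return digest[0] & 0xFF; // mask to get an unsigned value in [0, 255]
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Assumption: indexes are spread over search machines round-robin.
    // With 2 machines, each one ends up holding 128 of the 256 indexes.
    static int machineFor(int shard, int numMachines) {
        return shard % numMachines;
    }

    public static void main(String[] args) {
        String key = "doc-42"; // hypothetical document key
        int shard = shardFor(key);
        System.out.println(key + " -> index " + shard
                + " on machine " + machineFor(shard, 2));
    }
}
```

Because the mapping is deterministic, the same key always lands in the same index, so indexing and any later update or delete agree on where a document lives. A query, however, still has to visit every index, since a matching document could be in any of them.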