From: "Garrett Heaver"
To: "'Lucene Users List'"
Subject: RE: addIndexes() Question
Date: Thu, 23 Dec 2004 09:52:28 -0000

Hi Ryan,

I too am using
addIndexes(), albeit for slightly different reasons. However, I would recommend calling addIndexes() only for fairly sizable slices, and for all slices at once.

The reason I'm suggesting this is that optimize is called automagically both before and after the addIndexes method, so if you are only adding very small slices you're optimizing the main index more times than necessary.

There is of course the obvious trade-off of "spider --> live index" time being shorter with one method than the other.

The other thing that I found on my machines (I'm spidering on one machine and storing the live index on another) is that network performance isn't so hot when you are continually opening and closing connections on other machines to do the merge (under NT, that is; Linux may be much better :) so it made more sense for me to create larger slices and only open the connection to the live index machine when necessary.

Hope this helps
Garrett

-----Original Message-----
From: Ryan Aslett [mailto:Ryan.Aslett@Qsent.com]
Sent: 22 December 2004 23:45
To: Lucene Users List
Subject: addIndexes() Question

Hi there,

I'm about to embark on a Lucene project of massive scale (between 500 million and 2 billion documents). I am currently working on parallelizing the construction of the index(es).

Rough summary of my plan:
I have many, many physical machines, each with multiple processors, that I wish to dedicate to the construction of a single index.
I plan on having each machine gather its documents from a central synchronized source (network, JMS, whatever).
Within each machine I will have multiple threads, each responsible for constructing an index slice.

When all machines and all threads are finished, I should have a slew of index slices that I want to combine together to create one index.
My question is this: Will it be more efficient to call addIndexes(Directory[] dirs) on all the slices all at once? Or might it be better to continually merge small indexes into a larger index, i.e. once an index slice reaches a particular size, merge it into the main index and start building a new slice...

Any help would be appreciated.

Ryan Aslett

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
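
To put rough numbers on the point in the reply about optimize running before and after each addIndexes() call, here is a minimal Java sketch. It does not call Lucene at all: MergePlan and its methods are hypothetical names, the slice count of 64 is made up for illustration, and the only thing it assumes is the behaviour described in the reply (two optimize passes over the main index per addIndexes() call). It counts passes only and says nothing about how long each pass takes.

```java
// Hypothetical helper, not part of Lucene. It only does the arithmetic
// implied by the reply: each addIndexes() call is assumed to optimize the
// main index once before and once after the merge.
final class MergePlan {
    private MergePlan() {}

    /** Number of addIndexes() calls needed when merging batchSize slices per call. */
    static int addIndexesCalls(int sliceCount, int batchSize) {
        if (sliceCount < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("sliceCount must be >= 0 and batchSize > 0");
        }
        // Ceiling division written so it cannot overflow.
        return sliceCount / batchSize + (sliceCount % batchSize == 0 ? 0 : 1);
    }

    /** Optimize passes over the main index, at two per addIndexes() call (assumed). */
    static int optimizePasses(int sliceCount, int batchSize) {
        return 2 * addIndexesCalls(sliceCount, batchSize);
    }

    public static void main(String[] args) {
        int slices = 64; // illustrative value, not from the thread
        System.out.println("all slices at once: " + optimizePasses(slices, slices));
        System.out.println("8 slices per call:  " + optimizePasses(slices, 8));
        System.out.println("1 slice per call:   " + optimizePasses(slices, 1));
    }
}
```

With 64 slices this gives 2 passes when everything is merged in one call, 16 passes at 8 slices per call, and 128 passes when each slice is merged on its own, which is the cost the reply suggests weighing against getting documents into the live index sooner.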