From: Otis Gospodnetic
To: Lucene Users List
Subject: Re: best mergeFactor for merging 100 Indexes
Date: Fri, 25 Jun 2004 09:30:44 -0700 (PDT)

If this is an option, use the compound index format
(writer.setUseCompoundFile(true)), which packs each segment's files
into a single file and so keeps far fewer files open.

Running ulimit -a in some UNIX shells will tell you the maximum number
of open files allowed. If you can, raise that limit as high as the
system lets you. Of course, how high you can go also depends on your
RAM.

Finally, don't forget that there is a minMergeDocs parameter you can
use to tune things, too.
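To make the open-file arithmetic concrete, here is a rough sketch in Java. It is a simplified model, not Lucene's actual bookkeeping: it assumes that a non-compound segment consists of about seven fixed files plus one norms file per indexed field, that a compound segment is a single file, and that a merge holds open every file of up to mergeFactor source segments at once. The class and method names are made up for illustration.

```java
// Back-of-the-envelope estimate of file handles needed while merging.
// All constants are assumptions about the on-disk layout, not values
// read from Lucene itself.
class OpenFileEstimate {

    // Assumed fixed files per non-compound segment
    // (field infos, stored fields x2, term dictionary x2, freq, prox).
    static final int FIXED_FILES_PER_SEGMENT = 7;

    // Files that make up one segment: one file if compound, otherwise
    // the fixed files plus one norms file per indexed field.
    static long filesPerSegment(int indexedFields, boolean compound) {
        return compound ? 1 : FIXED_FILES_PER_SEGMENT + indexedFields;
    }

    // Source files open during one merge step, assuming at most
    // mergeFactor segments are merged at a time.
    static long openFilesDuringMerge(int totalSegments, int mergeFactor,
                                     int indexedFields, boolean compound) {
        int segmentsPerMerge = Math.min(totalSegments, mergeFactor);
        return segmentsPerMerge * filesPerSegment(indexedFields, compound);
    }

    public static void main(String[] args) {
        int segments = 100;      // assumes one segment per source index
        int indexedFields = 2;
        for (int mergeFactor : new int[] {10, 4000}) {
            System.out.println("mergeFactor=" + mergeFactor
                + " non-compound=" + openFilesDuringMerge(segments, mergeFactor, indexedFields, false)
                + " compound=" + openFilesDuringMerge(segments, mergeFactor, indexedFields, true));
        }
    }
}
```

Under these assumptions, 100 single-segment indexes with 2 indexed fields need about 900 source files open at mergeFactor 4000 in the non-compound format, versus about 100 with compound files. Compare the result with the open-files limit that ulimit reports.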
Otis

--- Harald Kirsch wrote:
> Hi,
>
> after an hour of indexing on a cluster I got 100 indexes, ca. 25MB
> each, 2 indexed fields. I now intend to run code roughly like
>
>   IndexWriter writer = new IndexWriter(destDir, ...);
>   writer.addIndexes(my100IndexDirs);
>   writer.close();
>
> When I did this a year ago, I had tough problems getting around
> memory limitations and open file limits at the same time. In the end
> it worked with writer.mergeFactor==4000, but I think it was a
> specially tweaked kernel on Linux which I don't have anymore.
>
> Since I don't really understand yet how open files, segments, memory
> use, indexing time and mergeFactor interact, I would appreciate a
> good guess at how to combine these indexes.
>
> Which mergeFactor to use?
> Use a different strategy than the 3 lines shown above?
>
> Thanks,
> Harald.
>
> --
> Harald Kirsch | kirsch@ebi.ac.uk | +44 (0) 1223/49-2593

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org