From: Doug Cutting
Date: Mon, 28 Jul 2003 14:26:33 -0700
To: Lucene Users List <lucene-user@jakarta.apache.org>
Subject: Re: Indexing very large sets (10 million docs)

Ryan Clifton wrote:
> You seem to be implying that it is possible to optimize very large
> indexes. My index has a couple million records, but more importantly
> it's about 40 GB in size. I have tried many times to optimize it, and
> this always results in hitting the Linux file size limit. Is there a
> way to get around this? I have the merge factor and max merge docs
> set, but the optimization process seems to ignore those fields.

On Red Hat 8.0 I have built indexes whose total size is 49 GB and whose
largest file (the .prx file) is 28 GB. I haven't yet tried to build
anything larger, so I don't know exactly where the limit is.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
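
For context on the exchange above: in the Lucene API of this era (circa
1.2/1.3), the settings Ryan mentions are public fields on IndexWriter.
They bound how large segments can grow during incremental indexing, but
optimize() merges every segment into a single one regardless, which
would explain why the settings appear to be ignored and why a 40 GB
index can still produce a single file that trips the file-size limit on
kernels or filesystems without large-file support. Below is a minimal
sketch under those assumptions; the index path is a placeholder, and
the exact field names should be checked against the Lucene version in
use.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class LargeIndexSketch {
    public static void main(String[] args) throws Exception {
        // "/path/to/index" is a placeholder; true = create a new index.
        IndexWriter writer =
            new IndexWriter("/path/to/index", new StandardAnalyzer(), true);

        // These public fields shape incremental merging: mergeFactor
        // controls how many segments are merged at once, and
        // maxMergeDocs caps the size of segments those merges produce.
        writer.mergeFactor = 10;
        writer.maxMergeDocs = 1000000;

        Document doc = new Document();
        doc.add(Field.Text("contents", "example text"));
        writer.addDocument(doc);

        // optimize() collapses all segments into one, regardless of
        // maxMergeDocs, so the resulting files can still exceed an
        // OS or filesystem file-size limit.
        writer.optimize();
        writer.close();
    }
}

If a fully optimized index is not strictly required, one workaround is
simply to skip optimize() and search the multi-segment index, trading
some search speed for smaller individual files.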