From: Marcel Reutegger
Date: Fri, 18 Aug 2006 09:57:52 +0200
To: users@jackrabbit.apache.org
Subject: Re: Optimizing the index
Message-ID: <44E57300.7030408@gmx.net>
In-Reply-To: <5860763.post@talk.nabble.com>

The index in Jackrabbit optimizes itself based on configuration
properties. There is currently no method you can call to optimize the
index manually. If you think this would be a useful enhancement, please
create a Jira issue:

http://issues.apache.org/jira/browse/JCR

By default, index folders with 100 nodes are created initially. When
there are 10 index folders, they are merged and optimized into a single
index folder with approximately 1000 (= 10 * 100) nodes. Similarly, when
there are 10 index folders with 1000 nodes each, those are merged and
optimized into a single one. The resulting index folder will then have
approximately 10000 nodes.

The maximum number of nodes merged and optimized into a single index is
controlled by the 'maxMergeDocs' parameter. The default value is
100,000. For your estimated number of documents this is too low. You
should increase this value to at least 1,000,000.

For a quick intro to the inner workings of the query engine see:
http://jackrabbit.apache.org/doc/arch/operate/query.html

See also the 'SearchIndex' section in:
http://svn.apache.org/repos/asf/jackrabbit/trunk/jackrabbit/src/main/config/repository.xml

regards
 marcel

sowmi wrote:
> How do I optimize the index that I am creating? Are there hooks in
> JackRabbit to let me trigger an optimization? Right now, I have an
> index of around 120 MB size with 400k documents in it. My
> ${repository}/workspaces/default/index directory has around 19
> folders. Can I optimize this somehow, as my total document size when
> I am done will be around 10 million. Please advise.
>
> sowmi
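
The merge cascade described in the reply (folders of 100 nodes, merged ten at a time, capped by 'maxMergeDocs') can be illustrated with a small simulation. This is only a simplified model and not Jackrabbit's actual code: the class name, the method, and the parameter names minMergeDocs and mergeFactor are assumptions made for illustration (the reply names only 'maxMergeDocs'), and the model ignores everything else a real index does, so the folder counts it produces will not match a real index directory exactly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the merge cascade described in the reply.
// It is NOT Jackrabbit code; it only tracks how many nodes each folder holds.
class MergeCascadeSketch {

    // Returns the sizes of the index folders left after adding totalDocs nodes.
    // minMergeDocs: nodes per freshly created folder (100 in the reply)
    // mergeFactor:  number of equally sized folders merged at once (10 in the reply)
    // maxMergeDocs: upper bound on the size of a merged folder
    static List<Integer> simulate(int totalDocs, int minMergeDocs,
                                  int mergeFactor, int maxMergeDocs) {
        List<Integer> folders = new ArrayList<>();
        // Nodes beyond the last full batch are treated as still buffered and ignored.
        for (int flushed = minMergeDocs; flushed <= totalDocs; flushed += minMergeDocs) {
            folders.add(minMergeDocs);
            boolean merged = true;
            while (merged) {
                merged = false;
                int n = folders.size();
                if (n < mergeFactor) {
                    break;
                }
                // Check whether the newest mergeFactor folders all have the same size.
                int size = folders.get(n - 1);
                boolean sameSize = true;
                long sum = 0;
                for (int i = n - mergeFactor; i < n; i++) {
                    if (folders.get(i) != size) {
                        sameSize = false;
                    }
                    sum += folders.get(i);
                }
                // Merge only if the result stays within maxMergeDocs.
                if (sameSize && sum <= maxMergeDocs) {
                    for (int i = 0; i < mergeFactor; i++) {
                        folders.remove(folders.size() - 1);
                    }
                    folders.add((int) sum);
                    merged = true; // a merge may enable a merge on the next tier
                }
            }
        }
        return folders;
    }

    public static void main(String[] args) {
        System.out.println(simulate(10_000_000, 100, 10, 100_000).size());
        System.out.println(simulate(10_000_000, 100, 10, 1_000_000).size());
    }
}
```

In this model, 10 million nodes end up in 100 folders with a cap of 100,000 and in 10 folders with a cap of 1,000,000, which is the motivation for raising 'maxMergeDocs' for a repository of that size.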