From: Brian Moseley <bcm@osafoundation.org>
Date: Mon, 21 Nov 2005 10:03:02 -0800
To: jackrabbit-dev@incubator.apache.org
Subject: tuning SearchIndex

while testing my caldav server, i've hit a lot of seemingly arbitrary exceptions that i've tracked down to the jvm running out of file descriptors. after using ulimit to give the jvm 10k fds, i've found that the server seems to reach equilibrium at almost 1200 open fds, an astonishing number. the exact count from the last run is 1178. even more astonishing, 1051 of those open fds are index files:

java 12405 root 40r REG 9,1 22608 22875403 /home/cosmo-demo-roots/prod7/data/repository/workspaces/homedir/index/_0/_2y.cfs
java 12405 root 41r REG 9,1  2856 22875406 /home/cosmo-demo-roots/prod7/data/repository/workspaces/homedir/index/_1/_8.cfs
java 12405 root 42r REG 9,1  2291 22875409 /home/cosmo-demo-roots/prod7/data/repository/workspaces/homedir/index/_2/_8.cfs
java 12405 root 43r REG 9,1   888 22940607 /home/cosmo-demo-roots/prod7/data/repository/workspaces/homedir/index/_3/_1.cfs

etc etc etc, ad nauseam.

the test i'm conducting simulates the initial publication of a moderately sized (500+ event) calendar. it performs well over a thousand PUT requests, each of which adds an nt:file node (plus the caldav:resource mixin type) to the repository to store the uploaded file's contents. before each node is added, the server runs a query against the parent node, looking something like this (where XXXXXX is a bit of metadata about the uploaded file):

/jcr:root/bcm/calendar//element(*, caldav:resource)[@caldav:uid = 'XXXXXX']

publication of calendars with this many events will likely happen infrequently, as individual users are added to the server, although when an instance of the server is first brought online there will be a heavy wave of users publishing their calendars for the first time.
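to make the per-PUT lookup concrete, here's a sketch of how the query string above gets built. the class and method names are mine, not from the server code, and it assumes uids never contain apostrophes (real code would need to escape them for XPath string literals):

```java
// hypothetical helper illustrating the uid-lookup query string built
// before each PUT; names are illustrative, not from the actual server.
public class UidQuery {
    public static String forUid(String calendarPath, String uid) {
        // assumes uid contains no apostrophes, so no XPath escaping is done
        return "/jcr:root" + calendarPath
                + "//element(*, caldav:resource)[@caldav:uid = '" + uid + "']";
    }
}
```

the resulting statement would then be executed through the standard JCR API, i.e. QueryManager.createQuery(stmt, Query.XPATH), once per PUT — so every upload is a write immediately followed by a query.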
i don't know anything about lucene, but after looking at MultiIndex, i wonder if i'm having an issue with the frequency at which the volatile index is persisted and/or the persistent indexes are merged. i'm using the default SearchIndex configuration, that is to say:

does anybody have advice on how to tune the SearchIndex? or am i barking up the wrong tree altogether? are there other subsystems that will be affected by this pattern of rapid writes in large quantities?
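for context, this is the kind of workspace.xml override i'd expect to experiment with — the element/param syntax and the SearchIndex class name are standard jackrabbit config, but the specific parameter names and values here are my guesses at the merge-related knobs MultiIndex appears to read, not something i've verified against the defaults:

```xml
<!-- hypothetical tuning sketch; parameter names/values are assumptions -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- keep more documents in the in-memory (volatile) index
       before flushing a new persistent segment -->
  <param name="minMergeDocs" value="1000"/>
  <!-- merge persistent segments more aggressively,
       reducing the number of open .cfs files -->
  <param name="mergeFactor" value="5"/>
</SearchIndex>
```

if flushing and merging really are the problem, raising the flush threshold and merging more eagerly should cut down the number of segment files the index holds open at once.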