Message-ID: <8398719.1182358466073.JavaMail.jira@brutus>
Date: Wed, 20 Jun 2007 09:54:26 -0700 (PDT)
From: "Christoph Kiehl (JIRA)"
To: dev@jackrabbit.apache.org
Reply-To: dev@jackrabbit.apache.org
Subject: [jira] Commented: (JCR-974) Manage Lucene FieldCaches per index segment
In-Reply-To: <2696063.1182268946626.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/JCR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506599 ]

Christoph Kiehl commented on JCR-974:
-------------------------------------

I tried building a test case, but you
need a fairly large index to really see the benefits of my patch. In our production environment the workspace index is 500MB in size and the jcr:system index is about 1200MB (and both are of course still growing). With indexes that large, the operating system's file system cache has less effect than it does in small test cases. In my small test case, performance with my patch was a bit worse for repeated queries on an unchanged repository.

I think we should provide a little tool that takes the Wikipedia content and puts it all into a test repository, which could then be used for such test cases. What do you think?

> Manage Lucene FieldCaches per index segment
> -------------------------------------------
>
>                 Key: JCR-974
>                 URL: https://issues.apache.org/jira/browse/JCR-974
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3
>            Reporter: Christoph Kiehl
>         Attachments: ItemStateManagerBasedSortComparator.patch, patch.txt, patch2.txt
>
>
> Jackrabbit uses an IndexSearcher which searches on a single IndexReader, most likely an instance of CachingMultiReader. On every search that does sorting or range queries, a FieldCache is populated and associated with this instance of CachingMultiReader. Successive queries which operate on this CachingMultiReader get a tremendous speedup if they can reuse those associated FieldCache instances.
> The problem is that Jackrabbit creates a new CachingMultiReader _every time_ one of the underlying indexes is modified. This means that if you change just _one_ item in the repository, you will need to rebuild all those FieldCaches, because the existing FieldCaches are associated with the old instance of CachingMultiReader.
> This not only leads to slow search response times for queries which contain range queries or are sorted by a field, but also leads to massive memory consumption (depending on the size of your indexes), because there might be multiple instances of CachingMultiReader in use if a lot of queries and item modifications are executed concurrently.
> The goal is to keep those FieldCaches as long as possible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
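
To make the problem described in the issue concrete, here is a minimal sketch in plain Java. It is not the attached patch and uses no Lucene or Jackrabbit classes: Segment, CompositeReader, the two caches, and the valuesLoaded counter are all hypothetical stand-ins, and the cached "field values" are just string arrays. It only illustrates why a cache keyed by the composite reader instance has to be rebuilt after every change, while a cache keyed by segment only has to load the segments that are new.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class SegmentFieldCacheSketch {

    /** Hypothetical stand-in for the reader of one index segment. */
    static final class Segment {
        final String[] values;
        Segment(String... values) { this.values = values; }
    }

    /** Hypothetical stand-in for a multi-reader spanning several segments. */
    static final class CompositeReader {
        final List<Segment> segments;
        CompositeReader(List<Segment> segments) { this.segments = new ArrayList<>(segments); }
    }

    /** Number of field values read from segments so far (the "expensive" work). */
    static int valuesLoaded = 0;

    /** Simulates reading all field values of one segment from the index. */
    static String[] load(Segment s) {
        valuesLoaded += s.values.length;
        return s.values.clone();
    }

    // Keyed by composite reader identity: useless once a new composite is created.
    // (A real cache would probably want weak keys so old readers can be collected.)
    static final Map<CompositeReader, String[]> perComposite = new IdentityHashMap<>();

    // Keyed by segment identity: entries survive as long as the segment is reused.
    static final Map<Segment, String[]> perSegment = new IdentityHashMap<>();

    /** Returns all values, caching one combined array per composite reader. */
    static String[] valuesPerComposite(CompositeReader reader) {
        return perComposite.computeIfAbsent(reader, r -> {
            List<String> all = new ArrayList<>();
            for (Segment s : r.segments) {
                all.addAll(Arrays.asList(load(s)));
            }
            return all.toArray(new String[0]);
        });
    }

    /** Returns all values, caching one array per segment and combining on demand. */
    static String[] valuesPerSegment(CompositeReader reader) {
        List<String> all = new ArrayList<>();
        for (Segment s : reader.segments) {
            all.addAll(Arrays.asList(perSegment.computeIfAbsent(s, SegmentFieldCacheSketch::load)));
        }
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Segment a = new Segment("x", "y", "z");
        Segment b = new Segment("p", "q");
        CompositeReader before = new CompositeReader(Arrays.asList(a, b));
        // One item is added: a small new segment appears and a new composite is built.
        CompositeReader after = new CompositeReader(Arrays.asList(a, b, new Segment("n")));

        valuesLoaded = 0;
        valuesPerComposite(before);
        valuesPerComposite(after);
        System.out.println("per composite: " + valuesLoaded + " values loaded");

        valuesLoaded = 0;
        valuesPerSegment(before);
        valuesPerSegment(after);
        System.out.println("per segment:   " + valuesLoaded + " values loaded");
    }
}
```

In this toy setup the per-composite cache reloads the unchanged segments a and b for the second reader, while the per-segment cache only loads the one new segment. The sketch says nothing about the cost of combining per-segment results at query time, which may be why a small test case on an unchanged repository can come out slightly slower.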