From: Apache Wiki
To: jackrabbit-commits@incubator.apache.org
Date: Wed, 29 Mar 2006 21:56:10 -0000
Message-ID: <20060329215610.29936.12541@ajax.apache.org>
Subject: [Jackrabbit Wiki] Update of "CommentsAboutPerformance" by RoyFielding

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Jackrabbit
Wiki" for change notification.

The following page has been changed by RoyFielding:
http://wiki.apache.org/jackrabbit/CommentsAboutPerformance

The comment on the change is:
format and wording

------------------------------------------------------------------------------
- jackrabbit works fast and was tested with several millions of items of real-life data and depending on the persistence manager used little to no performance degradation was noticed. see http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/3977.
+ == Experience Reports ==

- Bellow there are a few comments about some issues that might affect the performance.
+ Apache Jackrabbit is fast. It was tested with several million items of real-life data and, depending on the persistence manager, little to no performance degradation was noticed. See [http://article.gmane.org/gmane.comp.apache.jackrabbit.devel/3977 email].

- Regarding the tree structure. Since each parent holds references to its children each time you add a child the parent becomes heavier. It causes a degradation in performance for write operations according to the number of children. I think it's better to use a deep hierarchy rather than a flat structure. I would recommend you to do some testing to establish the limits that suits your needs.
+ == Some issues that might affect performance ==

- Regarding node references. The problem described above also affects node references, i.e. adding a reference to a highly referenced node will be slower each time. IMHO this problem prevents a very common use case, i.e. tagging.
+ === Tree Structure ===

- Regarding the session handling. A transient item storage is bound to each session. The transient storage contains its own cache of nodes that are connected to the underlying persistent storage. The thing is that each time a node is modified, all the cached transient nodes are notified. Therefore the more open sessions you have the more expensive the write operation will be. I think you should try each session to perform write operations on nodes which are not under heavy load from other sessions. e.g. I think it's good practice to avoid write operations in the root node if the repository is to be accessed by a high number of sessions. I also think that it's a good practice to share a single anonymous session for read only access if possible, it would reduce the time that write actions will take.
+ Since each parent holds references to its children, the parent becomes heavier each time you add a child. Write performance therefore degrades as the number of children grows. I think it's better to use a deep hierarchy rather than a flat structure. I would recommend doing some testing to establish the limits that suit your needs.

- Regarding concurrency. Currently jackrabbit lacks fine grained locking for write operations. So, if the repository will be under heavy load I would consider an approach like the one used in Magnolia, I'm not sure if they still use it but the last time I checked they had a repository for authoring and another for publishing.
+ === Node References ===

+ The problem described above also affects node references: adding a reference to a highly referenced node becomes slower each time. IMHO this problem prevents a very common use case, namely tagging.
+
+ === Session Handling ===
+
+ A transient item storage is bound to each session. The transient storage contains its own cache of nodes that are connected to the underlying persistent storage. Each time a node is modified, all the cached transient nodes are notified. Therefore, the more open sessions you have, the more expensive a write operation will be. I think each session should try to perform write operations on nodes that are not under heavy load from other sessions. For example, I think it's good practice to avoid write operations on the root node if the repository is to be accessed by a large number of sessions. I also think it's good practice to share a single anonymous session for read-only access if possible, since that would reduce the time that write operations take.
+
+ === Concurrency ===
+
+ Currently Jackrabbit lacks fine-grained locking for write operations. So, if the repository will be under heavy load, I would consider an approach like the one used in Magnolia. I'm not sure if they still use it, but the last time I checked they had one repository for authoring and another for publishing.
+
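
The Tree Structure advice above (prefer a deep hierarchy over a flat one) could be sketched roughly as follows. This is only an illustration under stated assumptions: the page does not prescribe any particular scheme, and hash-based bucketing is just one possible way to keep the number of children per parent small. The class `BucketedPath`, the method `pathFor` and the root `/content` are hypothetical names and are not part of the Jackrabbit or JCR API. The helper only computes a path string; creating the intermediate nodes in a repository is left out.

```java
import java.util.Locale;

// Hypothetical helper, not part of Jackrabbit: spreads items over
// intermediate "bucket" nodes so that no single parent has to hold
// references to a very large number of children.
final class BucketedPath {
    private BucketedPath() {}

    /**
     * Builds a path such as root/xx/yy/name, where each two-character
     * bucket segment is taken from the hex form of name.hashCode().
     * With two hex digits per level, each bucket level has at most
     * 256 distinct children. levels must be between 1 and 4.
     */
    static String pathFor(String root, String name, int levels) {
        if (levels < 1 || levels > 4) {
            throw new IllegalArgumentException("levels must be in 1..4");
        }
        // Eight zero-padded hex digits, so four levels always fit.
        String hex = String.format(Locale.ROOT, "%08x", name.hashCode());
        StringBuilder path = new StringBuilder(root);
        for (int i = 0; i < levels; i++) {
            path.append('/').append(hex, 2 * i, 2 * i + 2);
        }
        return path.append('/').append(name).toString();
    }

    public static void main(String[] args) {
        // Prints the bucketed path for one example item name.
        System.out.println(pathFor("/content", "doc-1", 2));
    }
}
```

The same name always maps to the same path, so an item can be found again without a search. Whether two levels are enough depends on the data, so the recommendation to test against your own limits still applies.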