From: "Ferdinand Chan"
To: dev@jackrabbit.apache.org
Subject: RE: About Issue JCR-546
Date: Wed, 11 Oct 2006 11:54:18 +0800
In-Reply-To: <9f929f1c0610101031l3d4672a8ob59c8a467b484710@mail.gmail.com>
It seems that the problem is quite serious. Does anyone use Jackrabbit in a production environment and has successfully found a way to work around this problem? I am working on a content management system that requires a lot of content I/O, and a lot of versioning will take place.

-----Original Message-----
From: Miro Walker [mailto:miro.walker@gmail.com]
Sent: Wednesday, October 11, 2006 1:31 AM
To: dev@jackrabbit.apache.org
Subject: Re: About Issue JCR-546

> My best advice for now has been to explicitly synchronize on the
> repository instance whenever you are doing versioning operations. Note
> that you can still do normal read and write operations concurrently
> with versioning, so this isn't as bad as it could be. Perhaps we
> should put that synchronization inside the versioning methods until
> the concurrency issues are solved...

The problem here is that "versioning operations" covers quite a lot. For us the real nasty is cloning nodes between workspaces, as we've used a content model that maps releases to workspaces. Publishing a release therefore involves cloning an entire workspace (which takes a few tens of minutes). During this period no other write operations can take place.

Putting synchronisation code inside the versioning methods would mean that the entire application locks up during this period, while having it outside in our own app means that we can be a bit more flexible with how we handle locking (e.g. use locks that time out with an error rather than allowing the application to be completely locked for 30-60 minutes at a time).

There are a few areas of the code that cause this sort of problem - the other big one is indexing. In order to support a home-brewed failover mechanism for active-passive clustering, we need to delete the search indexes on failover (as they are likely to be corrupt in the event of a failover).

On subsequent startup the application needs to reindex each workspace independently when it is first accessed. This takes a few minutes, again locking users out while it happens.

I don't think there is a "quick fix" other than to go in, spend some time fixing the existing scenarios where deadlock can occur, and do some hardcore testing of the concurrency issues.

Miro
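For what it's worth, the app-level locking Miro describes (a lock around versioning operations that times out with an error rather than blocking writers for the whole clone) can be sketched with plain java.util.concurrent. This is only an illustration of the pattern, not Jackrabbit API; the class and method names here are made up:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch: guard versioning/clone operations with a single
// timed lock so callers fail fast with "repository busy" instead of
// blocking for the duration of a 30-60 minute workspace clone.
public class VersioningGate {

    private final ReentrantLock lock = new ReentrantLock(true); // fair ordering

    /**
     * Runs the given versioning operation if the lock can be acquired
     * within the timeout; returns false (without running it) otherwise.
     */
    public boolean runExclusively(Runnable versioningOp,
                                  long timeout, TimeUnit unit)
            throws InterruptedException {
        if (!lock.tryLock(timeout, unit)) {
            return false; // caller surfaces an error rather than hanging
        }
        try {
            versioningOp.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        VersioningGate gate = new VersioningGate();
        boolean ran = gate.runExclusively(
                () -> System.out.println("cloning workspace..."),
                5, TimeUnit.SECONDS);
        System.out.println("ran=" + ran); // prints ran=true
    }
}
```

The point of keeping this in the application rather than inside Jackrabbit's versioning methods is exactly the flexibility Miro mentions: the timeout, fairness, and failure behaviour stay under the application's control.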