From: "Dominique Pfister (JIRA)"
To: dev@jackrabbit.apache.org
Date: Tue, 29 May 2007 00:49:15 -0700 (PDT)
Subject: [jira] Commented: (JCR-929) Under Heavy load in a Cluster HTTP Threads Block and stall requests
Message-ID: <32384796.1180424955910.JavaMail.jira@brutus>
In-Reply-To: <8561188.1179474076849.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/JCR-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499711 ]

Dominique Pfister commented on JCR-929:
---------------------------------------

Hi Ian,
thank you for your stack traces and all your work!

In your second thread dump, I'd say that the first thread (http-8080-Processor25) still holds the AbstractJournal's RWLock (acquired in SharedItemStateManager$Update.begin), so the VM's state is similar to the first thread dump you provided: one thread holds the AbstractJournal's RWLock and will start an item update (1), while other threads interact with the LockManager and therefore hold its lock (2).

When the item update (1) triggers a synchronization on the journal (because another instance made some changes), it might encounter a lock operation and will try to inform the LockManager about this event. Because the threads in (2) already hold the LockManager, this causes the deadlock.

IMO, to solve this problem, LockManager operations will have to adopt the same pattern that SharedItemStateManager updates already use: lock-and-sync the journal when the operation starts, unlock at the end of it.

Kind regards
Dominique

> Under Heavy load in a Cluster HTTP Threads Block and stall requests
> -------------------------------------------------------------------
>
>                 Key: JCR-929
>                 URL: https://issues.apache.org/jira/browse/JCR-929
>             Project: Jackrabbit
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.3
>         Environment: 2 Node Cluster, OSX, JDK 1.5 with DatabaseJournal, DatabasePersistanceManager, all content in DB, using WebDAV to load
>            Reporter: Ian Boston
>            Assignee: Dominique Pfister
>         Attachments: catalina.out.node1.txt, catalina.out.node2.txt
>
>
> Under heavy load, created by mounting both nodes in the cluster in OSX Finder and then uploading large numbers of files (a few thousand) to each node at the same time, eventually one of the nodes stops responding and the Finder mount times out and disconnects.
> Once that happens that node becomes unusable.
> More mount attempts will prompt for a password, indicating HTTP is still running, but will time out once the connection is authenticated.
> Access by the Web Browser will prompt for a password, connect and provide a once-only listing of any collection in the workspace. If you try to refresh that collection, the HTTP request hangs forever.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
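
The pattern proposed in the comment above (lock-and-sync the journal when an operation starts, unlock at the end of it) can be illustrated with a minimal Java sketch. Every class, field, and method name below is a hypothetical stand-in and not Jackrabbit's actual API; the sketch also assumes the journal lock is taken exclusively. It only shows the idea that item updates and lock operations both acquire the journal lock first and the lock manager's monitor second, so no thread ever waits for the journal while holding the lock manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: none of these names are Jackrabbit classes or methods.
final class JournalFirstLocking {
    // Stand-in for the journal's read/write lock (assumed to be taken exclusively).
    private final ReentrantReadWriteLock journalLock = new ReentrantReadWriteLock();
    // Stand-in for the lock manager's own monitor.
    private final Object lockManagerMonitor = new Object();
    // Lock events recorded by "another cluster node" and not yet applied locally.
    private final AtomicInteger pendingRemoteLockEvents = new AtomicInteger();
    // Step log; only modified while the journal write lock is held.
    private final List<String> trace = new ArrayList<>();

    // Simulates another instance writing a lock event to the shared journal.
    void remoteNodeLocked() {
        pendingRemoteLockEvents.incrementAndGet();
    }

    // Must be called with the journal write lock held: journal first, lock manager second.
    private void syncJournal() {
        int events = pendingRemoteLockEvents.getAndSet(0);
        for (int i = 0; i < events; i++) {
            synchronized (lockManagerMonitor) {
                trace.add("apply-remote-lock");
            }
        }
    }

    // An item update: lock-and-sync the journal at the start, unlock at the end.
    void updateItem(String path) {
        journalLock.writeLock().lock();
        try {
            syncJournal();
            trace.add("update " + path);
        } finally {
            journalLock.writeLock().unlock();
        }
    }

    // A lock operation following the same pattern instead of taking the monitor first.
    void lockNode(String path) {
        journalLock.writeLock().lock();
        try {
            syncJournal();
            synchronized (lockManagerMonitor) {
                trace.add("lock " + path);
            }
        } finally {
            journalLock.writeLock().unlock();
        }
    }

    // Returns a copy of the step log.
    List<String> trace() {
        journalLock.readLock().lock();
        try {
            return new ArrayList<>(trace);
        } finally {
            journalLock.readLock().unlock();
        }
    }

    // Runs an updater and a locker concurrently; true if both finish in time.
    static boolean runDemo() {
        JournalFirstLocking s = new JournalFirstLocking();
        Thread updater = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                s.remoteNodeLocked();
                s.updateItem("/item" + i);
            }
        });
        Thread locker = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                s.lockNode("/node" + i);
            }
        });
        updater.setDaemon(true);
        locker.setDaemon(true);
        updater.start();
        locker.start();
        try {
            updater.join(10_000);
            locker.join(10_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !updater.isAlive() && !locker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runDemo());
    }
}
```

If lockNode instead entered lockManagerMonitor before taking the journal lock, the two threads in runDemo could each hold one lock while waiting for the other, which is the situation the comment describes. The real change in Jackrabbit may look quite different; this only demonstrates the lock ordering.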