Message-ID: <16059162.1180717518734.JavaMail.jira@brutus>
Date: Fri, 1 Jun 2007 10:05:18 -0700 (PDT)
From: "Xiaohua Lu (JIRA)"
To: dev@jackrabbit.apache.org
Reply-To: dev@jackrabbit.apache.org
Subject: [jira] Commented: (JCR-929) Under Heavy load in a Cluster HTTP Threads Block and stall requests
In-Reply-To: <8561188.1179474076849.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/JCR-929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500761 ]

Xiaohua Lu commented on JCR-929:
--------------------------------

I had a similar problem, but the stack trace is slightly different.

The setup is a 4-node cluster. Under heavy load (mainly updates), all nodes hang. On the database side, three update transactions are waiting for a select lock. The select lock seems to be blocked by one of the threads below.

thread 1

Thread 25141: (state = BLOCKED)
 - java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
 - java.lang.Object.wait() @bci=2, line=474 (Compiled frame)
 - org.apache.jackrabbit.core.journal.AbstractJournal.sync() @bci=9, line=160 (Compiled frame)
 - org.apache.jackrabbit.core.cluster.ClusterNode.sync() @bci=27, line=283 (Interpreted frame)
 - org.apache.jackrabbit.core.cluster.ClusterNode.run() @bci=38, line=254 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=595 (Interpreted frame)

thread 2

Thread 25137: (state = BLOCKED)
 - org.apache.commons.collections.map.AbstractHashedMap.get(java.lang.Object) @bci=62, line=182 (Compiled frame; information may be imprecise)
 - org.apache.jackrabbit.core.state.NodeState.getReorderedChildNodeEntries() @bci=57, line=671 (Compiled frame)
 - org.apache.jackrabbit.core.CachingHierarchyManager.nodesReplaced(org.apache.jackrabbit.core.state.NodeState) @bci=1, line=385 (Interpreted frame)
 - org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyNodesReplaced(org.apache.jackrabbit.core.state.NodeState) @bci=29, line=132 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SessionItemStateManager.nodesReplaced(org.apache.jackrabbit.core.state.NodeState) @bci=29, line=874 (Interpreted frame)
 - org.apache.jackrabbit.core.state.NodeState.notifyNodesReplaced() @bci=12, line=793 (Interpreted frame)
 - org.apache.jackrabbit.core.state.NodeState.setChildNodeEntries(java.util.List) @bci=73, line=473 (Interpreted frame)
 - org.apache.jackrabbit.core.state.NodeStateMerger.merge(org.apache.jackrabbit.core.state.NodeState, org.apache.jackrabbit.core.state.NodeStateMerger$MergeContext) @bci=291, line=139 (Compiled frame)
 - org.apache.jackrabbit.core.state.SessionItemStateManager.stateModified(org.apache.jackrabbit.core.state.ItemState) @bci=58, line=802 (Interpreted frame)
 - org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyStateModified(org.apache.jackrabbit.core.state.ItemState) @bci=29, line=85 (Interpreted frame)
 - org.apache.jackrabbit.core.state.LocalItemStateManager.stateModified(org.apache.jackrabbit.core.state.ItemState) @bci=49, line=427 (Interpreted frame)
 - org.apache.jackrabbit.core.state.StateChangeDispatcher.notifyStateModified(org.apache.jackrabbit.core.state.ItemState) @bci=29, line=85 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager.stateModified(org.apache.jackrabbit.core.state.ItemState) @bci=5, line=390 (Interpreted frame)
 - org.apache.jackrabbit.core.state.ItemState.notifyStateUpdated() @bci=12, line=241 (Interpreted frame)
 - org.apache.jackrabbit.core.state.ChangeLog.persisted() @bci=30, line=271 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager.doExternalUpdate(org.apache.jackrabbit.core.state.ChangeLog) @bci=264, line=945 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager.externalUpdate(org.apache.jackrabbit.core.state.ChangeLog, org.apache.jackrabbit.core.observation.EventStateCollection) @bci=10, line=871 (Interpreted frame)
 - org.apache.jackrabbit.core.RepositoryImpl$WorkspaceInfo.externalUpdate(org.apache.jackrabbit.core.state.ChangeLog, java.util.List) @bci=25, line=1957 (Interpreted frame)
 - org.apache.jackrabbit.core.cluster.ClusterNode.end() @bci=182, line=834 (Interpreted frame)
 - org.apache.jackrabbit.core.cluster.ClusterNode.consume(org.apache.jackrabbit.core.journal.Record) @bci=469, line=929 (Compiled frame)
 - org.apache.jackrabbit.core.journal.AbstractJournal.doSync(long) @bci=108, line=191 (Compiled frame)
 - org.apache.jackrabbit.core.journal.AbstractJournal.lockAndSync() @bci=42, line=241 (Interpreted frame)
 - org.apache.jackrabbit.core.journal.DefaultRecordProducer.append() @bci=6, line=51 (Interpreted frame)
 - org.apache.jackrabbit.core.cluster.ClusterNode$WorkspaceUpdateChannel.updateCreated(org.apache.jackrabbit.core.cluster.Update) @bci=36, line=466 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager$Update.begin() @bci=44, line=530 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager.beginUpdate(org.apache.jackrabbit.core.state.ChangeLog, org.apache.jackrabbit.core.observation.EventStateCollectionFactory, org.apache.jackrabbit.core.virtual.VirtualItemStateProvider) @bci=15, line=825 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SharedItemStateManager.update(org.apache.jackrabbit.core.state.ChangeLog, org.apache.jackrabbit.core.observation.EventStateCollectionFactory) @bci=4, line=855 (Interpreted frame)
 - org.apache.jackrabbit.core.state.LocalItemStateManager.update(org.apache.jackrabbit.core.state.ChangeLog) @bci=9, line=326 (Interpreted frame)
 - org.apache.jackrabbit.core.state.XAItemStateManager.update(org.apache.jackrabbit.core.state.ChangeLog) @bci=20, line=313 (Interpreted frame)
 - org.apache.jackrabbit.core.state.LocalItemStateManager.update() @bci=22, line=302 (Interpreted frame)
 - org.apache.jackrabbit.core.state.SessionItemStateManager.update() @bci=4, line=306 (Interpreted frame)
 - org.apache.jackrabbit.core.ItemImpl.save() @bci=594, line=1214 (Interpreted frame)
 - net.maven.mcr.event.AssetCompleteEventListener.markAssetComplete(javax.jcr.Node, boolean) @bci=137, line=185 (Interpreted frame)
 - net.maven.mcr.event.AssetCompleteEventListener.handleAssetCompleteCheck(java.lang.String) @bci=241, line=169 (Interpreted frame)
 - net.maven.mcr.event.AssetCompleteEventListener.onEvent(javax.jcr.observation.EventIterator) @bci=112, line=82 (Interpreted frame)
 - org.apache.jackrabbit.core.observation.EventConsumer.consumeEvents(org.apache.jackrabbit.core.observation.EventStateCollection) @bci=165, line=231 (Compiled frame)
 - org.apache.jackrabbit.core.observation.ObservationDispatcher.run() @bci=104, line=145 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=595 (Interpreted frame)

Because Thread 2 is blocked on a JVM lock, it is still holding the select lock acquired in doSync.getRecords. That explains the deadlock at the database level.

I am not sure whether these two problems are exactly the same; if not, I can file a separate bug.

Thanks.

> Under Heavy load in a Cluster HTTP Threads Block and stall requests
> -------------------------------------------------------------------
>
>                 Key: JCR-929
>                 URL: https://issues.apache.org/jira/browse/JCR-929
>             Project: Jackrabbit
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.3
>         Environment: 2 Node Cluster, OSX, JDK 1.5 with DatabaseJournal, DatabasePersistanceManager, all content in DB, using WebDAV to load
>            Reporter: Ian Boston
>            Assignee: Dominique Pfister
>         Attachments: catalina.out.node1.txt, catalina.out.node2.txt
>
>
> Under heavy load, created by mounting both nodes of the cluster in OSX Finder and then uploading large numbers of files (a few thousand) to each node at the same time, one of the nodes eventually stops responding, and the Finder mount times out and disconnects.
> Once that happens, that node becomes unusable.
> Further mount attempts prompt for a password, indicating that HTTP is still running, but time out once the connection is authenticated.
> Access from a web browser prompts for a password, connects, and provides a one-time listing of any collection in the workspace. If you try to refresh that collection, the HTTP request hangs forever.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
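
For readers following the comment above, here is a minimal sketch of the pattern it describes: a thread that blocks on a JVM lock while still holding a second lock, so that anyone else waiting for that second lock stalls. This is not Jackrabbit code. The class name LockStallSketch and the two ReentrantLock objects (dbLock standing in for the database select lock, jvmLock standing in for the JVM lock that Thread 2 is blocked on) are assumptions made purely for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockStallSketch {

    // Returns { acquiredWhileWorkerBlocked, acquiredAfterRelease }.
    static boolean[] demo() {
        // Hypothetical stand-ins, not Jackrabbit objects:
        // dbLock models the database select lock, jvmLock models the JVM lock.
        ReentrantLock dbLock = new ReentrantLock();
        ReentrantLock jvmLock = new ReentrantLock();
        CountDownLatch workerHoldsDbLock = new CountDownLatch(1);

        try {
            jvmLock.lock(); // some other party (here: the caller) owns the JVM lock

            Thread worker = new Thread(() -> {
                dbLock.lock(); // models taking the select lock during a journal sync
                try {
                    workerHoldsDbLock.countDown();
                    jvmLock.lock(); // blocks here while still holding dbLock
                    jvmLock.unlock();
                } finally {
                    dbLock.unlock();
                }
            });
            worker.start();
            workerHoldsDbLock.await();

            // A competing "update transaction" cannot obtain dbLock while the worker is stuck.
            boolean whileBlocked = dbLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (whileBlocked) {
                dbLock.unlock();
            }

            jvmLock.unlock(); // let the worker proceed and release dbLock
            worker.join();

            boolean afterRelease = dbLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (afterRelease) {
                dbLock.unlock();
            }
            return new boolean[] { whileBlocked, afterRelease };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the sketch", e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("update got lock while worker was blocked: " + result[0]);
        System.out.println("update got lock after worker was released: " + result[1]);
    }
}
```

In the sketch the stall ends only because the caller eventually releases jvmLock. If that release never happens, every waiter on dbLock hangs indefinitely, which would match the three update transactions seen waiting on the database side.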