jackrabbit-dev mailing list archives

From "Jukka Zitting (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3738) CLONE - Deadlock on LOCAL_REVISION table in clustering environment
Date Fri, 28 Feb 2014 21:47:22 GMT

    [ https://issues.apache.org/jira/browse/JCR-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916455#comment-13916455 ]

Jukka Zitting commented on JCR-3738:

The key threads here seem to be (with non-essential stack frames excluded):

"pool-7-thread-2-Granite Workflow External Process Job Queue(com/adobe/granite/workflow/external/job/etc/workflow/models/dam/update_asset/jcr_content/model)"
daemon prio=10 tid=0x00007f12a46bf800 nid=0x2712 in Object.wait() [0x00007f125b4bf000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000000b4151370> (a org.apache.jackrabbit.core.state.DefaultISMLocking)
	at org.apache.jackrabbit.core.state.SharedItemStateManager.acquireWriteLock(SharedItemStateManager.java:1898)
	at org.apache.jackrabbit.core.state.SharedItemStateManager$Update.begin(SharedItemStateManager.java:579)
	at org.apache.jackrabbit.core.state.SharedItemStateManager.beginUpdate(SharedItemStateManager.java:1507)
	at org.apache.jackrabbit.core.state.SharedItemStateManager.update(SharedItemStateManager.java:1537)
	at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:812)

"pool-7-thread-1" daemon prio=10 tid=0x00007f12a46ea800 nid=0x2711 runnable [0x00007f12631b1000]
   java.lang.Thread.State: RUNNABLE
	at com.ibm.db2.jcc.am.qo.execute(qo.java:2724)
	- locked <0x00000000ba4b7e30> (a com.ibm.db2.jcc.t4.b)
	at org.apache.commons.dbcp.DelegatingPreparedStatement.execute(DelegatingPreparedStatement.java:172)
	at org.apache.jackrabbit.core.util.db.ConnectionHelper.exec(ConnectionHelper.java:288)
	at org.apache.jackrabbit.core.journal.DatabaseJournal$DatabaseRevision.set(DatabaseJournal.java:834)
	- locked <0x00000000b41afb40> (a org.apache.jackrabbit.core.journal.DatabaseJournal$DatabaseRevision)
	at org.apache.jackrabbit.core.cluster.ClusterNode.setRevision(ClusterNode.java:872)
	at org.apache.jackrabbit.core.cluster.ClusterNode$WorkspaceUpdateChannel.updateCommitted(ClusterNode.java:703)
	at org.apache.jackrabbit.core.state.SharedItemStateManager$Update.end(SharedItemStateManager.java:845)
	at org.apache.jackrabbit.core.state.SharedItemStateManager.update(SharedItemStateManager.java:1537)
	at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:812)

It looks like the first thread managed to acquire the cluster lock in the database, as that's
done before the Java-level acquireWriteLock() call. But the second thread was already inside
that critical section (or somehow managed to enter it afterwards), and now it can't complete
its transaction because the database won't allow it.
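
As a rough illustration of the lock-order inversion this analysis suggests, here is a minimal
sketch. It is not Jackrabbit code: the two ReentrantLock objects are hypothetical stand-ins
for the database-level cluster lock and the Java-level write lock, and the class and method
names are made up for the example. It also assumes the hang really is this kind of inversion,
which has not been confirmed yet. Each thread takes one lock and then tries the other, using a
timed tryLock() so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {

    /** Runs two threads that take the same two locks in opposite order; returns how many got stuck. */
    static int blockedThreads() {
        // Hypothetical stand-ins, not the real Jackrabbit or DB2 locks.
        ReentrantLock clusterLock = new ReentrantLock(); // models the cluster lock held in the database
        ReentrantLock writeLock = new ReentrantLock();   // models the Java-level write lock

        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Thread 1: cluster lock first, then the write lock.
        Thread t1 = new Thread(() ->
                holdThenTry(clusterLock, writeLock, bothHoldFirst, bothTried, blocked));
        // Thread 2: already holds the write lock, then needs the cluster lock.
        Thread t2 = new Thread(() ->
                holdThenTry(writeLock, clusterLock, bothHoldFirst, bothTried, blocked));

        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for demo threads", e);
        }
        return blocked.get();
    }

    private static void holdThenTry(ReentrantLock first, ReentrantLock second,
                                    CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                    AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();

            // A plain lock() here would wait forever; the timeout makes the inversion observable.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }

            // Keep holding the first lock until the other thread has finished its attempt.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("Blocked threads: " + blockedThreads());
    }
}
```

With untimed lock() calls in place of tryLock(), both threads would wait on each other
indefinitely, which is the kind of hang the thread dump shows.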

I'll look into this in more detail next week.

> CLONE - Deadlock on LOCAL_REVISION table in clustering environment
> ------------------------------------------------------------------
>                 Key: JCR-3738
>                 URL: https://issues.apache.org/jira/browse/JCR-3738
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: clustering
>    Affects Versions: 2.6.2
>         Environment: CQ5.6.1 with jackrabbit-core 2.6.2 backed by IBM DB2 v10.5
>            Reporter: Ankush Malhotra
>            Assignee: Jukka Zitting
>            Priority: Critical
>         Attachments: db-deadlock-info.txt, stat-cache.log, threaddumps.zip
> Original, cloned description:
> > When inserting a lot of nodes concurrently (100/200 threads) the system hangs, generating
> > a deadlock on the LOCAL_REVISION table.
> > There is a thread that starts a transaction but the transaction remains open, while
> > another thread tries to acquire the lock on the table.
> > This actually happens even if only one server is up but configured as a cluster.
> > I found that in AbstractJournal, we try to write the LOCAL_REVISION even if we don't
> > sync any record because they're generated by the same journal of the thread running.
> >
> > Removing this unnecessary (to me :-) ) write to the LOCAL_REVISION table removes
> > the deadlock.
> This might not be the exact same case as this issue. See the attached thread dumps
> etc. for full details.

