jackrabbit-dev mailing list archives

From "Bart van der Schans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCR-3440) Deadlock on LOCAL_REVISION table in clustering environment
Date Mon, 15 Oct 2012 13:44:04 GMT

    [ https://issues.apache.org/jira/browse/JCR-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476145#comment-13476145 ]

Bart van der Schans commented on JCR-3440:

Hi Luca,

From what I see in the code, the read operations also start a batched operation, which I
don't think is needed. This was introduced (by accident) as a fix for reading large journals
in JCR-2832 by Jukka. This led to the problem that batch mode was started twice: once in
the doLock() method, called in internalLockandSync (for write operations), and once in
doSync(long startRevision) in the DatabaseJournal. doSync(long startRevision) is also called
for read-only operations by the internalSync() method. Hence, since then, all read operations
have started transactions as well.

The double transaction start was fixed in JCR-2882, but the actual fix should have been to
not start batch mode in doSync(long startRevision) at all.
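
To illustrate the point, here is a minimal sketch. It is not the actual Jackrabbit code:
BatchModeSketch, FakeConnection, SketchJournal and the syncStartsBatch flag are hypothetical
names, and the "start a batch unless one is already open" check is only my assumption about
how the code behaves after JCR-2882. The method names mirror the ones mentioned above. The
sketch just counts how often batch mode gets started on the read-only path.

```java
// Simplified sketch, NOT the real Jackrabbit classes: every type here is a
// hypothetical stand-in for the code discussed in this comment.
class BatchModeSketch {

    /** Fake connection that only counts how often batch mode is started. */
    static class FakeConnection {
        boolean inBatch = false;
        int batchesStarted = 0;

        void startBatch() {
            if (inBatch) {
                throw new IllegalStateException("batch mode already started");
            }
            inBatch = true;
            batchesStarted++;
        }

        void endBatch() {
            inBatch = false;
        }
    }

    static class SketchJournal {
        final FakeConnection conn = new FakeConnection();
        // true  = assumed current behaviour (doSync starts a batch)
        // false = behaviour proposed above (doSync never starts a batch)
        final boolean syncStartsBatch;

        SketchJournal(boolean syncStartsBatch) {
            this.syncStartsBatch = syncStartsBatch;
        }

        /** Write path: lock (starts batch mode), sync, unlock. */
        void internalLockAndSync() {
            doLock();
            try {
                doSync(0L);
            } finally {
                doUnlock();
            }
        }

        /** Read-only path: sync without taking the lock. */
        void internalSync() {
            doSync(0L);
        }

        void doLock() {
            conn.startBatch();
        }

        void doUnlock() {
            conn.endBatch();
        }

        void doSync(long startRevision) {
            // Assumption: a batch is started here unless one is already open.
            boolean startedHere = syncStartsBatch && !conn.inBatch;
            if (startedHere) {
                conn.startBatch();
            }
            try {
                // Reading journal records from startRevision would happen here.
            } finally {
                if (startedHere) {
                    conn.endBatch();
                }
            }
        }
    }

    public static void main(String[] args) {
        SketchJournal current = new SketchJournal(true);
        current.internalSync();
        SketchJournal patched = new SketchJournal(false);
        patched.internalSync();
        System.out.println("read-only sync, batches started: current="
                + current.conn.batchesStarted
                + " patched=" + patched.conn.batchesStarted);
    }
}
```

In this sketch the write path starts exactly one batch in both variants, while the read-only
path starts one only when doSync is allowed to start a batch itself.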

Could you try the attached patch and check if you can still reproduce the deadlock?

> Deadlock on LOCAL_REVISION table in clustering environment
> ----------------------------------------------------------
>                 Key: JCR-3440
>                 URL: https://issues.apache.org/jira/browse/JCR-3440
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: clustering
>    Affects Versions: 2.4.3
>         Environment: Env.1: 4x Linux server CentOS 5 MSSQL 2008 database (production)
> Env.2: 2x Linux Ubuntu 10.04 server tested with PostgreSQL 9.1, H2, MSSQL 2008 and MySQL
> 5.5 (lab system)
>            Reporter: Luca Tagliani
>            Priority: Critical
>         Attachments: fixAlwaysBatchMode.patch, JCR-3440.patch, threadDump-JCR-3440.txt
> When inserting a lot of nodes concurrently (100/200 threads) the system hangs, generating
> a deadlock on the LOCAL_REVISION table.
> There is a thread that starts a transaction but the transaction remains open, while another
> thread tries to acquire the lock on the table.
> This actually happens even if only one server is up but configured in cluster mode.
> I found that in AbstractJournal, we try to write the LOCAL_REVISION even if we don't
> sync any record because they're generated by the same journal of the thread running.
> Removing this unnecessary (to me :-) ) write to the LOCAL_REVISION table, remove the

