jackrabbit-dev mailing list archives

From "Davide Maestroni (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-3597) Dead lock when using LockManager session scoped locks on nodes
Date Fri, 17 May 2013 13:43:18 GMT

     [ https://issues.apache.org/jira/browse/JCR-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Davide Maestroni updated JCR-3597:

    Attachment: thread_dump.txt

Thread dump log file
> Dead lock when using LockManager session scoped locks on nodes
> --------------------------------------------------------------
>                 Key: JCR-3597
>                 URL: https://issues.apache.org/jira/browse/JCR-3597
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: jackrabbit-core
>    Affects Versions: 2.6.1
>         Environment: Sling + Jackrabbit
>            Reporter: Davide Maestroni
>            Priority: Blocker
>         Attachments: thread_dump.txt
> I have a repository created with Jackrabbit and accessed through HTTP APIs via Sling.
I use the LockManager class to create locks on nodes before adding new children and updating
properties. While load-testing the system, I ran into an issue when using different
threads to add child nodes to different parent nodes: a deadlock occurs quite consistently
when one thread is logging out of a JCR session while another is saving its own session
(I'll attach the thread dump to the JIRA issue).
> So I started inspecting the source code to understand what the problem was, and
I think I located it. It all happens when calling the SharedItemStateManager.beginUpdate()
method: one thread, inside Update.begin(), acquires the synchronized lock on the LocalItemStateManager
instance and waits for the ISMLocking lock to be released (thread t@295 in the logs), while
the other, which owns the ISMLocking lock, inside Update.end() triggers an event which
in turn tries to synchronize on the LocalItemStateManager instance (thread t@293 in the logs).
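The inverted lock ordering described above can be sketched as follows. All class and lock names here are illustrative stand-ins, not Jackrabbit's actual code; a tryLock with a timeout replaces the plain acquisition so the sketch terminates instead of hanging the way the real deadlock does:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DeadlockSketch {
    // Illustrative stand-ins: "ismLock" plays the role of the ISMLocking
    // write lock, "stateManager" plays the LocalItemStateManager monitor.
    static final ReentrantReadWriteLock ismLock = new ReentrantReadWriteLock();
    static final Object stateManager = new Object();

    static boolean demonstrateInvertedOrder() throws InterruptedException {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        boolean[] wouldDeadlock = new boolean[1];

        // Thread A (like t@295): takes the monitor, then needs ISMLocking.
        Thread a = new Thread(() -> {
            synchronized (stateManager) {                    // Update.begin() path
                bothHoldFirstLock.countDown();
                try {
                    bothHoldFirstLock.await();
                    // tryLock instead of lock(): a plain lock() here would
                    // block forever, which is exactly the reported deadlock.
                    if (ismLock.writeLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                        ismLock.writeLock().unlock();
                    } else {
                        wouldDeadlock[0] = true;
                    }
                } catch (InterruptedException ignored) { }
            }
        });

        // Thread B (like t@293): holds ISMLocking, then needs the monitor.
        Thread b = new Thread(() -> {
            ismLock.writeLock().lock();                      // update in progress
            try {
                bothHoldFirstLock.countDown();
                bothHoldFirstLock.await();
                Thread.sleep(700);                           // outlive A's tryLock window
                synchronized (stateManager) { }              // event dispatch path
            } catch (InterruptedException ignored) {
            } finally {
                ismLock.writeLock().unlock();
            }
        });

        a.start(); b.start();
        a.join(); b.join();
        return wouldDeadlock[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("inverted lock order detected: " + demonstrateInvertedOrder());
    }
}
```

Each thread holds its first lock while waiting on the other's, which is the classic two-lock cycle the thread dump shows.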
> I noticed that the LocalItemStateManager implementation uses synchronized blocks in only
two places: one in the edit() method, which synchronizes the whole method, and one in the
getItemState() method, which synchronizes access to the cache object. Both are quite odd:
edit() is the only synchronized method, yet editMode and changeLog are modified in several
other places, so it is hard to see what the synchronization achieves; and making the cache
block atomic only requires a dedicated object to synchronize on, rather than locking
the whole instance.
> Another suspicious piece of code is in the LockManagerImpl class, where two blocks
of code are synchronized on the LocalItemStateManager instance. Again, since no method
of that class, with the exception of edit(), is synchronized, nothing prevents another
thread from calling into the LocalItemStateManager instance at the same time.
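The point above, that an external synchronized block on some instance guards nothing unless that instance's own methods also synchronize on the same monitor, can be sketched with hypothetical classes (not Jackrabbit code):

```java
public class MonitorGapDemo {
    // Hypothetical stand-in for LocalItemStateManager: only edit() is
    // synchronized; other methods touch shared state without locking.
    static class Manager {
        volatile boolean inEdit = false;

        synchronized void edit() throws InterruptedException {
            inEdit = true;
            Thread.sleep(200);   // hold the instance monitor for a while
            inEdit = false;
        }

        // Unsynchronized: runs even while another thread is inside edit()
        // or inside an external synchronized (manager) { ... } block.
        boolean touchState() { return inEdit; }
    }

    static boolean demonstrateGap() throws InterruptedException {
        Manager m = new Manager();
        Thread editor = new Thread(() -> {
            try { m.edit(); } catch (InterruptedException ignored) { }
        });
        editor.start();
        while (!m.inEdit) Thread.onSpinWait();  // wait until edit() holds the monitor
        // This call succeeds immediately despite the held monitor, showing
        // the unsynchronized method is not excluded by it.
        boolean overlapped = m.touchState();
        editor.join();
        return overlapped;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unsynchronized call overlapped edit(): " + demonstrateGap());
    }
}
```

The external synchronized blocks in LockManagerImpl would only exclude edit() and the two synchronized blocks, not any of the other methods.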
> I finally identified the root cause of my issue in the "synchronized (this)" (line 170)
inside the LocalItemStateManager.getItemState() method. Since the comment states that
the lock is there only to make the calls to the cache atomic, I tried
adding a simple object field named "monitor" to the class and using "synchronized (monitor)"
instead. The patch appears to resolve the deadlock, but I am still puzzled by the pieces of
code mentioned above.
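A minimal sketch of that patch idea follows; class and method shapes are illustrative rather than Jackrabbit's actual getItemState() code. The cache gets its own private lock, so cache access no longer contends with code that synchronizes on the whole instance:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheMonitorSketch {
    // Dedicated lock object, as in the proposed patch: replaces
    // "synchronized (this)" around cache access only.
    private final Object monitor = new Object();
    private final Map<String, Object> cache = new HashMap<>();

    Object getItemState(String id) {
        synchronized (monitor) {      // was: synchronized (this)
            return cache.get(id);
        }
    }

    void cacheItemState(String id, Object state) {
        synchronized (monitor) {
            cache.put(id, state);
        }
    }
}
```

Because the cache now has its own lock, a thread blocked waiting on the instance monitor no longer transitively blocks event-dispatch code that only needs to read the cache, which breaks the cycle seen in the thread dump.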

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
