jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Davide Maestroni (JIRA)" <j...@apache.org>
Subject [jira] [Created] (JCR-3597) Dead lock when using LockManager session scoped locks on nodes
Date Fri, 17 May 2013 13:43:16 GMT
Davide Maestroni created JCR-3597:
-------------------------------------

             Summary: Dead lock when using LockManager session scoped locks on nodes
                 Key: JCR-3597
                 URL: https://issues.apache.org/jira/browse/JCR-3597
             Project: Jackrabbit Content Repository
          Issue Type: Bug
          Components: jackrabbit-core
    Affects Versions: 2.6.1
         Environment: Sling + Jackrabbit
            Reporter: Davide Maestroni
            Priority: Blocker
         Attachments: thread_dump.txt

I have a repository created using Jackrabbit and accessed through HTTP APIs via Sling. I use
the LockManager class to create locks on nodes before adding new children and updating properties.
When testing the speed of the system I ran across an issue when using different threads to
add child nodes to different parent nodes: a dead lock happens quite consistently when one
thread is logging out from a JCR session and another one is saving its own session (I'll attach
the thread dump to the JIRA issue).
So I started inspecting the source code, trying to understand what the problem was, and I
think I located the problem. Basically it all happens when calling the SharedItemStateManager.beginUpdate()
method: one thread, inside the Update.begin(), acquires the synchronized lock on the LocalItemStateManager
instance and wait for the ISMLocking lock to be released (Thread t@295 in the logs); while
the other, which owns the ISMLocking lock, inside the Update.end() triggers an event which
in turn try to synchronize with the LocalItemStateManager instance (Thread t@293 in the logs).
I noticed that the LocalItemStateManager implementation employs synchronized blocks only in
two instances: one in the edit() method to synchronize the whole function, and one in the
getItemState() method to synchronize the access to the cache object. Both are quite odd since:
the edit() is the only synchronized method and there are several other places in which editMode
and changeLog are modified, so it is quite difficult to understand the use of it; the cache
block to be atomic can simply employ a dedicated object to synchronize on instead of locking
the whole instance.
Another suspicious piece of code is inside the LockManagerImpl class, where there are two
blocks of code synchronized on the LocalItemStateManager instance. Again, since no method
of the class, with the exception of the edit(), is synchronized, nothing prevents another
thread to call the LocalItemStateManager instance at the same time.
I finally identified the root cause of my issue in the "synchronized (this)" (line 170) inside
the LocalItemStateManager.getItemState() method and, since the comment states that the lock
is there just to ensure the calls to the cache to be atomically executed, I tried to add a
simple object field name "monitor" to the class and use "synchronized (monitor)" instead.
The patch looks to solve the dead lock, still I am confused about the pieces of code mentioned
above.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message