db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-5073) Derby deadlocks without recourse on simultaneous correlated subqueries
Date Thu, 10 Mar 2011 16:07:59 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005128#comment-13005128 ]

Knut Anders Hatlen commented on DERBY-5073:

The code is in Deadlock.handle():

			// See if the checker is in the deadlock and we
			// already picked as a victim
			if ((checker.equals(space)) && (deadlockWake == Constants.WAITING_LOCK_DEADLOCK))
				victim = checker;

It never kicks in, and instead it goes further down in the method and wakes another victim:

		ActiveLock victimLock = (ActiveLock) waiters.get(victim);


The new victim wakes up from its waiting state in ActiveLock.waitForGrant()/ConcurrentLockSet.lockObject(),
calls checkDeadlock() and ends up in Deadlock.handle() again.

I think the problem may be caused by the following piece of code in Deadlock.look():

				} else {
					// simply waiting on another waiter
					space = waitingLock.getCompatabilitySpace();

As far as I can see, this code doesn't make any sense. space will already have the same value
as waitingLock.getCompatabilitySpace(), so the operation is actually a no-op. (waitingLock
is obtained by calling waiters.get(space), and the waiters Map is built up from
(waitingLock.getCompatabilitySpace(), waitingLock) key/value pairs, see LockControl.addWaiters().)
Furthermore, this leads to "space" being considered twice in a row by the deadlock detection,
so that it thinks that the transaction owning that compatibility space is waiting for one of
its own locks. It therefore detects the deadlock prematurely, before it has seen all
transactions involved in it, and incorrectly concludes that the original victim wasn't involved.
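
To make the no-op argument concrete, here is a minimal sketch in Java. It is a toy model and not Derby's code: Waiter, buildWaiters() and the string-valued spaces are hypothetical stand-ins for ActiveLock, LockControl.addWaiters() and the compatibility space objects. The only thing it assumes is what is described above, namely that the map is keyed by each waiting lock's own compatibility space.

```java
import java.util.HashMap;
import java.util.Map;

public class NoOpAssignmentSketch {
    /** Hypothetical stand-in for a waiting lock; it only remembers its owning space. */
    static final class Waiter {
        private final String space;
        Waiter(String space) { this.space = space; }
        String getCompatabilitySpace() { return space; }
    }

    /** Builds a map keyed by each waiter's own space (assumed shape of the waiters map). */
    static Map<String, Waiter> buildWaiters(String... spaces) {
        Map<String, Waiter> waiters = new HashMap<>();
        for (String s : spaces) {
            Waiter w = new Waiter(s);
            waiters.put(w.getCompatabilitySpace(), w);
        }
        return waiters;
    }

    /** Returns true if re-reading the space from the looked-up waiter yields the same space. */
    static boolean reassignmentIsNoOp(Map<String, Waiter> waiters, String space) {
        Waiter waitingLock = waiters.get(space);
        String reassigned = waitingLock.getCompatabilitySpace();
        return reassigned.equals(space);
    }

    public static void main(String[] args) {
        Map<String, Waiter> waiters = buildWaiters("T1", "T2", "T3");
        for (String space : waiters.keySet()) {
            System.out.println(space + " unchanged: " + reassignmentIsNoOp(waiters, space));
        }
    }
}
```

With a map built this way, the lookup key and the value's space are the same by construction, so the assignment cannot change anything.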

By changing that last piece of code from a no-op to actually moving one step ahead in the
wait graph, the repro does fail with a deadlock error. That is, change the assignment to:

    space = ((ActiveLock) waitOn).getCompatabilitySpace();
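
The difference between standing still and moving one step ahead can be sketched with a toy wait graph in Java. This is a heavy simplification and not Derby's Deadlock.look(): the class WaitGraphWalkSketch, the walk() method and the string-valued spaces are hypothetical, and each space is assumed to wait on exactly one other space.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WaitGraphWalkSketch {
    /**
     * Walks the wait chain from start until a space is seen a second time and
     * returns the spaces visited before that repeat.
     * advance == false models the no-op step (the current space is re-read);
     * advance == true models moving on to the space being waited on.
     */
    static List<String> walk(Map<String, String> waitsOn, String start, boolean advance) {
        List<String> visited = new ArrayList<>();
        String space = start;
        while (space != null && !visited.contains(space)) {
            visited.add(space);
            space = advance ? waitsOn.get(space) : space;
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical three-way cycle: T1 waits on T2, T2 on T3, T3 on T1.
        Map<String, String> waitsOn = new HashMap<>();
        waitsOn.put("T1", "T2");
        waitsOn.put("T2", "T3");
        waitsOn.put("T3", "T1");

        System.out.println("no-op step:    " + walk(waitsOn, "T1", false));
        System.out.println("advancing step: " + walk(waitsOn, "T1", true));
    }
}
```

In this toy model the no-op step sees the starting space twice in a row and stops after a single space, while the advancing step visits all three spaces before the cycle closes.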

I tried running the regression tests with that change, and they all passed. I do find the
deadlock detection code a bit hard to follow, so I'm not totally convinced this is the right

> Derby deadlocks without recourse on simultaneous correlated subqueries
> ----------------------------------------------------------------------
>                 Key: DERBY-5073
>                 URL: https://issues.apache.org/jira/browse/DERBY-5073
>             Project: Derby
>          Issue Type: Bug
>          Components: Services
>    Affects Versions:,,,,,,,,
>            Reporter: Karl Wright
>         Attachments: Derby5073.java
> When the following two queries are run against tables that contain the necessary fields,
using multiple threads, Derby deadlocks and none of the queries ever returns.  Derby apparently
detects no deadlock condition, either.
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE t1.parentidhash
IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid) AND t0.jobid=?
> SELECT t0.* FROM jobqueue t0 WHERE EXISTS(SELECT 'x' FROM carrydown t1 WHERE t1.parentidhash
IN (?) AND t1.childidhash=t0.dochash AND t0.jobid=t1.jobid AND t1.newField=?) AND t0.jobid=?
> This code comes from Apache ManifoldCF, and has occurred when there are five or more
threads trying to execute these two queries at the same time.  Originally we found this on  It was hoped that would fix the problem, but it hasn't.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
