Return-Path: Delivered-To: apmail-db-derby-commits-archive@www.apache.org Received: (qmail 34339 invoked from network); 23 Mar 2011 12:21:32 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 23 Mar 2011 12:21:32 -0000 Received: (qmail 14852 invoked by uid 500); 23 Mar 2011 12:21:32 -0000 Delivered-To: apmail-db-derby-commits-archive@db.apache.org Received: (qmail 14808 invoked by uid 500); 23 Mar 2011 12:21:32 -0000 Mailing-List: contact derby-commits-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: Reply-To: "Derby Development" List-Id: Delivered-To: mailing list derby-commits@db.apache.org Received: (qmail 14801 invoked by uid 99); 23 Mar 2011 12:21:32 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 Mar 2011 12:21:32 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO eris.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 Mar 2011 12:21:28 +0000 Received: by eris.apache.org (Postfix, from userid 65534) id D84BD23888CE; Wed, 23 Mar 2011 12:21:05 +0000 (UTC) Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: svn commit: r1084561 - /db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java Date: Wed, 23 Mar 2011 12:21:05 -0000 To: derby-commits@db.apache.org From: kahatlen@apache.org X-Mailer: svnmailer-1.0.8 Message-Id: <20110323122105.D84BD23888CE@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: kahatlen Date: Wed Mar 23 12:21:05 2011 New Revision: 1084561 URL: http://svn.apache.org/viewvc?rev=1084561&view=rev Log: DERBY-3980: Conflicting select then update with REPEATABLE_READ gives lock timeout instead of deadlock DERBY-5073: Derby deadlocks without recourse on simultaneous correlated subqueries Added more comments describing the deadlock detection algorithm. Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java Modified: db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java URL: http://svn.apache.org/viewvc/db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java?rev=1084561&r1=1084560&r2=1084561&view=diff ============================================================================== --- db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java (original) +++ db/derby/code/trunk/java/engine/org/apache/derby/impl/services/locks/Deadlock.java Wed Mar 23 12:21:05 2011 @@ -38,33 +38,88 @@ import java.util.Stack; import java.util.List; /** - Code to support deadlock detection. -*/ + *

+ * Code to support deadlock detection. + *

+ * + *

+ * This class implements deadlock detection by searching for cycles in the + * wait graph. If a cycle is found, it means that (at least) two transactions + * are blocked by each other, and one of them must be aborted to allow the + * other one to continue. + *

+ * + *

+ * The wait graph is obtained by asking the {@code LockSet} instance to + * provide a map representing all wait relations, see {@link #getWaiters}. + * The map consists of two distinct sets of (key, value) pairs: + *

+ * + *
    + *
  1. (space, lock) pairs, where {@code space} is the compatibility space + * of a waiting transaction and {@code lock} is the {@code ActiveLock} + * instance on which the transaction is waiting
  2. + *
  3. (lock, prevLock) pairs, where {@code lock} is an {@code ActiveLock} and + * {@code prevLock} is the {@code ActiveLock} or {@code LockControl} for the + * first waiter in the queue behind {@code lock}
  4. + *
+ * + *

+ * The search is performed as a depth-first search starting from the lock + * request of a waiter that has been awoken for deadlock detection (either + * because {@code derby.locks.deadlockTimeout} has expired or because some + * other waiter had picked it as a victim in order to break a deadlock). + * From this lock request, the wait graph is traversed by checking which + * transactions have already been granted a lock on the object, and who they + * are waiting for. + *

+ * + *

+ * The state of the search is maintained by pushing compatibility spaces + * (representing waiting transactions) and granted locks onto a stack. When a + * dead end is found (that is, a transaction that holds locks without waiting + * for any other transaction), the stack is popped and the search continues + * down a different path. This continues until a cycle is found or the stack is + * empty. Detection of cycles happens when pushing a new compatibility space + * onto the stack. If the same space already exists on the stack, it means the + * graph has a cycle and we have a deadlock. + *

+ * + *

+ * When a deadlock is found, one of the waiters in the deadlock cycle is awoken + * and it will terminate itself, unless it finds that the deadlock has been + * broken in the meantime, for example because one of the involved waiters + * has timed out. + *

+ */ class Deadlock { private Deadlock() {} /** + *

* Look for a deadlock. - *
+ *

+ * + *

* Walk through the graph of all locks and search for cycles among * the waiting lock requests which would indicate a deadlock. A simple * deadlock cycle is where the granted locks of waiting compatibility * space A is blocking compatibility space B and space B holds locks causing * space A to wait. - *

- * Would be nice to get a better high level description of deadlock - * search. + *

+ * *

* MT - if the LockTable is a LockSet object, the * callers must be synchronized on the LockSet object in order - * to satisfy the syncronization requirements of + * to satisfy the synchronization requirements of * LockSet.addWaiters(). If it is a * ConcurrentLockSet object, the callers must not hold any of * the ReentrantLocks guarding the entries in the lock table, * and the callers must make sure that only a single thread calls * look() at a time. + *

* * * @param factory The locking system factory @@ -107,16 +162,22 @@ class Deadlock { outer: for (;;) { if (chain.isEmpty()) { - // all done + // All paths from the initial waiting lock request have been + // examined without finding a deadlock. We're done. break outer; } List grants = (List) chain.peek(); if (grants.isEmpty()) { + // All granted locks in this lock control have been examined. // pop this list of granted locks and back to the previous one rollback(chain); continue outer; } + + // Pick one of the granted lock for examination. rollback() + // expects us to have examined the last one in the list, so + // always pick that one. int endStack = grants.size() - 1; Object space = ((Lock) grants.get(endStack)).getCompatabilitySpace(); @@ -135,22 +196,45 @@ outer: for (;;) { inner: for (;;) { int index = chain.indexOf(space); if (index != -1) { - - // We could be seeing a situation here like + // Oops... The space has been examined once before, so + // we have what appears to be a cycle in the wait graph. + // In most cases this means we have a deadlock. + // + // However, in some cases, the cycle in the graph may be + // an illusion. For example, we could have a situation + // here like this: + // // Granted T1{S}, T2{S} // Waiting T1{X} - deadlock checking on this // - // In this case it's not a deadlock, although it - // depends on the locking policy of the Lockable. E.g. - // Granted T1(latch) - // Waiting T1(latch) - // is a deadlock. - // + // In this case it's not necessarily a deadlock. If the + // Lockable returns true from its lockerAlwaysCompatible() + // method, which means that lock requests within the same + // compatibility space never conflict with each other, + // T1 is only waiting for T2 to release its shared lock. + // T2 isn't waiting for anyone, so there is no deadlock. + // + // This is only true if T1 is the first one waiting for + // a lock on the object. If there are other waiters in + // between, we have a deadlock regardless of what + // lockerAlwaysCompatible() returns. Take for example this + // similar scenario, where T3 is also waiting: + // + // Granted T1{S}, T2{S} + // Waiting T3{X} + // Waiting T1{X} - deadlock checking on this + // + // Here, T1 is stuck behind T3, and T3 is waiting for T1, + // so we have a deadlock. if ((index == (chain.size() - 1)) || ((index == (chain.size() - 2)) && (index == (chain.indexOf(grants) - 1)))) { + // The two identical compatibility spaces were right + // next to each other on the stack. This means we have + // the first scenario described above, with the first + // waiter already having a lock on the object. It is a // potential self deadlock, but probably not! ActiveLock lock = (ActiveLock) waiters.get(space); @@ -163,14 +247,20 @@ inner: for (;;) { } } + // So it wasn't an illusion after all. Pick a victim. return Deadlock.handle(factory, chain, index, waiters, deadlockWake); } + + // Otherwise... The space hasn't been examined yet, so put it + // on the stack and start examining it. chain.push(space); skip_space: while (true) { + // Who is this space waiting for? Lock waitingLock = (Lock) waiters.get(space); if (waitingLock == null) { + // The space isn't waiting for anyone, so we're at the // end of the road, no deadlock in this path // pop items until the previous Stack rollback(chain); @@ -196,25 +286,44 @@ inner: for (;;) { continue outer; } + // Push all the granted locks on this object onto the + // stack, and go ahead examining them one by one. chain.push(waitOnControl.getGrants()); - continue outer; } else { // simply waiting on another waiter ActiveLock waitOnLock = (ActiveLock) waitOn; + // Set up the next space for examination. space = waitOnLock.getCompatabilitySpace(); + // Now, there is a possibility that we're not actually + // waiting behind the other other waiter. Take for + // example this scenario: + // + // Granted T1{X} + // Waiting T2{S} + // Waiting T3{S} - deadlock checking on this + // + // Here, T3 isn't blocked by T2. As soon as T1 releases + // its X lock on the object, both T2 and T3 will be + // granted an S lock. And if T1 also turns out to be + // blocked by T3 and we have a deadlock, aborting T2 + // won't resolve the deadlock, so it's not actually + // part of the deadlock. If we have this scenario, we + // just skip past T2's space and consider T3 to be + // waiting on T1 directly. + if (waitingLock.getLockable().requestCompatible( waitingLock.getQualifier(), waitOnLock.getQualifier())) { - // We're behind another waiter in the queue, but we - // request compatible locks, so we'll get the lock - // too once it gets it. Since we're not actually - // blocked by the waiter, skip it and see what's - // blocking it instead. + // We're behind another waiter with a compatible + // lock request. Skip it since we're not really + // blocked by it. continue skip_space; } else { + // We are really blocked by the other waiter. Go + // ahead and investigate its compatibility space. continue inner; } } @@ -225,6 +334,14 @@ inner: for (;;) { return null; } + /** + * Backtrack in the depth-first search through the wait graph. Expect + * the top of the stack to hold the compatibility space we've just + * investigated. Pop the stack until the most recently examined granted + * lock has been removed. + * + * @param chain the stack representing the state of the search + */ private static void rollback(Stack chain) { do { chain.pop(); @@ -237,12 +354,32 @@ inner: for (;;) { grants.remove(grants.size() - 1); } + /** + * Get all the waiters in a {@code LockTable}. The waiters are returned + * as pairs (space, lock) mapping waiting compatibility spaces to the + * lock request in which they are blocked, and (lock, prevLock) linking + * a lock request with the lock request that's behind it in the queue of + * waiters. + * + * @param set the lock table + * @return all waiters in the lock table + * @see LockControl#addWaiters(java.util.Map) + */ private static Hashtable getWaiters(LockTable set) { Hashtable waiters = new Hashtable(); set.addWaiters(waiters); return waiters; } + /** + * Handle a deadlock when it has been detected. Find out if the waiter + * that started looking for the deadlock is involved in it. If it isn't, + * pick a victim among the waiters that are involved. + * + * @return {@code null} if the waiter that started looking for the deadlock + * isn't involved in the deadlock (in which case another victim will have + * been picked and awoken), or an array describing the deadlock otherwise + */ private static Object[] handle(AbstractPool factory, Stack chain, int start, Dictionary waiters, byte deadlockWake) { @@ -291,6 +428,14 @@ inner: for (;;) { } + /** + * Build an exception that describes a deadlock. + * + * @param factory the lock factory requesting the exception + * @param data an array with information about who's involved in + * a deadlock (as returned by {@link #handle}) + * @return a deadlock exception + */ static StandardException buildException(AbstractPool factory, Object[] data) {