Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm
Precedence: bulk
Reply-To: <derby-dev@db.apache.org>
Date: Sun, 24 Jun 2012 03:52:42 +0000 (UTC)
From: "Brett Wooldridge (JIRA)" <jira@apache.org>
To: derby-dev@db.apache.org
Message-ID: <1969405102.48458.1340509962684.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (DERBY-4279) Statement cache deadlock
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/DERBY-4279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400091#comment-13400091 ] 

Brett Wooldridge commented on DERBY-4279:
-----------------------------------------

I've been looking at 4279 again today...

..and thinking of possible solutions, when a question arose.  First and foremost, the deadlock is caused by the fact that preparing a statement requires a table lock (shared) in Derby.  Why is this, technically?  If the requirement that a table lock is needed to prepare a statement can be removed, this deadlock can be
fixed.

Alternatively, if the requirement that a table lock is needed cannot be removed, a possible resolution for 4279 is to remove the concept that prepared statements are shared across connections and instead make the statement cache per-connection.  While this increases the memory overhead slightly -- I have to believe that the artifacts of a prepared statement are in fact extremely small -- it removes a lot of shared-cache synchronization code and probably increases concurrency in general.  If you've been in that code, the synchronization is pretty hairy (as you can see from the comments in 4279 as well) and there are synchronization blocks in there but commented out for reasons no existing developers can explain.

In fact, now that I think of it, it would be great if the requirement for a table lock could be removed when preparing a statement AND the cache made per-connection (to simplify the code to a point that humans can understand).

I understand there is probably an edge case whereby performance would be degraded compared to existing code -- that being a scenario in which connections are created and discarded frequently.  But that is a scenario easily solved by connection re-use, either explicit or by use of a connection pool.

Thoughts?  I'm willing to put in some work if either of these approaches is acceptable.  I already put in considerable time on 4279 over a year ago, but eventually abandoned it (as you can see in the comments) due to synchronization issues in the shared cache.

                
> Statement cache deadlock
> ------------------------
>
>                 Key: DERBY-4279
>                 URL: https://issues.apache.org/jira/browse/DERBY-4279
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.0.2.1, 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.2.0, 10.5.1.1, 10.8.1.2
>         Environment: Windows Vista, OS X 10.5+
>            Reporter: Jeff Stuckman
>              Labels: derby_triage10_5_2
>         Attachments: Derby4279.java, client_stacktrace_activation_closed.txt, patch4279.txt, stacktrace.txt
>
>
> Due to a design flaw in the statement cache, a deadlock can occur if a prepared statement becomes out-of-date.
> I will illustrate this with the following example:
> The application is using the embedded Derby driver. The application has two threads, and each thread uses its own connection.
> There is a table named MYTABLE with column MYCOLUMN.
> 1. A thread prepares and executes the query SELECT MYCOLUMN FROM MYTABLE. The prepared statement is stored in the statement cache (see org.apache.derby.impl.sql.GenericStatement for this logic)
> 2. After some time, the prepared statement becomes invalid or out-of-date for some reason (see org.apache.derby.impl.sql.GenericPreparedStatement)
> 3. Thread 1 begins a transaction and executes LOCK TABLE MYTABLE IN EXCLUSIVE MODE
> 4. Thread 2 begins a transaction and executes SELECT MYCOLUMN FROM MYTABLE. The statement is in the statement cache but it is out-of-date. The thread begins to recompile the statement. To compile the statement, the thread needs a shared lock on MYTABLE. Thread 1 already has an exclusive lock on MYTABLE. Thread 2 waits.
> 5. Thread 1 executes SELECT MYCOLUMN FROM MYTABLE. The statement is in the statement cache but it is being compiled. Thread 1 waits on the statement's monitor.
> 6. We have a deadlock. Derby eventually detects a lock timeout, but the error message is not descriptive. The stacks at the time of the deadlock are:
> This deadlock is unique because it can still occur in a properly designed database. You are only safe if all of your transactions are very simple and cannot be interleaved in a sequence that causes the deadlock, or if your particular statements do not require a table lock to compile. (For the sake of simplicity, I used LOCK TABLE in my example, but any UPDATE statement would fit.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira