db-derby-dev mailing list archives

From "Brett Bergquist (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-5552) Derby threads hanging when using ClientXADataSource and a deadlock or lock timeout occurs
Date Fri, 23 Dec 2011 00:19:31 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175192#comment-13175192 ]

Brett Bergquist commented on DERBY-5552:

I guess I am confused as well, Kathey, as I had the debugger attached and do see it going through
the XA code in Derby on the client side.  The application server is set up with the ClientXADataSource,
and I do see it calling xa.commit and xa.end, for example.  The ClientXADataSource is required;
otherwise the error:

	Local transaction already has 1 non-XA Resource: cannot add more resources. 

occurs.  So although there is only one database (Derby), it is using XA.  The database is being
accessed through EJBs and EclipseLink, and also through a custom JCA interface driving
Message Driven Beans.

For the test case, I had to limit things to keep my sanity.  So I stopped as much access to
the database as I could while still triggering the problem.  Eventually I got down to one thread
of control being processed by EJBs, which do start new transactions.  Even with only this one access
going on, I hit the lockup that I posted.  That is when I found the issue that I mentioned.
So whether or not this is the real issue, I don't know, but when I tried to get as simple
a condition as possible, I ran into this.

Thinking about it now, I don't understand why this would not be hit in the normal case of a lock
timeout being thrown.  The only thing I can think of is that Activation.checkStatementValidity()
sees the statement as valid and does not try to recompile it.  Why it occurred in
my case, where I see the "isValid" member set to false, I don't know.  I will hook
up the debugger and try to determine the difference so that I can understand it better.

I do believe that the code should not swallow an exception such as a lock timeout,
regardless of whether the statement is still reported as valid.  This is definitely a condition
that will cause an infinite loop of processing.
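
To illustrate the concern, here is a minimal sketch.  It is not Derby's actual code: the
class and method names (RecompileRetrySketch, FakeStatement, LockTimeoutException,
executeSwallowing, executePropagating) are all hypothetical, and the attempt cap exists only
so that the sketch terminates.  It assumes a retry loop that keeps recompiling while the
statement is marked invalid, and compares swallowing the lock timeout with letting it propagate.

```java
public class RecompileRetrySketch {

    /** Hypothetical stand-in for a lock timeout raised while recompiling. */
    static class LockTimeoutException extends Exception {
        LockTimeoutException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a statement whose compiled plan has been invalidated. */
    static class FakeStatement {
        boolean valid = false;
        int recompileAttempts = 0;

        /** Simulates a recompile that always fails because a lock cannot be obtained. */
        void recompile() throws LockTimeoutException {
            recompileAttempts++;
            throw new LockTimeoutException("lock timeout during recompile");
        }
    }

    /**
     * Swallows the lock timeout and retries while the statement is invalid.
     * Without the cap this loop would never exit. Returns true if the cap was hit.
     */
    static boolean executeSwallowing(FakeStatement s, int maxAttempts) {
        while (!s.valid) {
            if (s.recompileAttempts >= maxAttempts) {
                return true; // gave up only because of the artificial cap
            }
            try {
                s.recompile();
                s.valid = true;
            } catch (LockTimeoutException e) {
                // Swallowed: the statement is still invalid, so the loop retries.
            }
        }
        return false;
    }

    /** Lets the lock timeout propagate, so the caller sees the failure at once. */
    static void executePropagating(FakeStatement s) throws LockTimeoutException {
        while (!s.valid) {
            s.recompile(); // a failure here leaves the loop via the exception
            s.valid = true;
        }
    }

    public static void main(String[] args) {
        FakeStatement a = new FakeStatement();
        boolean hitCap = executeSwallowing(a, 1000);
        System.out.println("swallowing: hitCap=" + hitCap
                + ", attempts=" + a.recompileAttempts);

        FakeStatement b = new FakeStatement();
        try {
            executePropagating(b);
        } catch (LockTimeoutException e) {
            System.out.println("propagating: attempts=" + b.recompileAttempts
                    + ", error=" + e.getMessage());
        }
    }
}
```

In the swallowing variant only the artificial cap ends the loop, whereas the propagating
variant reports the lock timeout to the caller after a single attempt.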

Again, I appreciate the help and your time.  If I gain an understanding of how the condition
is triggered, I will look to write a test case for it.  I am reading the Derby testing docs
on using JUnit, which I assume is the correct path for newer test cases.  Is that correct?

> Derby threads hanging when using ClientXADataSource and a deadlock or lock timeout occurs
> -----------------------------------------------------------------------------------------
>                 Key: DERBY-5552
>                 URL: https://issues.apache.org/jira/browse/DERBY-5552
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions:
>         Environment: Solaris 10, Glassfish V2.1.1,
>            Reporter: Brett Bergquist
>            Priority: Blocker
>         Attachments: appserverstack.txt, client.tar.Z, derby.log, derbystackatshutdown.txt, execute.patch, transactionsleft.txt
> The issue arises when multiple XA transactions are done in parallel and either a lock timeout or a deadlock is detected.  When this happens the connection is leaked in the Glassfish connection pool and the client thread hangs in "org.apache.derby.client.net.Reply.fill(Reply.java:172)".
> Shutting down the app server fails because the thread has a lock in "org.apache.derby.client.net.NetConnection40" and another task is calling "org.apache.derby.client.ClientPooledConnection.close(ClientPooledConnection.java:214)", which is waiting for the lock.
> Killing the app server using "kill" and then attempting to shut down the Derby network server causes the Network Server to hang.  One of the threads hangs waiting for a lock at "org.apache.derby.impl.drda.NetworkServerControlImpl.removeFromSessionTable(NetworkServerControlImpl.java:1525)" and the "main" thread has this locked at "org.apache.derby.impl.drda.NetworkServerControlImpl.executeWork(NetworkServerControlImpl.java:2242)" and is itself waiting for a lock that belongs to a thread stuck at "org.apache.derby.impl.services.locks.ActiveLock.waitForGrant(ActiveLock.java:118)", which is in the TIMED_WAITING state.
> At this point the only option is to kill the Network Server using "kill".
> There are transactions left even though all clients have been removed.  


