db-derby-dev mailing list archives

From "Bergquist, Brett" <BBergqu...@canoga.com>
Subject RE: Problem with a deadlock with Derby 10.8.1.2 and Glassfish V2.1.1
Date Thu, 22 Dec 2011 16:12:47 GMT
I opened DERBY-5552

https://issues.apache.org/jira/browse/DERBY-5552

I attached client-side traces using traceLevel=2145 (TRACE_XA_CALLS|TRACE_PROTOCOL_FLOWS|TRACE_CONNECTS|TRACE_CONNECTION_CALLS).
I don't know if more is needed.
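
For anyone wanting to check how that traceLevel value is composed, here is a minimal sketch. The constant values are an assumption: they are meant to mirror the trace constants on the Derby client data source, but they are declared locally here so the example stands alone, and they should be verified against the Derby version in use. The class name TraceLevel is hypothetical.

```java
public class TraceLevel {
    // ASSUMPTION: these values mirror the Derby client trace constants.
    // They are redeclared here only so the sketch is self-contained.
    static final int TRACE_CONNECTION_CALLS = 0x1;
    static final int TRACE_CONNECTS = 0x20;
    static final int TRACE_PROTOCOL_FLOWS = 0x40;
    static final int TRACE_XA_CALLS = 0x800;

    // Combine the four flags with bitwise OR, as in the traceLevel above.
    static int combined() {
        return TRACE_XA_CALLS | TRACE_PROTOCOL_FLOWS
                | TRACE_CONNECTS | TRACE_CONNECTION_CALLS;
    }

    public static void main(String[] args) {
        System.out.println("traceLevel=" + combined());
    }
}
```

Under those assumed values the four flags OR together to 2145, which matches the traceLevel used for the attached traces.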

I have server-side traces, but they are 93 MB uncompressed and 13 MB compressed.  Is there something
I should look for in there to narrow down which trace files to include and upload?

I have attached jstack traces of the application server and the Derby Network Server at shutdown
showing the hung threads.

I have attached the output of "select * from syscs_diag.transaction_table" when there are
no clients and no other database action showing transactions that are still present.

I am trying to narrow this down to a better test case but have not been able to at this point.
It is, however, repeatable every time with my J2EE application and the test setup that I
have.  Any suggestions for further areas to look at with a debugger, or for additional tracing
to output, will be greatly appreciated.

From: Katherine Marsden [mailto:kmarsdenderby@sbcglobal.net]
Sent: Wednesday, December 21, 2011 7:25 PM
To: derby-dev@db.apache.org
Subject: Re: Problem with a deadlock with Derby 10.8.1.2 and Glassfish V2.1.1

On 12/21/2011 3:14 PM, Bergquist, Brett wrote:
Will get to this tomorrow but I do see one comment in the code that I don't understand:

In DRDAConnThread.java, I see:

    if (severity > CodePoint.SVRCOD_ERROR)
    {
        // For a session ending error > CodePoint.SRVCOD_ERROR you cannot
        // send a SQLERRRM. A CMDCHKRM is required.  In XA if there is a
        // lock timeout it ends the whole session. I am not sure this
        // is the correct behaviour but if it occurs we have to send
        // a CMDCHKRM instead of SQLERRM
        writeCMDCHKRM(severity);
    }

So what does the comment "In XA if there is a lock timeout it ends the whole session" refer
to?  Why would a lock timeout be any different from any other standard database error?  It
is as if this is hinting at what is happening.
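
To make the branch in question concrete, here is a minimal standalone sketch of the decision the quoted snippet makes. It is not the Derby implementation. The severity values are assumed from the DRDA severity code (SVRCOD) definitions, and the class ReplyChoice, the enum Reply and the method choose are hypothetical names. The quoted snippet does not show what happens at or below SVRCOD_ERROR, so the sketch only labels that case as an ordinary error reply.

```java
public class ReplyChoice {
    // ASSUMPTION: DRDA severity codes (SVRCOD), redeclared locally for the sketch.
    static final int SVRCOD_WARNING = 4;
    static final int SVRCOD_ERROR = 8;
    static final int SVRCOD_SEVERE = 16;
    static final int SVRCOD_SESDMG = 128;

    // Hypothetical labels for the two outcomes of the branch.
    enum Reply { ORDINARY_ERROR_REPLY, CMDCHKRM }

    // Anything strictly above SVRCOD_ERROR is treated as session ending,
    // so a CMDCHKRM is chosen, as in the quoted condition.
    static Reply choose(int severity) {
        return severity > SVRCOD_ERROR ? Reply.CMDCHKRM : Reply.ORDINARY_ERROR_REPLY;
    }

    public static void main(String[] args) {
        int[] severities = { SVRCOD_WARNING, SVRCOD_ERROR, SVRCOD_SEVERE, SVRCOD_SESDMG };
        for (int severity : severities) {
            System.out.println(severity + " -> " + choose(severity));
        }
    }
}
```

The point of the sketch is that the comparison is strict: a severity equal to SVRCOD_ERROR stays on the ordinary path, so the lock timeout would have to be reported with a higher severity to end up in the CMDCHKRM branch.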

This is a real XA transaction.

What I see is that after the timeout is hit (I see it hit in Timeout.java) the error is propagated
to the app server.  The app server then attempts to get the error text (I don't have the code
handy), which sends a request back to Derby.  This then fails with a No Connection
error being returned from Derby.  It is as if, once this error is hit, the connection between
the app server and Derby no longer exists.

I agree that it would not be the correct behavior if a lock timeout killed the session. As this
is a server-side comment, it would imply that this is a problem with embedded as well,
but it is hard to believe it would not have been exposed before now. Thanks for working on a reproduction
for this.  I don't see the comment in the original code import, but the annotation is not clear
as it mentions the back out of another fix, so I am not sure who first noticed this behavior.
