On 12/21/2011 12:04 PM, Bergquist, Brett wrote:
>
> Nothing in the Derby log other than it logging a deadlock with the
> statements and a lock timeout with its statements and it indicating
> that cleanup had started and completed.
>
> I will enable tracing with the (undocumented) system property.
> Thanks for that information!
>
> I will check for the XA transactions the next time I reproduce this.
>
> Maybe you could point me into the correct area to look. This seems to
> be triggered either through a lock timeout or a deadlock. The
> connection that this is occurring through is an XA connection. I see
> the logging of this in the server log but I am trying to find out
> where that would be logged from. It seems after this occurs and
> because of the way connection pool is being validated and recreated on
> error by Glassfish (configured to do so), it gets into this state.
> What I don't understand is why this type of error would cause the
> connection to appear to be invalid and I am trying to work through
> both the Glassfish source and the Derby source to find out. The
> connection correctly handles other errors, such as an attempt to
> insert a duplicate, and those do not cause the connection to
> appear to be invalid. So I am trying to understand why a lock
> timeout or deadlock detection might do so.
>
> This problem has only cropped up recently when they started performing
> multiple requests that I know have a deadlock path through them. I
> can fix that problem later but this is a system level problem that I
> need to resolve.
>
> I really do appreciate the help and guidance and am willing to try to
> work through this. I have to figure this out and either patch
> Glassfish or Derby in any case as my customer (think very very large
> wireless carrier) is getting pretty PO'ed.
>
The one thing I can think of specifically with a deadlock is that Derby
will automatically roll back the victim transaction, and that might
throw off this client logic regarding the state of the server. But I
would think that if there were just a simple problem with deadlocks it
would have shown up before now. That said, I don't see any specific
tests in our XA tests,
org.apache.derbyTesting.functionTests.tests.jdbcapi.XATest or
org.apache.derbyTesting.functionTests.tests.jdbcapi.XATransactionTest,
that exercise XAConnections with deadlocks.
Is this a local transaction on an XA connection or a real XA
transaction with two phase commit?
You might want to try testing an XAConnection with a simple
deadlock case locally to see if that pops a reproduction.
org.apache.derbyTesting.functionTests.tests.lang.DeadlockDetectionTest
and org.apache.derbyTesting.functionTests.tests.lang have some examples
of deadlocks.
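Something along these lines might serve as a starting point for such a
local test. It is only a rough sketch, not code from the Derby test
suite. It assumes a table T(ID INT PRIMARY KEY, V INT) that already
holds rows with ID 1 and 2, and an XADataSource that you construct and
configure yourself. The class name XaDeadlockProbe, the SimpleXid
helper, and the table are all made up for illustration. Two threads
update the two rows in opposite order inside separate XA transactions,
and whichever one fails reports the SQLState (40001 is Derby's deadlock
state, 40XL1 its lock timeout state), what the XAResource does on
end/rollback, and whether Connection.isValid() still passes afterwards.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import javax.sql.XAConnection;
import javax.sql.XADataSource;
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

public class XaDeadlockProbe {

    /** Bare-bones Xid (hypothetical helper); format id and bytes are arbitrary. */
    static final class SimpleXid implements Xid {
        private final byte[] gtrid;
        SimpleXid(int n) { gtrid = new byte[] { (byte) n }; }
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return gtrid.clone(); }
        @Override public byte[] getBranchQualifier() { return new byte[] { 0 }; }
    }

    /** True if any exception in the chain has SQLState 40001 (deadlock) or 40XL1 (lock timeout). */
    public static boolean isLockFailure(SQLException e) {
        for (SQLException s = e; s != null; s = s.getNextException()) {
            if ("40001".equals(s.getSQLState()) || "40XL1".equals(s.getSQLState())) {
                return true;
            }
        }
        return false;
    }

    /** Updates row 'first' and then row 'second' of the assumed table T in one XA transaction. */
    static String work(XADataSource ds, int id, int first, int second, CountDownLatch gate)
            throws Exception {
        XAConnection xac = ds.getXAConnection();
        try {
            XAResource xar = xac.getXAResource();
            Connection c = xac.getConnection();
            Xid xid = new SimpleXid(id);
            xar.start(xid, XAResource.TMNOFLAGS);
            try (Statement s = c.createStatement()) {
                s.executeUpdate("UPDATE T SET V = V + 1 WHERE ID = " + first);
                gate.countDown();
                // Wait until both threads hold their first row lock.
                gate.await(30, TimeUnit.SECONDS);
                s.executeUpdate("UPDATE T SET V = V + 1 WHERE ID = " + second);
                xar.end(xid, XAResource.TMSUCCESS);
                xar.commit(xid, true); // one-phase commit
                return "committed";
            } catch (SQLException e) {
                String outcome = "SQLState " + e.getSQLState()
                        + ", lockFailure=" + isLockFailure(e);
                try {
                    xar.end(xid, XAResource.TMFAIL);
                    xar.rollback(xid);
                } catch (XAException xe) {
                    outcome += ", XA errorCode " + xe.errorCode;
                }
                // The interesting part: does a validity check still pass?
                return outcome + ", isValid=" + c.isValid(5);
            }
        } finally {
            xac.close();
        }
    }

    /** Runs the two conflicting transactions and prints what each one saw. */
    public static void run(XADataSource ds) throws Exception {
        CountDownLatch gate = new CountDownLatch(2);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<String> a = pool.submit(() -> work(ds, 1, 1, 2, gate));
            Future<String> b = pool.submit(() -> work(ds, 2, 2, 1, gate));
            System.out.println("A: " + a.get());
            System.out.println("B: " + b.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling XaDeadlockProbe.run(ds) with your own XADataSource should show
one thread committing and the other reporting what happened to it. If
the validity check comes back false for the victim, that would at least
be consistent with the pool deciding the connection is bad.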
HTH,
Kathey