From: "Bergquist, Brett"
To: derby-dev@db.apache.org
Date: Wed, 28 Dec 2011 07:49:18 -0800
Subject: Could someone give me some guidance on DERBY-5560

I created https://issues.apache.org/jira/browse/DERBY-5560 and am seeing this in production.

Basically what is happening is that LogicalConnection.close() is being called, which attempts to recycle the physical connection by calling ClientPooledConnection.recycleConnection(). At the same time, ClientPooledConnection.close() is being called, which attempts to call LogicalConnection.nullPhysicalConnection().
The first thread holds a lock on LogicalConnection and needs the lock on ClientPooledConnection; the second thread holds a lock on ClientPooledConnection and needs the lock on LogicalConnection, and a deadlock occurs.

This is occurring because of the configuration of the connection pool in use (ClientXADataSource): the pool is configured to close all connections on any error. The stack trace of the deadlock shows a transaction being committed by the first thread and the connection pool closing all connections on a detected error in the second thread.

I don't see immediately how to synchronize in an orderly way to eliminate the deadlock. The first thread has a handle on the LogicalConnection, which references the physical ClientPooledConnection. The second thread has a handle on the physical ClientPooledConnection, which references the LogicalConnection.

One thought is to change LogicalConnection.close() to be something like:

    public void close() throws SQLException {
        synchronized (physicalConnection_) {
            _close();
        }
    }

    synchronized public void _close() throws SQLException {
        try {
            // we also need to loop thru all the logicalStatements and close them
            if (physicalConnection_ == null) {
                return;
            }
            if (physicalConnection_.agent_.loggingEnabled()) {
                physicalConnection_.agent_.logWriter_.traceEntry(this, "close");
            }
            if (physicalConnection_.isClosed()) // connection is closed or has become stale
            {
                pooledConnection_.informListeners(new SqlException(null,
                    new ClientMessageId(
                        SQLState.PHYSICAL_CONNECTION_ALREADY_CLOSED)));
            } else {
                physicalConnection_.checkForTransactionInProgress();
                physicalConnection_.closeForReuse(
                    pooledConnection_.isStatementPoolingEnabled());
                if (!physicalConnection_.isGlobalPending_()) {
                    pooledConnection_.recycleConnection();
                }
            }
            physicalConnection_ = null;
            pooledConnection_.nullLogicalConnection();
        } catch (SqlException se) {
            throw se.getSQLException();
        }
    }

but this has a problem if physicalConnection_ is already null.
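I'm not sure this maps cleanly onto the real Derby classes, but the usual way out of a lock-order cycle like this is to pick one global order (say, always the pooled connection's monitor first, then the logical connection's) and take the locks in that order on both close paths, snapshotting the reference into a local first so a concurrently-nulled field can't cause an NPE. Here is a toy sketch of just that pattern; the class and method names are made up for illustration and are not the actual Derby code:

```java
// Hypothetical minimal model of the two objects. Both close paths
// acquire the monitors in the same order (pooled first, then logical),
// so the cycle described above cannot form.
class PooledConn {
    volatile LogicalConn logical;

    // pool-side close: pooled monitor (this) first, then logical monitor
    synchronized void close() {
        LogicalConn l = logical;        // snapshot; field may be nulled concurrently
        if (l != null) {
            synchronized (l) {
                l.nullPhysical();
            }
        }
    }

    synchronized void recycle() { /* return the physical connection to the pool */ }
}

class LogicalConn {
    final PooledConn pooled;
    volatile boolean open = true;

    LogicalConn(PooledConn p) { pooled = p; }

    // client-side close: take the pooled monitor FIRST, then our own,
    // instead of the reverse order that produces the deadlock
    void close() {
        synchronized (pooled) {
            synchronized (this) {
                if (!open) return;      // already closed by the other path
                open = false;
                pooled.recycle();       // reentrant: we already hold pooled's monitor
            }
        }
    }

    synchronized void nullPhysical() { open = false; }
}

public class DeadlockSketch {
    // Hammer both close paths concurrently; returns false only if a pair
    // of threads fails to finish (i.e. a deadlock) within the timeout.
    public static boolean run() {
        try {
            for (int i = 0; i < 200; i++) {
                PooledConn p = new PooledConn();
                LogicalConn l = new LogicalConn(p);
                p.logical = l;
                Thread t1 = new Thread(l::close);
                Thread t2 = new Thread(p::close);
                t1.start(); t2.start();
                t1.join(2000); t2.join(2000);
                if (t1.isAlive() || t2.isAlive()) return false;
            }
            return true;
        } catch (InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(run() ? "no deadlock" : "deadlock");
    }
}
```

The snapshot-into-a-local idiom in PooledConn.close() is also what I'd reach for to avoid the "physicalConnection_ is already null" problem in the synchronized(...) of the proposal above.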
Any guidance will be greatly appreciated, as I need to patch Derby 10.8.2.2 to work around this issue in any case.

Brett
