commons-user mailing list archives

From Phil Steitz
Subject Re: [DBCP] Connection just obtained from datasource is invalid
Date Tue, 09 Jan 2018 21:26:35 GMT
On 1/9/18 1:56 PM, Phil Steitz wrote:
> On 1/9/18 11:50 AM, Phil Steitz wrote:
>> On 1/8/18 4:23 PM, Shawn Heisey wrote:
>>> On 11/22/2017 5:00 PM, Phil Steitz wrote:
>>>> If the problem is the evictor closing a connection and having that
>>>> connection delivered to a client, the problem is almost certainly in
>>>> pool.  The thread-safety of the pool in this regard is engineered in
>>>> DefaultPooledObject, which is the wrapper that pool manages and
>>>> delivers to DBCP.  When the evictor visits a PooledObject (in
>>>> GenericObjectPool#evict) it tries to start the eviction test on the
>>>> object by calling its startEvictionTest method.  This method is
>>>> synchronized on the DefaultPooledObject.  Look at the code in that
>>>> method.  It checks to make sure that the object is in fact idle in
>>>> the pool.  The other half of the protection here is in
>>>> GenericObjectPool#borrowObject, which is what PoolingDataSource
>>>> calls to get a connection.  That method tries to get a PooledObject
>>>> from the pool and before handing it out (or validating it), it calls
>>>> the PooledObject's allocate method.  Look at the code for that in
>>>> DefaultPooledObject.  That method (also synchronized on the
>>>> PooledObject) checks that the object is not under eviction and sets
>>>> its state to allocated.  That is the core sync protection that
>>>> *should* make it impossible for the evictor to do anything to an
>>>> object that has been handed out to a client.
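
The guard described above can be sketched roughly as follows. This is a simplified stand-in, not the actual [pool] source: the class name SketchPooledObject, its three states, and its method bodies are assumptions made for illustration, and the real DefaultPooledObject tracks more states and timestamps. The point is only that both paths take the same lock and both require the object to be idle, so whichever gets there first shuts the other out.

```java
// Simplified stand-in for the state guard described above.
// Names and states are illustrative, not the actual commons-pool source.
class SketchPooledObject {
    enum State { IDLE, ALLOCATED, EVICTION }

    private State state = State.IDLE;

    // Borrow path: only an idle object may be handed out.
    synchronized boolean allocate() {
        if (state == State.IDLE) {
            state = State.ALLOCATED;
            return true;
        }
        return false; // under eviction test or already allocated
    }

    // Evictor path: only an idle object may be tested for eviction.
    synchronized boolean startEvictionTest() {
        if (state == State.IDLE) {
            state = State.EVICTION;
            return true;
        }
        return false; // already handed out to a client
    }

    // Evictor finished without destroying the object; make it idle again.
    synchronized void endEvictionTest() {
        if (state == State.EVICTION) {
            state = State.IDLE;
        }
    }

    // Client returned the object to the pool.
    synchronized void deallocate() {
        if (state == State.ALLOCATED) {
            state = State.IDLE;
        }
    }

    synchronized State getState() {
        return state;
    }
}
```

In this sketch, once allocate() has returned true, startEvictionTest() returns false until the object is returned, and the reverse holds while an eviction test is in progress.
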
>>> I see the synchronization you're talking about here.  It appears that
>>> all of the critical methods in DefaultPooledObject are synchronized (on
>>> the object).
>>> If you're absolutely certain that DefaultPooledObject is involved with
>>> all of the implementation my code is using, it all looks pretty complete
>>> to me. 
>> Yes, the code you posted at the top of the thread uses a
>> PoolableConnectionFactory as the object factory for the pool.  You
>> can see that PCF's makeObject returns a DefaultPooledObject, so that
>> much is certain.
>>>  So I'm really curious as to why the connection is getting
>>> closed.  I have seen the problem only minutes after restarting my
>>> program, so it seems unlikely that the server side is closing the
>>> connection, since the timeout for that is 8 hours.
>> I looked back at the initial stack trace and I noticed something
>> that I had not noticed before.
>> This line
>> org.apache.commons.dbcp2.DelegatingConnection.createStatement(
>> means that checkOpen() succeeded.  That, combined with your
>> statement above that isClosed() returns true on a failed connection
>> means that there might be concurrent access to the
>> DelegatingConnections happening.  It looks like the sequence might
>> have been:
>> thread 1: checkOpen() succeeds (the handle looks open)
>> thread 2: closes the DelegatingConnection (there is no sync to
>> prevent this)
>> thread 1: createStatement - bang!
>> thread 1: isClosed() returns true
>> DBCP is not really safe to use that way - i.e., really the intended
>> setup is that individual connection handles are not concurrently
>> accessed by multiple threads.  Is it possible something like this is
>> going on?  Note that what I am talking about here is two different
>> threads holding references to the same connection handle - i.e., no
>> trips back through the pool.
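
A minimal sketch of that check-then-act sequence, assuming a hypothetical SketchHandle class standing in for a connection handle (it is not DelegatingConnection, and the method createStatementOnDriver is invented for the example). Latches force the interleaving listed above so the outcome is repeatable; in real code the same ordering would only happen occasionally.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a connection handle shared by two threads.
class SketchHandle {
    private volatile boolean closed = false;

    // Mirrors the "is this handle still open?" check done before each call.
    void checkOpen() {
        if (closed) {
            throw new IllegalStateException("handle is closed");
        }
    }

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Stand-in for the underlying driver call, which fails once closed.
    String createStatementOnDriver() {
        if (closed) {
            throw new IllegalStateException("driver: connection closed");
        }
        return "statement";
    }
}

class SharedHandleRace {
    static String run() {
        SketchHandle handle = new SketchHandle();
        CountDownLatch checked = new CountDownLatch(1);
        CountDownLatch closedByOther = new CountDownLatch(1);

        // "thread 2": waits until thread 1 has passed its check, then closes.
        Thread other = new Thread(() -> {
            try {
                checked.await();
                handle.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                closedByOther.countDown();
            }
        });
        other.start();

        try {
            handle.checkOpen();       // thread 1: check passes
            checked.countDown();      // let thread 2 close the handle
            closedByOther.await();    // wait until the close has happened
            handle.createStatementOnDriver(); // thread 1: fails here
            return "ok";
        } catch (IllegalStateException e) {
            return "failed after successful check; isClosed=" + handle.isClosed();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            try {
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Nothing in this sketch involves a pool at all, which matches the scenario: two threads holding the same handle are enough to produce a failure right after a successful open check.
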
> I just noticed another thing in [pool] that might have something to
> do with this.  It's probably best to investigate what I have in mind
> on the dev list.  I will post a summary / ticket reference here if
> it turns out this is a bug.

Sorry for the noise.  Bug idea evaporated when I dug into it.


> Phil
>> Phil
>>> I did add the code a while back to test on create, borrow, return, and
>>> while idle, but it turns out that I hadn't actually pulled it down to
>>> the test server and recompiled.  That is now done, so we'll see if that
>>> makes any difference.
>>> If testing the connection on pool actions does make a difference, then
>>> what is your speculation about what was happening when I ran into the
>>> closed connection only minutes after restart, and would it be worthy of
>>> an issue in Jira?  The only theory I had was a race condition between
>>> eviction and borrowing, but unless there's something amiss in how all
>>> the object inheritance works out, it looks like that's probably not it. 
>>> Some kind of issue with the TCP stack in Linux (either on the machines
>>> running my code or the MySQL server) is the only other idea I can think
>>> of.  Or maybe a hardware/firmware issue, since it's likely that at least
>>> one of the NICs involved is doing TCP offload.  I think that virtually
>>> every NIC in our infrastructure has that feature and that Linux enables it.
>>> Thanks,
>>> Shawn
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail:
>>> For additional commands, e-mail:
