commons-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [DBCP] Connection just obtained from datasource is invalid
Date Sat, 18 Nov 2017 23:01:53 GMT
On 11/17/17 3:27 PM, Phil Steitz wrote:
> On 11/17/17 12:22 PM, Shawn Heisey wrote:
>> On 11/16/2017 5:21 PM, Phil Steitz wrote:
>>>> How can a connection that I *just* retrieved from the pool be closed?
>>> Should not happen.  In most cases like this, it turns out not to be
>>> the pool that closed the connection.  The most common causes of this
>>> are network connectivity problems or the server closing the
>>> connection due to idle timeout.  The latter seems unlikely given
>>> that you have idle eviction configured.  Is it possible that some
>>> clients are holding connections for a very long time without using
>>> them?  The evictor only acts on connections that are idle in the
>>> pool (i.e. it does not do anything to connections that are checked
>>> out to clients).
>> Thank you for your reply.
>>
>> My program does its work once a minute, and I had also configured the
>> idle eviction to run once a minute.
>>
>> Just to see what it would do, I slightly changed the run interval for
>> the idle eviction -- for one of the object pools, adding 943
>> milliseconds, and 944 milliseconds for the other.  I've only seen the
>> problem happen once since making that change, where before it would
>> happen several times per day.  This isn't actual evidence, but it is
>> curious.
> See comment below on the config.
>> Although a network problem is always a possibility, it's not likely. 
>> It's a gigabit LAN with Cisco switches.  I will trace the network path
>> and check all the switch logs to see whether that might indeed be the
>> problem.
>>
>>> That should not be possible, as once a connection has been checked
>>> out, it is not eligible for eviction.  And even if it were evicted
>>> by the pool, the pool would set the DelegatingConnection's closed
>>> property to true so isValid would return false rather than
>>> throwing.  It looks like the situation here is that the pool thinks
>>> the connection is open, but the underlying physical connection has
>>> been closed.
>> The code is within a try/catch block that traps all exceptions, so I am
>> 100 percent certain that isValid is not throwing an exception, it's just
>> returning false. 
> OK, sorry I misread your initial post.  That means either the pool
> agrees that the connection is closed (which would point to a pool or
> DBCP bug) or the driver's isValid() on the underlying physical
> connection is returning true (without throwing).
Sorry, I meant "false" above.  DelegatingConnection's isValid is just

public boolean isValid(int timeout) throws SQLException {
        if (isClosed()) {
            return false;
        }
        try {
            return _conn.isValid(timeout);
        }
        catch (SQLException e) {
            handleException(e);
            return false;
        }
    }

Note that the validation query is not executed here.  That query is
only used by the pool lifecycle methods when you have testOnBorrow,
testOnReturn or testWhileIdle set to true.  handleExeption as
overridden by PoolableConnection rethrows.  _conn is the underlying
physical connection,

Phil

>   If it's not too
> hard to do, it would be good to also log what comes back from
> isClosed() on the DelegatingConnection (what dsMaster returns).  If
> that returns true, then either this is a DBCP or pool bug or you
> have somehow closed the DelegatingConnection via another reference
> or something.  If it returns false, that means the pool thinks the
> connection is open so it is unlikely that the pool closed it.
>>  When I first started investigating my strange alarms,
>> I couldn't tell what was wrong at all until I decided to add some code
>> to try a query on the connection that said it wasn't valid.   I fully
>> expected that query to fail, I was just doing it to try and discover why
>> isValid returned false.
>>
>> This is a very recent problem, but the code hasn't had any changes since
>> well before the problem began.  That does lend credibility to the
>> possibility of a network problem, which as I said, I will investigate.
>>
>>> It would be good to see your full pool config and a stack trace of
>>> the exception above.  You might also look at the server logs to see
>>> what is happening on the Mysql side.
>> Tell me exactly what I need to do in order to gather the pool config
>> you're after.  Are you looking for Java code, or some kind of
>> information from the objects that I need to gather?  Here's the code
>> that defines a datasource. 
> What you have below is fine.  I just wanted to see the pool
> configuration settings.  Looks OK, but might end up creating a lot
> of connection churn having maxIdle set to 6 with maxTotal at 30.  I
> guess the idea is that when it runs, you want to have an idle
> connection available for each shard with the ability to add more up
> to a total of 30, but you don't want those idle connections to stay
> in the pool when the work isn't being done.  Do you know how many
> connections actually get opened during the work period?  If that is
> more than a few more than 6, you are kind of defeating the purpose
> of the pool by closing them all when they get returned (as the
> maxIdle setting will force).
>
> The fact that changing the eviction interval to not exactly coincide
> with when the work happens makes the problem go away is troubling. 
> I can see how that might improve performance, but it does make the
> pool bug scenario more likely.
>
> One more question on the config.  Why are you running the evictor
> every minute and using eviction to close idle connections?  The
> maxIdle setting will keep the connection count down.  If what you
> are worried about is server-side idle timeouts, you could use set
> testWhileIdle to true and that would use the connections rather than
> closing them.
>
> Phil
>>  This is the section for the master server, I
>> have a virtually identical code section for the "main" server, which is
>> a replication slave on the MySQL side.
>>
>>   /*
>>    * Create a datasource (connection pool) for the master database server.
>>    */
>>   ConnectionFactory cfMaster = new
>> DriverManagerConnectionFactory(masterUrl, dbUser, dbPass);
>>   PoolableConnectionFactory pcfMaster = new
>> PoolableConnectionFactory(cfMaster, null);
>>   pcfMaster.setValidationQuery(validationQuery);
>>   pcfMaster.setValidationQueryTimeout(Const.FIVE_SECONDS / 1000);
>>   opMaster = new GenericObjectPool<>(pcfMaster);
>>   opMaster.setMaxWaitMillis(Const.THIRTY_SECONDS);
>>   opMaster.setMaxIdle(numShards);
>>   opMaster.setMaxTotal(numShards * 5);
>>   opMaster.setNumTestsPerEvictionRun(numShards * 5);
>>   opMaster.setTimeBetweenEvictionRunsMillis(Const.ONE_MINUTE + 943);
>>   opMaster.setMinEvictableIdleTimeMillis(Const.ONE_MINUTE * 5);
>>   pcfMaster.setPool(opMaster);
>>   dsMaster = new PoolingDataSource<>(opMaster);
>>
>> The numShards value is 6 on the server where I am doing the investigation.
>>
>> The error log on all my MySQL servers isn't saying anything, but we
>> haven't changed the logging config, so I don't know if that's expected.
>>
>> This is the stacktrace when I try a query on a connection that has just
>> failed the isValid check:
>>
>> WARN  - 2017-11-16 13:10:20.229;   418; a; Problem with master DB server
>> noted.  Additional log may follow.
>> com.REDACTED.idxbuild.util.BuildException: Arbitrary DB query failed!
>>         at
>> com.REDACTED.idxbuild.db.Database.runArbitraryQueryOnConnection(Database.java:743)
>>         at
>> com.REDACTED.idxbuild.db.Database.getConnection(Database.java:408)
>>         at com.REDACTED.idxbuild.config.ConfigItem.load(ConfigItem.java:160)
>>         at com.REDACTED.idxbuild.config.ConfigDb.load(ConfigDb.java:251)
>>         at com.REDACTED.idxbuild.index.Chain$Flags.load(Chain.java:320)
>>         at com.REDACTED.idxbuild.index.Chain.run(Chain.java:2561)
>> Caused by:
>> com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No
>> operations allowed after connection closed.
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>> Method)
>>         at
>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at
>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
>>         at com.mysql.jdbc.Util.getInstance(Util.java:408)
>>         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:918)
>>         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
>>         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
>>         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
>>         at
>> com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1187)
>>         at
>> com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1182)
>>         at
>> com.mysql.jdbc.ConnectionImpl.createStatement(ConnectionImpl.java:2374)
>>         at
>> com.mysql.jdbc.ConnectionImpl.createStatement(ConnectionImpl.java:2358)
>>         at
>> org.apache.commons.dbcp2.DelegatingConnection.createStatement(DelegatingConnection.java:262)
>>         at
>> org.apache.commons.dbcp2.DelegatingConnection.createStatement(DelegatingConnection.java:262)
>>         at
>> com.REDACTED.idxbuild.db.Database.runArbitraryQueryOnConnection(Database.java:736)
>>         ... 5 more
>>
>> While reviewing the specifics of that stacktrace, I discovered a tiny
>> bug in my code, where problems talking to the main DB server were being
>> incorrectly classified as problems with the master server.  I think I've
>> fixed that problem, but I will have to wait for it to happen again
>> before I can be sure.
>>
>> Thanks,
>> Shawn
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
>> For additional commands, e-mail: user-help@commons.apache.org
>>
>>


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org


Mime
View raw message