From user-return-31117-archive-asf-public=cust-asf.ponee.io@commons.apache.org  Fri Mar  2 06:51:25 2018
Return-Path: <user-return-31117-archive-asf-public=cust-asf.ponee.io@commons.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by mx-eu-01.ponee.io (Postfix) with SMTP id 9CC8F18062F
	for <archive-asf-public@cust-asf.ponee.io>; Fri,  2 Mar 2018 06:51:24 +0100 (CET)
Received: (qmail 71600 invoked by uid 500); 2 Mar 2018 05:51:22 -0000
Mailing-List: contact user-help@commons.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:user-help@commons.apache.org>
List-Unsubscribe: <mailto:user-unsubscribe@commons.apache.org>
List-Post: <mailto:user@commons.apache.org>
List-Id: <user.commons.apache.org>
Reply-To: "Commons Users List" <user@commons.apache.org>
Delivered-To: mailing list user@commons.apache.org
Received: (qmail 71575 invoked by uid 99); 2 Mar 2018 05:51:21 -0000
Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 02 Mar 2018 05:51:21 +0000
Received: from localhost (localhost [127.0.0.1])
	by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 63F6C1A0241
	for <user@commons.apache.org>; Fri,  2 Mar 2018 05:51:21 +0000 (UTC)
X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org
X-Spam-Flag: NO
X-Spam-Score: -0.112
X-Spam-Level:
X-Spam-Status: No, score=-0.112 tagged_above=-999 required=6.31
	tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
	SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
	autolearn=disabled
Authentication-Results: spamd2-us-west.apache.org (amavisd-new);
	dkim=pass (1024-bit key) header.d=elyograg.org
Received: from mx1-lw-eu.apache.org ([10.40.0.8])
	by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024)
	with ESMTP id lefOxo7qZOw4 for <user@commons.apache.org>;
	Fri,  2 Mar 2018 05:51:19 +0000 (UTC)
Received: from frodo.elyograg.org (frodo.elyograg.org [166.70.79.219])
	by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 8EBDA5F181
	for <user@commons.apache.org>; Fri,  2 Mar 2018 05:51:18 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
	by frodo.elyograg.org (Postfix) with ESMTP id 7885865E4
	for <user@commons.apache.org>; Thu,  1 Mar 2018 22:50:36 -0700 (MST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=elyograg.org; h=
	content-language:content-transfer-encoding:content-type
	:content-type:in-reply-to:mime-version:user-agent:date:date
	:message-id:from:from:references:subject:subject:received
	:received; s=mail; t=1519969836; bh=jobHUxW+BLFT9OPn4+azPoXlOVER
	wksAmjFr91UegrI=; b=ZqSVy3JQL+ZeRINo7F5wy5oCfBYgz7pUR4eIuf5WRcfc
	2hHCfbswxjy65fViw1yu3p6Zp746nj85RLejEsDr4afRCx0o1dnBt/TBmFApJgvy
	ePOdM+Vb4t+u/spCecKWARqmIVmVZTzGXn9QIkl3LcbOvF/CRA8zkFyM30tig24=
X-Virus-Scanned: Debian amavisd-new at frodo.elyograg.org
Received: from frodo.elyograg.org ([127.0.0.1])
	by localhost (frodo.elyograg.org [127.0.0.1]) (amavisd-new, port 10026)
	with ESMTP id OP+LtQGtA6KK for <user@commons.apache.org>;
	Thu,  1 Mar 2018 22:50:36 -0700 (MST)
Received: from [192.168.1.127] (127.int.elyograg.org [192.168.1.127])
	(using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: elyograg@elyograg.org)
	by frodo.elyograg.org (Postfix) with ESMTPSA id 0A7C465CA
	for <user@commons.apache.org>; Thu,  1 Mar 2018 22:50:36 -0700 (MST)
Subject: Re: [DBCP] Connection pool not behaving as I expect
To: user@commons.apache.org
References: <7258bf38-8d72-a00d-ef8c-4f301481e367@elyograg.org>
 <CACZkXPyiemxWV7mbYyTZ00iDs6ResC0nGW2SuZRxYztZS2tcrA@mail.gmail.com>
 <d2136a98-556c-8155-da82-e27021efc6bb@elyograg.org>
 <CACmp6kpU6zukD3Hwm8O4JnAXedkiOfvouQGaGYD5bbb_edHpbg@mail.gmail.com>
From: Shawn Heisey <apache@elyograg.org>
Message-ID: <c1fc32fa-68b2-db8c-0304-fe060dd00016@elyograg.org>
Date: Thu, 1 Mar 2018 22:50:38 -0700
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.6.0
MIME-Version: 1.0
In-Reply-To: <CACmp6kpU6zukD3Hwm8O4JnAXedkiOfvouQGaGYD5bbb_edHpbg@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Content-Language: en-US

On 3/1/2018 8:48 PM, Matt Sicker wrote:
> Take a look inside commons-pool for the instrumentation (e.g., JMX). You
> can also track usage on borrow and other leaks.
>
> Also, Tomcat uses DBCP as it is.

I know I can get those numbers from the object pool, but at the point 
where I needed them, I don't actually know *which* DataSource is being 
used.  It could be one of two.  So I don't know which object pool to 
look at.

Interim solution before I switch to BasicDataSource:  Log info from BOTH 
pools.  I did this, and here's some info that gets logged on the dev 
server where I installed the change:

WARN  - 2018-03-01 21:59:00.313;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.314;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.315;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.316;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.318;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.319;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.320;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.321;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.322;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.323;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.324;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1
WARN  - 2018-03-01 21:59:00.325;   371; c; CP: main-active=0, 
main-idle=1, master-active=0, master-idle=1

This logging happens just before a connection is obtained from the 
pool.  The info logged is the same -- zero active, one or more idle.  
Except during program startup, when there are some lines where both are 
zero.  And once an hour, there is a background thread that does a query 
that takes nearly a minute to run, and while that's happening, I do see 
main-active=1.

I did just now think of one possibility that might explain why this 
program misbehaves when the problem happens, even if DBCP is working 
right.  Based on the logging I've enabled, and the fact that nobody has 
said "oh, we see that all the time, and the problem is probably X" ... I 
think DBCP probably is working right.

====
Basic info about my program: All reads are done via the "main" pool, 
which connects to a slave, unless connections there are not working, 
then reads switch to the "master" pool for the next 1000 connections.  
All writes are done via the "master" pool.

When the master server reaches its connection limit, a completely 
separate process that adds information to the monster table in the DB 
cannot work.  It's not Java-based, and doesn't have connection pooling 
available.

When there are no new docs in the monster table, my program doesn't have 
anything to do, so it doesn't need to make any changes to its control 
table -- it's not going to be using the master pool.  That probably 
results in the idle connection to the master server being evicted five 
minutes after the additions to the database stop.  Then because there's 
now a connection available on the server, the other system can suddenly 
add some documents.  So my program notices the new docs via its main 
pool, processes them, and then it would need to update its control table 
on the master server.  There's no idle connection because it got 
evicted, so it tries to make a new one, and we get an explosion.
====

I still do have the separate problem of why our app servers explode with 
"Too many connections" exceptions, when the DB pools there have dozens 
of active connections.  I didn't write that code.  Maybe their DB access 
(which is primarily through hibernate, a library I don't know much 
about) is not properly releasing connections back to the pool, so the 
pool does not think they're actually idle.  I will need to see what 
logging they can add.

Thanks,
Shawn


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
For additional commands, e-mail: user-help@commons.apache.org