From user-return-31117-archive-asf-public=cust-asf.ponee.io@commons.apache.org Fri Mar 2 06:51:25 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id 9CC8F18062F for ; Fri, 2 Mar 2018 06:51:24 +0100 (CET) Received: (qmail 71600 invoked by uid 500); 2 Mar 2018 05:51:22 -0000 Mailing-List: contact user-help@commons.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: "Commons Users List" Delivered-To: mailing list user@commons.apache.org Received: (qmail 71575 invoked by uid 99); 2 Mar 2018 05:51:21 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 02 Mar 2018 05:51:21 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 63F6C1A0241 for ; Fri, 2 Mar 2018 05:51:21 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -0.112 X-Spam-Level: X-Spam-Status: No, score=-0.112 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=elyograg.org Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id lefOxo7qZOw4 for ; Fri, 2 Mar 2018 05:51:19 +0000 (UTC) Received: from frodo.elyograg.org (frodo.elyograg.org [166.70.79.219]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 8EBDA5F181 for ; Fri, 2 Mar 2018 05:51:18 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by frodo.elyograg.org (Postfix) with ESMTP id 7885865E4 for ; Thu, 1 Mar 2018 22:50:36 -0700 (MST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=elyograg.org; h= content-language:content-transfer-encoding:content-type :content-type:in-reply-to:mime-version:user-agent:date:date :message-id:from:from:references:subject:subject:received :received; s=mail; t=1519969836; bh=jobHUxW+BLFT9OPn4+azPoXlOVER wksAmjFr91UegrI=; b=ZqSVy3JQL+ZeRINo7F5wy5oCfBYgz7pUR4eIuf5WRcfc 2hHCfbswxjy65fViw1yu3p6Zp746nj85RLejEsDr4afRCx0o1dnBt/TBmFApJgvy ePOdM+Vb4t+u/spCecKWARqmIVmVZTzGXn9QIkl3LcbOvF/CRA8zkFyM30tig24= X-Virus-Scanned: Debian amavisd-new at frodo.elyograg.org Received: from frodo.elyograg.org ([127.0.0.1]) by localhost (frodo.elyograg.org [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id OP+LtQGtA6KK for ; Thu, 1 Mar 2018 22:50:36 -0700 (MST) Received: from [192.168.1.127] (127.int.elyograg.org [192.168.1.127]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: elyograg@elyograg.org) by frodo.elyograg.org (Postfix) with ESMTPSA id 0A7C465CA for ; Thu, 1 Mar 2018 22:50:36 -0700 (MST) Subject: Re: [DBCP] Connection pool not behaving as I expect To: user@commons.apache.org References: <7258bf38-8d72-a00d-ef8c-4f301481e367@elyograg.org> From: Shawn Heisey Message-ID: Date: Thu, 1 Mar 2018 22:50:38 -0700 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US On 3/1/2018 8:48 PM, Matt Sicker wrote: > Take a look inside commons-pool for the instrumentation (e.g., JMX). You > can also track usage on borrow and other leaks. > > Also, Tomcat uses DBCP as it is. I know I can get those numbers from the object pool, but at the point where I needed them, I don't actually know *which* DataSource is being used.  It could be one of two.  So I don't know which object pool to look at. Interim solution before I switch to BasicDataSource:  Log info from BOTH pools.  I did this, and here's some info that gets logged on the dev server where I installed the change: WARN  - 2018-03-01 21:59:00.313;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.314;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.315;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.316;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.318;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.319;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.320;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.321;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.322;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.323;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.324;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 WARN  - 2018-03-01 21:59:00.325;   371; c; CP: main-active=0, main-idle=1, master-active=0, master-idle=1 This logging happens just before a connection is obtained from the pool.  The info logged is the same -- zero active, one or more idle.  Except during program startup, when there are some lines where both are zero.  And once an hour, there is a background thread that does a query that takes nearly a minute to run, and while that's happening, I do see main-active=1. I did just now think of one possibility that might explain why this program misbehaves when the problem happens, even if DBCP is working right.  Based on the logging I've enabled, and the fact that nobody has said "oh, we see that all the time, and the problem is probably X" ... I think DBCP probably is working right. ==== Basic info about my program: All reads are done via the "main" pool, which connects to a slave, unless connections there are not working, then reads switch to the "master" pool for the next 1000 connections.  All writes are done via the "master" pool. When the master server reaches its connection limit, a completely separate process that adds information to the monster table in the DB cannot work.  It's not Java-based, and doesn't have connection pooling available. When there are no new docs in the monster table, my program doesn't have anything to do, so it doesn't need to make any changes to its control table -- it's not going to be using the master pool.  That probably results in the idle connection to the master server being evicted five minutes after the additions to the database stop.  Then because there's now a connection available on the server, the other system can suddenly add some documents.  So my program notices the new docs via its main pool, processes them, and then it would need to update its control table on the master server.  There's no idle connection because it got evicted, so it tries to make a new one, and we get an explosion. ==== I still do have the separate problem of why our app servers explode with "Too many connections" exceptions, when the DB pools there have dozens of active connections.  I didn't write that code.  Maybe their DB access (which is primarily through hibernate, a library I don't know much about) is not properly releasing connections back to the pool, so the pool does not think they're actually idle.  I will need to see what logging they can add. Thanks, Shawn --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscribe@commons.apache.org For additional commands, e-mail: user-help@commons.apache.org