Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 82603 invoked from network); 23 Apr 2007 11:26:06 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 23 Apr 2007 11:26:06 -0000 Received: (qmail 6185 invoked by uid 500); 23 Apr 2007 11:26:10 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 6137 invoked by uid 500); 23 Apr 2007 11:26:10 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 6092 invoked by uid 99); 23 Apr 2007 11:26:10 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 23 Apr 2007 04:26:10 -0700 X-ASF-Spam-Status: No, hits=0.6 required=10.0 tests=NO_REAL_NAME,UNPARSEABLE_RELAY X-Spam-Check-By: apache.org Received-SPF: pass (herse.apache.org: local policy) Received: from [192.18.1.36] (HELO gmp-ea-fw-1.sun.com) (192.18.1.36) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 23 Apr 2007 04:26:02 -0700 Received: from d1-emea-09.sun.com (d1-emea-09.sun.com [192.18.2.119]) by gmp-ea-fw-1.sun.com (8.13.6+Sun/8.12.9) with ESMTP id l3NBPees020982 for ; Mon, 23 Apr 2007 11:25:41 GMT Received: from conversion-daemon.d1-emea-09.sun.com by d1-emea-09.sun.com (Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006)) id <0JGY008017LG8900@d1-emea-09.sun.com> (original mail from Dyre.Tjeldvoll@Sun.COM) for derby-user@db.apache.org; Mon, 23 Apr 2007 12:25:40 +0100 (BST) Received: from khepri29.sun.com ([129.159.112.241]) by d1-emea-09.sun.com (Sun Java System Messaging Server 6.2-6.01 (built Apr 3 2006)) with ESMTPSA id <0JGY00AEH7QR9T00@d1-emea-09.sun.com> for derby-user@db.apache.org; Mon, 23 Apr 2007 12:25:40 +0100 (BST) Date: Mon, 23 Apr 2007 13:25:37 +0200 From: Dyre.Tjeldvoll@Sun.COM Subject: Re: Derby process taking up more than 20% of cpu for more than 5hrs with only one user connected. In-reply-to: Sender: Dyre.Tjeldvoll@Sun.COM To: Derby Discussion Message-id: MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT References: User-Agent: Gnus/5.110004 (No Gnus v0.4) Emacs/22.0.50 (usg-unix-v) X-Virus-Checked: Checked by ClamAV on apache.org Sachin Chitale writes: > Hi, > > To give you a background of what I was testing: > > We have process that uses database as our guranteed message store. To give a > stress test to Derby, this program was run to store about 70k+ messages. Then > a subscriber was started to fetch the messages. But it seems that the database > has stopped responding to the requests. > > Initially I was able to get message count using: > ij> select count(*) from M_20070129MESSAGES; > 1 > ----------- > 54598 > > After ~15 minutes. > 1 row selected > ij> select count(*) from M_20070129MESSAGES; > ERROR 40XL1: A lock could not be obtained within the time requested > ij> select count(*) from M_20070129MESSAGES; > ERROR 40XL1: A lock could not be obtained within the time requested > > ij> select count(*) from M_20070129MESSAGES; > ERROR 40XL1: A lock could not be obtained within the time requested > ij> select * from M_20070129MESSAGES where MESSAGEID='000000000000000000001746' > ; > MESSAGEID |MES& > ----------------------------- > ERROR 40XL1: A lock could not be obtained within the time requested > > When I did prstat on the running derby process it gave the following result: > ------------------------------------------- > PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID > 27823 schitale 76 1.0 - - - - 21 - 73 127 767 0 java/28 > 27823 schitale 0.5 0.0 - - - - 62 - 2 38 40 0 java/2 > 27823 schitale 0.5 0.0 - - - - 62 - 2 36 38 0 java/4 > 27823 schitale 0.5 0.0 - - - - 62 - 2 37 39 0 java/3 > 27823 schitale 0.5 0.0 - - - - 62 - 2 36 38 0 java/5 > 27823 schitale 0.2 0.0 - - - - 62 - 0 0 0 0 java/11 > 27823 schitale 0.2 0.0 - - - - 62 - 0 0 0 0 java/12 > 27823 schitale 0.1 0.0 - - - - 62 - 1 0 5 0 java/6 > 27823 schitale 0.1 0.0 - - - - 62 - 1 0 8 0 java/27 > 27823 schitale 0.0 0.0 - - - - 63 - 4 0 12 0 java/8 > 27823 schitale 0.0 0.0 - - - - 63 - 5 0 12 0 java/7 > 27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/1 > 27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/10 > 27823 schitale 0.0 0.0 - - - - 63 - 0 0 0 0 java/15 > 27823 schitale 0.0 0.0 - - - - 100 - 0 0 0 0 java/14 > NLWP USERNAME SIZE RSS MEMORY TIME CPU > 20 schitale 6610M 5433M 34% 1:59.03 23% > Total: 1 processes, 20 lwps, load averages: 1.12, 1.12, 1.21 > > Thread no 28 was taking us most of the CPU time, and 23% was the total CPU > cycle share was taken up by derby process. > > To find out which java thread was taking time I did a jstack of the process > which revealed the following: > > jstack 27823 > -------------------------------------------------------------------------- > Thread t@28: (state = IN_JAVA) > - java.util.Hashtable.get(java.lang.Object) @bci=45, line=339 (Compiled > frame; information may be imprecise) > - org.apache.derby.impl.store.raw.data.FileContainer.getUserPage > (org.apache.derby.impl.store.raw.data.BaseContainerHandle, long, boolean, > boolean) @bci=48 ( > Compiled frame) > - org.apache.derby.impl.store.access.btree.BTreeScan.reposition > (org.apache.derby.impl.store.access.btree.BTreeRowPosition, boolean) @bci=114 > (Compiled frame > ) > - org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows > (org.apache.derby.impl.store.access.btree.BTreeRowPosition, > org.apache.derby.iapi.types > .DataValueDescriptor[][], org.apache.derby.iapi.types.RowLocation[], > org.apache.derby.iapi.store.access.BackingStoreHashtable, long, int[]) @bci=31 > (Compiled > frame) > - org.apache.derby.impl.store.access.btree.BTreeScan.fetchNext > (org.apache.derby.iapi.types.DataValueDescriptor[]) @bci=29 (Compiled frame) > - org.apache.derby.iapi.store.access.DiskHashtable.get(java.lang.Object) > @bci=4 (Interpreted frame) > - org.apache.derby.iapi.store.access.BackingStoreHashtable.get > (java.lang.Object) @bci=27 (Interpreted frame) > - org.apache.derby.impl.sql.execute.HashScanResultSet.getNextRowCore() > @bci=64 (Interpreted frame) > - > org.apache.derby.impl.sql.execute.NestedLoopLeftOuterJoinResultSet.getNextRowCo I'm guessing that you are executing a join that is so large that it spills to disk. This could take a long time and so hold locks for a long time. In addition to trying tableLocks (as was already suggested), you could try to increase the heapSize for your jvm (the amount of memory used in joins before spilling to disk is a percentage of the heap size, I think). You could also try to increase the pageCache size. -- dt