Return-Path: Delivered-To: apmail-activemq-dev-archive@www.apache.org Received: (qmail 47579 invoked from network); 13 Sep 2010 20:07:03 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 13 Sep 2010 20:07:03 -0000 Received: (qmail 23631 invoked by uid 500); 13 Sep 2010 20:07:03 -0000 Delivered-To: apmail-activemq-dev-archive@activemq.apache.org Received: (qmail 23472 invoked by uid 500); 13 Sep 2010 20:07:02 -0000 Mailing-List: contact dev-help@activemq.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@activemq.apache.org Delivered-To: mailing list dev@activemq.apache.org Received: (qmail 23464 invoked by uid 99); 13 Sep 2010 20:07:02 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Sep 2010 20:07:02 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Sep 2010 20:07:01 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o8DK6fAM003872 for ; Mon, 13 Sep 2010 20:06:41 GMT Message-ID: <25375332.15301284408401846.JavaMail.jira@thor> Date: Mon, 13 Sep 2010 16:06:41 -0400 (EDT) From: "Madhu Krishna (JIRA)" To: dev@activemq.apache.org Subject: [jira] Updated: (AMQ-2916) KahaDb not closing opend file after lock check in secondery server: shared file system mode In-Reply-To: <10660688.15271284407680360.JavaMail.jira@thor> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: ae95407df07c98740808b2ef9da0087c [ https://issues.apache.org/activemq/browse/AMQ-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Madhu Krishna updated AMQ-2916: ------------------------------- Affects Version/s: 5.3.0 (was: 5.3.2) > KahaDb not closing opend file after lock check in secondery server: shared file system mode > ------------------------------------------------------------------------------------------- > > Key: AMQ-2916 > URL: https://issues.apache.org/activemq/browse/AMQ-2916 > Project: ActiveMQ > Issue Type: Bug > Components: Broker > Affects Versions: 5.3.0 > Environment: Mac os x 10.6 which NFS file system for shared file system > Reporter: Madhu Krishna > Priority: Critical > > This For KahaDb store. > From the logs it looks like activeMq shared file system broker has a leak in RandomAccessfile lock check. On a lock fail it is not closing the opened files. This is causing the secondary server to run out of number of open files set in the OS. > Current OS is Set to 1024 max open files. > I have been running some tests to validate this and the following is the observations. > 1. Log from secondary server: > 2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:36:22,159 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:36:22,161 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:36:32,161 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:36:32,162 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:36:42,162 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:36:42,174 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:37:02,175 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:37:02,176 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > 2010-09-10 06:37:12,177 [MQ-BROKER-SERVICE] INFO (KahaStore.java:463) - Kaha Store using data directory /nfs/soleta/gdvt/activeMq > 2010-09-10 06:37:12,178 [MQ-BROKER-SERVICE] INFO (KahaPersistenceAdapter.java:180) - Store is locked... waiting 10 seconds for the Store to be unlocked. > Note: As you can see that the secondary server is trying to get the lock on /nfs/soleta/gdvt/activeMq file, it fails as the file has been locked by the primary server. So it tries to get the lock very 10 seconds. > 2. Log from lsof | grep activeMq > java 13855 madhu 84u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 85u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 86u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 87u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 94u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 95u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 96u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > java 13855 madhu 97u REG 14,2 0 7519799 /nfs/soleta/gdvt/activeMq/lock > You can see that the file used for the lock is not getting closed. > 3. Log from lsof | grep activeMq | wc -l > The above cmd will give us the total number of open files for " /nfs/soleta/gdvt/activeMq/lock" . > et3-tk:~ madhu$ lsof | grep activeMq | wc -l > 56 > et3-tk:~ madhu$ lsof | grep activeMq | wc -l > 57 > et3-tk:~ madhu$ lsof | grep activeMq | wc -l > 58 > et3-tk:~ madhu$ lsof | grep activeMq | wc -l > 62 > > As you can see that the open files are increasing on "/nfs/soleta/gdvt/activeMq/lock" and the count doesn't come down. > > Note: The below changes were done only for learning purpose. > I was able to resolve this issue by changing version 5.3.0 activeMq-core > org.apache.activemq.kaha.impl.KahaStore line no 465 > I replace line 465 with the following code > try{ > lock(); > }catch(StoreLockedExcpetion e){ > LOG.error("Got error locking file",e); > lockFile.getChannel().close(); > lockFile.close(); > throw new StoreLockedExcpetion("Kaha Store " + directory.getName() + " is already opened by another application"); > } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.