activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Madhu Krishna (JIRA)" <j...@apache.org>
Subject [jira] Updated: (AMQ-2916) KahaDb not closing opend file after lock check in secondery server: shared file system mode
Date Mon, 13 Sep 2010 20:06:40 GMT

     [ https://issues.apache.org/activemq/browse/AMQ-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Madhu Krishna updated AMQ-2916:
-------------------------------

    Description: 
This For KahaDb store.
>From the logs it looks like activeMq shared file system broker has a leak in RandomAccessfile
lock check. On a lock fail it is not closing the opened files. This is causing the secondary
server to run out of number of open files set in the OS. 
Current OS is Set to 1024 max open files.

I have been running some tests to validate this and the following is the observations. 
1. Log from secondary server: 
2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:22,159 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:22,161 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:32,161 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:32,162 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:42,162 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:42,174 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:37:02,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:37:02,176 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:37:12,177 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:37:12,178 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
Note: As you can see that the secondary server is trying to get the lock on  /nfs/soleta/gdvt/activeMq
file, it fails as the file has been locked by the primary server. So it tries to get the lock
very 10 seconds. 
2. Log from lsof | grep activeMq 
java      13855 madhu   84u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   85u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   86u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   87u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   94u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   95u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   96u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   97u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

You can see that the file used for the lock is not getting closed. 
3. Log from lsof | grep activeMq | wc -l 
The above cmd will give us the total number of open files for " /nfs/soleta/gdvt/activeMq/lock"
. 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      56 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      57 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      58 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      62 
 
As you can see that the open files are increasing on "/nfs/soleta/gdvt/activeMq/lock" and
the count doesn't come down. 
 

Note: The below changes were done only for learning purpose. 
I was able to resolve this issue by changing version 5.3.0 activeMq-core 
org.apache.activemq.kaha.impl.KahaStore line no 465
I replace line 465 with the following code 
         try{
            	lock();
            }catch(StoreLockedExcpetion e){
            	LOG.error("Got error locking file",e);
                lockFile.getChannel().close();
                lockFile.close();
            	  throw new StoreLockedExcpetion("Kaha Store " + directory.getName() + "  is
already opened by another application");
            }


  was:
This For KahaDb store.
>From the logs it looks like activeMq shared file system broker has a leak in RandomAccessfile
lock check. On a lock fail it is not closing the opened files. This is causing the secondary
server to run out of number of open files set in the OS. 
Current OS is Set to 1024 max open files.

I have been running some tests to validate this and the following is the observations. 
1. Log from secondary server: 
2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:22,159 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:22,161 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:32,161 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:32,162 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:42,162 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:42,174 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:37:02,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:37:02,176 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
2010-09-10 06:37:12,177 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
2010-09-10 06:37:12,178 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) - Store
is locked... waiting 10 seconds for the Store to be unlocked. 
Note: As you can see that the secondary server is trying to get the lock on  /nfs/soleta/gdvt/activeMq
file, it fails as the file has been locked by the primary server. So it tries to get the lock
very 10 seconds. 
2. Log from lsof | grep activeMq 
java      13855 madhu   84u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   85u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   86u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   87u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   94u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   95u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   96u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

java      13855 madhu   97u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

You can see that the file used for the lock is not getting closed. 
3. Log from lsof | grep activeMq | wc -l 
The above cmd will give us the total number of open files for " /nfs/soleta/gdvt/activeMq/lock"
. 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      56 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      57 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      58 
et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
      62 
 
As you can see that the open files are increasing on "/nfs/soleta/gdvt/activeMq/lock" and
the count doesn't come down. 
 




> KahaDb not closing opend file after lock check in secondery server: shared file system
mode
> -------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2916
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2916
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.3.0
>         Environment: Mac os x 10.6 which NFS file system for shared file system
>            Reporter: Madhu Krishna
>            Priority: Critical
>
> This For KahaDb store.
> From the logs it looks like activeMq shared file system broker has a leak in RandomAccessfile
lock check. On a lock fail it is not closing the opened files. This is causing the secondary
server to run out of number of open files set in the OS. 
> Current OS is Set to 1024 max open files.
> I have been running some tests to validate this and the following is the observations.

> 1. Log from secondary server: 
> 2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:36:12,157 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:36:22,159 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:36:22,161 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:36:32,161 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:36:32,162 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:36:42,162 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:36:42,174 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:36:52,175 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:37:02,175 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:37:02,176 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> 2010-09-10 06:37:12,177 [MQ-BROKER-SERVICE] INFO  (KahaStore.java:463) - Kaha Store using
data directory /nfs/soleta/gdvt/activeMq 
> 2010-09-10 06:37:12,178 [MQ-BROKER-SERVICE] INFO  (KahaPersistenceAdapter.java:180) -
Store is locked... waiting 10 seconds for the Store to be unlocked. 
> Note: As you can see that the secondary server is trying to get the lock on  /nfs/soleta/gdvt/activeMq
file, it fails as the file has been locked by the primary server. So it tries to get the lock
very 10 seconds. 
> 2. Log from lsof | grep activeMq 
> java      13855 madhu   84u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   85u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   86u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   87u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   94u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   95u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   96u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> java      13855 madhu   97u     REG       14,2         0 7519799 /nfs/soleta/gdvt/activeMq/lock

> You can see that the file used for the lock is not getting closed. 
> 3. Log from lsof | grep activeMq | wc -l 
> The above cmd will give us the total number of open files for " /nfs/soleta/gdvt/activeMq/lock"
. 
> et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
>       56 
> et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
>       57 
> et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
>       58 
> et3-tk:~ madhu$ lsof | grep activeMq | wc -l 
>       62 
>  
> As you can see that the open files are increasing on "/nfs/soleta/gdvt/activeMq/lock"
and the count doesn't come down. 
>  
> Note: The below changes were done only for learning purpose. 
> I was able to resolve this issue by changing version 5.3.0 activeMq-core 
> org.apache.activemq.kaha.impl.KahaStore line no 465
> I replace line 465 with the following code 
>          try{
>             	lock();
>             }catch(StoreLockedExcpetion e){
>             	LOG.error("Got error locking file",e);
>                 lockFile.getChannel().close();
>                 lockFile.close();
>             	  throw new StoreLockedExcpetion("Kaha Store " + directory.getName() + "
 is already opened by another application");
>             }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message