activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias Vetter (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-4171) KahaDB / Journal rotateWriteFIle() sometimes too fast?
Date Tue, 13 Nov 2012 15:28:13 GMT

    [ https://issues.apache.org/jira/browse/AMQ-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496252#comment-13496252
] 

Matthias Vetter commented on AMQ-4171:
--------------------------------------

Hi Timothy, 
you are right, and I am sorry for having created an issue entry too early. Nevertheless, the
delay really helped for some time. But today we detected an fileNotFound Exception inspite
the delay, eventually. 
So we have to keep on searching for the real problem..
Thanks! 

Best regards, 
Matthias Vetter


                
> KahaDB / Journal rotateWriteFIle() sometimes too fast?
> ------------------------------------------------------
>
>                 Key: AMQ-4171
>                 URL: https://issues.apache.org/jira/browse/AMQ-4171
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.6.0
>         Environment: Tested on Windows (7)
>            Reporter: Matthias Vetter
>
> On very fast test machines we got predictable FileNotFoundExceptions when a KahaDB data
file was to be replaced by its successor.
> See this from the log file:
> 30.10.2012 21:00:56.359 [ActiveMQ Transport: tcp:///192.168.111.128:3134] [  ]  master-My-Company-S1-kellner20-vm
Message ID:kellner20-vm-3125-1351618296390-23:1:1:2086:1 sent to queue://My-Company.dxm.request.provisionToTS
> 30.10.2012 21:00:56.359 [BrokerService[master-My-Company-S1-kellner20-vm] Task-197] [
 ]  My-Company.dxm.request.provisionToTS toPageIn: 1, Inflight: 0, pagedInMessages.size 0,
enqueueCount: 2125, dequeueCount: 2124
> -------------------------------------------------------------------------------
> 30.10.2012 21:00:56.390 [ConcurrentQueueStoreAndDispatch] [  ] *** SEVERE ***
> Called from org.apache.activemq.store.kahadb.MessageDatabase.store()
>  KahaDB failed to store to Journal
> -------------------------------------------------------------------------------
> java.io.FileNotFoundException: activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log
(The system cannot find the path specified)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(Unknown Source)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:309)
> 	at org.apache.kahadb.journal.DataFileAppender$1.run(DataFileAppender.java:188)
> 30.10.2012 21:00:56.375 [BrokerService[master-My-Company-S1-kellner20-vm] Task-197] [
 ]  org.apache.activemq.broker.region.cursors.QueueStorePrefetch@1afd356:My-Company.dxm.request.provisionToTS,batchResetNeeded=false,storeHasMessages=true,size=0,cacheEnabled=true
- fillBatch
> 30.10.2012 21:00:56.421 [ConcurrentQueueStoreAndDispatch] [  ] NOTIFICATION:  HighWater
of SEVERE exceeded (value=1, highwater=1)!
> 30.10.2012 21:00:56.437 [ConcurrentQueueStoreAndDispatch] [  ]  Stopping the broker due
to IO exception, java.io.FileNotFoundException: activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log
(The system cannot find the path specified)
> java.io.FileNotFoundException: activemq-data\master-My-Company-S1-kellner20-vm\KahaDB\db-2.log
(The system cannot find the path specified)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(Unknown Source)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:309)
> 	at org.apache.kahadb.journal.DataFileAppender$1.run(DataFileAppender.java:188)
> ********************
> (end of log file quotation)
> ********************
> Having evaluated all possibilities of the failure, we finally patched the following method
of the class Journal::
> synchronized DataFile rotateWriteFile() {
>         int nextNum = !dataFiles.isEmpty() ? dataFiles.getTail().getDataFileId().intValue()
+ 1 : 1;
>         File file = getFile(nextNum);
>         DataFile nextWriteFile = new DataFile(file, nextNum, preferedFileLength);
> PATCH START       try {
>    			Thread.sleep(10000);
> 		} catch (InterruptedException e) {
> 			e.printStackTrace();
> 		}
> PATCH END
>         // actually allocate the disk space
>         fileMap.put(nextWriteFile.getDataFileId(), nextWriteFile);
>         fileByFileMap.put(file, nextWriteFile);
>         dataFiles.addLast(nextWriteFile);
>         return nextWriteFile;
>     }
> *************
> This helped, the IO Exception never occurred again after this patch.
> It seems that the "nextWriteFile"-object was  returned, while the allocating of the new
Data File wasn't finished yet. 
> Anyway, it is very strange that we should have detected this problem, and it never occurred
earlier - at least no similar bug (???) seems to have been reported. Therefore, a critical
examination of possibly deeper causes could be useful.
> Best regards, 
> Matthias Vetter
> ATOS, Frankfurt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message