activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Rodrigues (JIRA)" <j...@apache.org>
Subject [jira] Commented: (AMQ-1264) Kaha crashes after database recovery looking for "data--2"
Date Mon, 28 Apr 2008 10:16:43 GMT

    [ https://issues.apache.org/activemq/browse/AMQ-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=42480#action_42480
] 

Daniel Rodrigues commented on AMQ-1264:
---------------------------------------

Hello, 

I think I am not supposed to re-open this issue, but the fact is, if there is a crash, recovery
takes a  huge amount of time. 

2008-04-28 11:32:21,123 [erSimpleAppMain] INFO  AMQPersistenceAdapter          - Recovered
722013 operations from redo log in 3801.997 seconds.

I understand this might have to be recovered once.  But, suppose we have a SAN master-slave
configuration, basing it on AMQStore: the fact it's taking more than half an hour to recover
renders this topology unusable! 


Moreover, it seems like kaha is not removing log files not containing any more active-data.

For instance, a recent log in a crash:
 
2008-04-28 10:28:59,126 [erSimpleAppMain] INFO  AMQPersistenceAdapter          - Journal Recovery
Started from: DataManager:(data-)
2008-04-28 10:28:59,146 [msg.performance] DEBUG AMQMessageStore                - Doing batch
update... adding: 0 removing: 0
2008-04-28 10:28:59,147 [msg.performance] DEBUG AMQMessageStore                - Batch update
done.
2008-04-28 10:29:02,798 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Tue Apr 15 13:49:33 CEST 2008
2008-04-28 10:29:56,556 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Tue Apr 15 13:50:33 CEST 2008
(Repeats same Tracing until) 
2008-04-28 10:32:07,354 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Thu Apr 17 15:23:33 CEST 2008
2008-04-28 10:32:07,354 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Thu Apr 17 15:26:50 CEST 2008
2008-04-28 10:32:07,354 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Thu Apr 17 16:04:16 CEST 2008
2008-04-28 10:32:07,354 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Fri Apr 18 10:38:25 CEST 2008
2008-04-28 10:32:07,354 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Fri Apr 18 14:43:28 CEST 2008
2008-04-28 10:32:07,355 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Fri Apr 18 16:52:44 CEST 2008
2008-04-28 10:32:07,355 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
RECOVERED Fri Apr 18 17:00:09 CEST 2008
2008-04-28 10:32:11,196 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Wed Apr 23 11:41:09 CEST 2008
(Repeats same tracing until)
2008-04-28 11:32:21,091 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Mon Apr 28 10:26:09 CEST 2008
2008-04-28 11:32:21,091 [erSimpleAppMain] DEBUG AMQPersistenceAdapter          - TRACE Entry:
CHECKPOINT Mon Apr 28 10:27:09 CEST 2008
2008-04-28 11:32:21,123 [erSimpleAppMain] INFO  AMQPersistenceAdapter          - Recovered
722013 operations from redo log in 3801.997 secon
ds.
2008-04-28 11:32:21,123 [erSimpleAppMain] INFO  AMQPersistenceAdapter          - Finished
recovering the ReferenceStore



So, only very few operations were recovered, and most of the time the recovering time seems
to have been spent unecessarily.

Even assuming this is necessary once, why is it necessary to go back this much in time if
we failover again?


Thanks,
daniel

> Kaha crashes after database recovery looking for "data--2"
> ----------------------------------------------------------
>
>                 Key: AMQ-1264
>                 URL: https://issues.apache.org/activemq/browse/AMQ-1264
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store
>    Affects Versions: 5.0.0
>         Environment: apache-activemq-4.2-20070602
> tom@light:~/activemq$ java -version
> java version "1.5.0"
> Java(TM) 2 Runtime Environment, Standard Edition (build diablo-1.5.0-b01)
> Java HotSpot(TM) Server VM (build diablo-1.5.0_07-b01, mixed mode)
>            Reporter: Tom Samplonius
>            Assignee: Rob Davies
>             Fix For: 5.0.0
>
>
>   There is a problem when using the default persistence settings, and ActiveMQ is shutdown
uncleanly with messages in a queue.  After a restart (and log recovery), and attempt to read
a message out of the queue, generates the following error (and ActiveMQ hangs).
>   It appears that there are too many "-" in a concatenation somewhere, as the logfiles
are "data-1", "data-2", etc, not "data--1", which is probably is what is generating the exception.
>   This is with 4.2-SNAPSHOT-20070602.
> ERROR AsyncDataManager               - Looking for key 1 but not found in fileMap: {3=data-3
number = 3 , length = 9051046 refCount = 3}
> ERROR QueueStorePrefetch             - Failed to fill batch
> java.io.IOException: Failed to read to journal for: offset = 20785811, file = 1, size
= -1, type = 0. Reason: java.io.IOException: Could not locate data file data--1
>         at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:31)
>         at org.apache.activemq.store.amq.AMQPersistenceAdapter.createReadException(AMQPersistenceAdapter.java:529)
>         at org.apache.activemq.store.amq.AMQPersistenceAdapter.readCommand(AMQPersistenceAdapter.java:409)
>         at org.apache.activemq.store.amq.AMQMessageStore.getMessage(AMQMessageStore.java:382)
>         at org.apache.activemq.store.amq.RecoveryListenerAdapter.recoverMessageReference(RecoveryListenerAdapter.java:52)
>         at org.apache.activemq.store.kahadaptor.KahaReferenceStore.recover(KahaReferenceStore.java:63)
>         at org.apache.activemq.store.kahadaptor.KahaReferenceStore.recoverNextMessages(KahaReferenceStore.java:89)
>         at org.apache.activemq.store.amq.AMQMessageStore.recoverNextMessages(AMQMessageStore.java:477)
>         at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.fillBatch(QueueStorePrefetch.java:159)
>         at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.hasNext(QueueStorePrefetch.java:110)
>         at org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext(StoreQueueCursor.java:122)
>         at org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1007)
>         at org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:992)
>         at org.apache.activemq.broker.region.Queue.addSubscription(Queue.java:215)
>         at org.apache.activemq.broker.region.AbstractRegion.addConsumer(AbstractRegion.java:241)
>         at org.apache.activemq.broker.region.RegionBroker.addConsumer(RegionBroker.java:341)
>         at org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:87)
>         at org.apache.activemq.advisory.AdvisoryBroker.addConsumer(AdvisoryBroker.java:79)
>         at org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:87)
>         at org.apache.activemq.broker.MutableBrokerFilter.addConsumer(MutableBrokerFilter.java:97)
>         at org.apache.activemq.broker.TransportConnection.processAddConsumer(TransportConnection.java:566)
>         at org.apache.activemq.command.ConsumerInfo.visit(ConsumerInfo.java:313)
>         at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:282)
>         at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:188)
>         at org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.java:65)
>         at org.apache.activemq.transport.stomp.StompTransportFilter.sendToActiveMQ(StompTransportFilter.java:81)
>         at org.apache.activemq.transport.stomp.ProtocolConverter.sendToActiveMQ(ProtocolConverter.java:119)
>         at org.apache.activemq.transport.stomp.ProtocolConverter.onStompSubscribe(ProtocolConverter.java:351)
>         at org.apache.activemq.transport.stomp.ProtocolConverter.onStompCommad(ProtocolConverter.java:149)
>         at org.apache.activemq.transport.stomp.StompTransportFilter.onCommand(StompTransportFilter.java:71)
>         at org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:84)
>         at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:137)
>         at java.lang.Thread.run(Thread.java:595)
> Caused by: java.io.IOException: Could not locate data file data--1
>         at org.apache.activemq.kaha.impl.async.AsyncDataManager.getDataFile(AsyncDataManager.java:259)
>         at org.apache.activemq.kaha.impl.async.AsyncDataManager.read(AsyncDataManager.java:477)
>         at org.apache.activemq.store.amq.AMQPersistenceAdapter.readCommand(AMQPersistenceAdapter.java:406)
>         ... 30 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message