activemq-dev mailing list archives

From "Gary Tully (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-5658) ActiveMQ will not start after KahaDB Corruption due to "Protocol message contained an invalid tag (zero)" error
Date Thu, 12 Mar 2015 13:55:38 GMT

    [ https://issues.apache.org/jira/browse/AMQ-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358690#comment-14358690
] 

Gary Tully commented on AMQ-5658:
---------------------------------

Are you using checkForCorruptJournalFiles and checksumJournalFiles?

If you are, any corrupt journal regions should be removed from the index, so no attempt
should be made to read them.
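
For reference, a minimal sketch of where these two options would normally go: they are attributes of the kahaDB element inside persistenceAdapter in activemq.xml. The directory value here is an assumption taken from the stock configuration, not from the reporter's setup.

```xml
<!-- Sketch only: the directory value is assumed, adjust to the actual broker layout. -->
<persistenceAdapter>
    <kahaDB directory="${activemq.data}/kahadb"
            checkForCorruptJournalFiles="true"
            checksumJournalFiles="true"/>
</persistenceAdapter>
```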

Please add your KahaDB configuration and, if possible, a full stack trace for the exceptions.
There may be a case for dealing with the exception at a lower level.
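
To illustrate what "dealing with the exception at a lower level" could look like in principle, here is a self-contained sketch that catches the decode failure per record and skips that record instead of aborting the whole recovery. It is not KahaDB code: the class, the length-prefixed record layout, CorruptRecordException and the "first byte is the tag" rule are all hypothetical stand-ins, and the approach only works while record boundaries are still intact.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; none of these names exist in ActiveMQ.
final class SkipCorruptRecords {

    /** Stand-in for a decode failure such as InvalidProtocolBufferException. */
    static final class CorruptRecordException extends Exception {
        CorruptRecordException(String message) { super(message); }
    }

    /** Decodes one payload: first byte is a tag, the rest is UTF-8 text. A zero tag is invalid. */
    static String decode(byte[] payload) throws CorruptRecordException {
        if (payload.length == 0 || payload[0] == 0) {
            throw new CorruptRecordException("invalid tag (zero)");
        }
        return new String(payload, 1, payload.length - 1, StandardCharsets.UTF_8);
    }

    /** Walks length-prefixed records and keeps every record that decodes cleanly. */
    static List<String> recover(byte[] journal) {
        List<String> recovered = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(journal);
        while (buf.remaining() >= 4) {
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                break; // the framing itself is damaged, nothing more can be trusted
            }
            byte[] payload = new byte[length];
            buf.get(payload);
            try {
                recovered.add(decode(payload));
            } catch (CorruptRecordException e) {
                // Discard only this record and continue with the next one.
            }
        }
        return recovered;
    }

    /** Builds one length-prefixed record (helper for the demo). */
    static byte[] frame(int tag, String body) {
        byte[] text = body.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + text.length);
        buf.putInt(1 + text.length).put((byte) tag).put(text);
        return buf.array();
    }

    /** Concatenates records into one journal image (helper for the demo). */
    static byte[] concat(byte[]... parts) {
        int total = 0;
        for (byte[] part : parts) {
            total += part.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (byte[] part : parts) {
            buf.put(part);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] journal = concat(frame(1, "first"), frame(0, "damaged"), frame(1, "third"));
        System.out.println(recover(journal));
    }
}
```

The trade-off is that a skipped record is silently lost, so a real implementation would at least need to log the journal file and offset of each discarded record.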

Unfortunately there is not much in the line of tooling that can help in this case.

> ActiveMQ will not start after KahaDB Corruption due to "Protocol message contained an
invalid tag (zero)" error
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-5658
>                 URL: https://issues.apache.org/jira/browse/AMQ-5658
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.10.0
>         Environment: Windows 7
>            Reporter: Paul Manning
>              Labels: corruption, journal, kahaDB, messageStore, protobuf
>
> We experienced an ActiveMQ crash where the KahaDB data files were corrupted. The machine
was powered down abruptly (the plug was pulled).
> When the machine restarted, ActiveMQ would not start and the following entries were in
the activemq.log:
> 2015-03-05 09:25:46,791 | INFO  | Corrupt journal records found in 'c:\work\09_git\vc-core\vc-server\build\data\kahadb\db-131.log'
between offsets: 31054572..31231936 | org.apache.activemq.store.kahadb.disk.journal.Journal
| WrapperSimpleAppMain
> followed eventually by: 
> 2015-03-05 09:25:48,375 | ERROR | Failed to start Apache ActiveMQ ([broker-USATL-L-008043.americas.abb.com-0,
null], org.apache.activemq.protobuf.InvalidProtocolBufferException: Protocol message contained
an invalid tag (zero).) | org.apache.activemq.broker.BrokerService | WrapperSimpleAppMain
> Removing the .data files and the corrupted db-131.log file allows ActiveMQ to restart.
However, in that case, we experience message loss. 
> Is it possible to only lose the corrupted record instead of the whole data file? 
> Tracing through the code, it does not appear that there is any attempt to catch the
InvalidProtocolBufferException and discard the corrupted record. The exception is raised
from CodedInputStream.readTag() during the MessageDatabase.recover() process.
> It is worth noting that we have not been able to reproduce this error. I imagine that
this type of corruption is rare, but is there any way for a user to recover from it? Are
there any tools, etc.?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
