From: Joe Carter
To: users@activemq.apache.org
Subject: Re: KahaDB corruption
Date: Wed, 27 Jul 2011 09:39:38 +0100
References: <1311679528547-3695392.post@n4.nabble.com>

Thanks Gary - I'd moved from embedded to a separate broker and had missed
this completely.

For reference, I added the following to the kahaDB element in
conf/activemq.xml:

    ignoreMissingJournalfiles="true"
    checkForCorruptJournalFiles="true"
    checksumJournalFiles="true"
    />

I pushed the corrupted database back in place and it recovered correctly.

2011-07-27 09:31:19,644 | INFO | Recovery replayed 2 operations from the journal in 0.036 seconds. | org.apache.activemq.store.kahadb.MessageDatabase | main
2011-07-27 09:31:19,649 | INFO | Detected missing/corrupt journal files. Dropped 5 messages from the index in 0.0030 seconds. | org.apache.activemq.store.kahadb.MessageDatabase | main

So I can confirm the recovery code is working for me.

Thanks.
Joe

On 26 July 2011 12:59, Gary Tully wrote:
> The flags checksumJournalFiles, checkForCorruptJournalFiles and
> ignoreMissingJournalfiles are designed for this use case. Have you
> those enabled?
>
> http://activemq.apache.org/kahadb.html
>
> On 26 July 2011 12:25, JoeC wrote:
>> I'm currently on 5.5.0 and ran into a different and unrecoverable kahadb
>> case.
>> I ran the system out of disk space and, not unreasonably, activemq didn't
>> like it.
>> After freeing up some space I ran into database corruption as follows.
>> 2011-07-26 10:00:23,316 | INFO | Corrupt journal records found in '/opt/ivb/apache-activemq-5.5.0/data/kahadb/db-326.log' between offsets: 19460423-21031378 | org.apache.kahadb.journal.Journal | main
>> ...
>> 2011-07-26 10:00:23,826 | INFO | Recovering from the journal ... | org.apache.activemq.store.kahadb.MessageDatabase | main
>> 2011-07-26 10:00:23,953 | ERROR | Failed to start ActiveMQ JMS Message Broker. Reason: org.apache.activemq.protobuf.InvalidProtocolBufferException: Protocol message contained an invalid tag (zero). | org.apache.activemq.broker.BrokerService | main
>>
>> Removing the db.data made no difference.
>> I then removed the db-326.log file and restarted twice.
>> The first time it complains about not finding db-326.log.
>> The second time it uses a newly created db-1.log.
>>
>> Fortunately this was not a production environment, so the data doesn't
>> matter; however, I would like a way of recovering the data. This could
>> even be an offline process,
>> i.e. I quickly reset the database to restore service and then push in the
>> older messages later.
>> My application domain is somewhat tolerant of that approach but it is not
>> tolerant of extended outages.
>> For me, I'd rather (temporarily) lose some data than have a long outage,
>> so a fully automated recovery is what I'd ideally like, irrespective of
>> corruption.
>>
>> Cheers
>> Joe
>>
>>
>> JoeC wrote:
>>>
>>> I've upgraded to 5.4.2 and will let you know how it goes.
>>> I didn't rebuild the index as I've already restarted the process.
>>> In normal operation the queues should be empty for our application so
>>> that was not an issue for me.
>>>
>>> Thanks
>>> Joe
>>>
>>> On 23 February 2011 18:06, Gary Tully <gary.tully@gmail.com> wrote:
>>>> 5.4.2 is better w.r.t. abortive shutdown, but for this case, rebuilding
>>>> the index should work.
>>>> Remove kahadb/db.data and restart; it will parse the journal to
>>>> rebuild the index.
>>>>
>>>
>>
>>
>> --
>> View this message in context: http://activemq.2283324.n4.nabble.com/KahaDB-corruption-tp3321382p3695392.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>
>
>
>
> --
> http://fusesource.com
> http://blog.garytully.com
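
For anyone finding this thread in the archive: the three flags discussed above are attributes of the kahaDB persistence adapter in conf/activemq.xml. A minimal sketch of how the element might look with all three enabled is below. The directory value is an assumption (it is the stock setting from a default 5.x configuration, not something quoted in this thread), so adjust it to match your own install.

```xml
<!-- Sketch only. The directory attribute is an assumed default,
     not taken from the thread; the three flags are the ones named above. -->
<persistenceAdapter>
  <kahaDB directory="${activemq.base}/data/kahadb"
          ignoreMissingJournalfiles="true"
          checkForCorruptJournalFiles="true"
          checksumJournalFiles="true"/>
</persistenceAdapter>
```

As the recovery log in the reply shows, starting the broker with these flags can drop messages that sit in the damaged part of the journal, so this trades some data loss for a broker that comes back up.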