Return-Path: X-Original-To: apmail-activemq-dev-archive@www.apache.org Delivered-To: apmail-activemq-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 165B6E6E8 for ; Fri, 1 Mar 2013 16:07:14 +0000 (UTC) Received: (qmail 36361 invoked by uid 500); 1 Mar 2013 16:07:13 -0000 Delivered-To: apmail-activemq-dev-archive@activemq.apache.org Received: (qmail 36260 invoked by uid 500); 1 Mar 2013 16:07:13 -0000 Mailing-List: contact dev-help@activemq.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@activemq.apache.org Delivered-To: mailing list dev@activemq.apache.org Received: (qmail 35981 invoked by uid 99); 1 Mar 2013 16:07:12 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 01 Mar 2013 16:07:12 +0000 Date: Fri, 1 Mar 2013 16:07:12 +0000 (UTC) From: "Ken Barber (JIRA)" To: dev@activemq.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Created] (AMQ-4357) Corrupt KahaDB Index may cause EOFException on first message MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 Ken Barber created AMQ-4357: ------------------------------- Summary: Corrupt KahaDB Index may cause EOFException on first message Key: AMQ-4357 URL: https://issues.apache.org/jira/browse/AMQ-4357 Project: ActiveMQ Issue Type: Bug Components: Message Store Affects Versions: 5.8.0 Environment: Java 1.6.0 most releases, tested on lots of different hardware, tested on Linux distros only. Reporter: Ken Barber Similar to AMQ-4339 If the index is corrupted, ActiveMQ may receive an EOFException when the first message is published. The easiest way to replicate this for us has been to purposely put junk into a journal as so: truncate -s 32700 db.data The exception in this case has been (this error is specific to AMQ 5.6.0, so lines may mismatch on 5.8.0, but similar problem was confirmed on 5.8.0): {code} 2013-03-01 16:05:06,451 ERROR [kahadb.MessageDatabase] KahaDB failed to store to Journal java.io.EOFException at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399) at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377) at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867) at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446) at org.apache.kahadb.page.Transaction$2.(Transaction.java:437) at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434) at org.apache.kahadb.page.Transaction.load(Transaction.java:410) at org.apache.kahadb.page.Transaction.load(Transaction.java:367) at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262) at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174) at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189) at org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197) at org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027) at org.apache.kahadb.page.Transaction.execute(Transaction.java:769) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025) at org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972) at org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863) at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431) at org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) 2013-03-01 16:05:06,743 WARN [server.AbstractHttpConnection] /v2/commands/ org.springframework.jms.UncategorizedJmsException: Uncategorized exception occured during JMS processing; nested exception is javax.jms.JMSException: java.io.EOFException at org.springframework.jms.support.JmsUtils.convertJmsAccessException(JmsUtils.java:316) at org.springframework.jms.support.JmsAccessor.convertJmsAccessException(JmsAccessor.java:168) at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:469) at org.springframework.jms.core.JmsTemplate.send(JmsTemplate.java:543) at org.springframework.jms.core.JmsTemplate.convertAndSend(JmsTemplate.java:653) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) at clamq.jms$jms_producer$reify__3138.publish(jms.clj:29) at clamq.jms$jms_producer$reify__3138.publish(jms.clj:30) at clamq.protocol.producer$eval3094$fn__3095$G__3086__3103.invoke(producer.clj:3) at clamq.protocol.producer$eval3094$fn__3095$G__3085__3112.invoke(producer.clj:3) at clojure.lang.AFn.applyToHelper(AFn.java:167) at clojure.lang.AFn.applyTo(AFn.java:151) at clojure.core$apply.invoke(core.clj:603) at com.puppetlabs.mq$connect_and_publish_BANG_.doInvoke(mq.clj:136) at clojure.lang.RestFn.invoke(RestFn.java:439) at com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_$fn__3754.invoke(command.clj:254) at com.puppetlabs.puppetdb.command$enqueue_raw_command_BANG_.invoke(command.clj:253) at com.puppetlabs.puppetdb.http.v1.command$enqueue_command.invoke(command.clj:22) at com.puppetlabs.middleware$verify_accepts_content_type$fn__4240.invoke(middleware.clj:67) at com.puppetlabs.middleware$verify_checksum$fn__4249.invoke(middleware.clj:102) at com.puppetlabs.middleware$verify_param_exists$fn__4244.invoke(middleware.clj:79) at com.puppetlabs.puppetdb.http.v1.command$command_app.invoke(command.clj:27) at com.puppetlabs.puppetdb.http.v2$v2_app$fn__6640.invoke(v2.clj:12) at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54) at com.puppetlabs.puppetdb.http.v2$v2_app.invoke(v2.clj:12) at com.puppetlabs.puppetdb.http.server$routes$fn__7042.invoke(server.clj:27) at net.cgrand.moustache$alter_request$fn__4450.invoke(moustache.clj:54) at com.puppetlabs.puppetdb.http.server$routes.invoke(server.clj:27) at ring.middleware.resource$wrap_resource$fn__7022.invoke(resource.clj:14) at ring.middleware.params$wrap_params$fn__4400.invoke(params.clj:55) at com.puppetlabs.middleware$wrap_with_authorization$fn__4226.invoke(middleware.clj:21) at com.puppetlabs.middleware$wrap_with_certificate_cn$fn__4230.invoke(middleware.clj:36) at com.puppetlabs.middleware$wrap_with_default_body$fn__4233.invoke(middleware.clj:43) at com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253$fn__4254.invoke(middleware.clj:119) at com.puppetlabs.middleware.proxy$java.lang.Object$Callable$f8c5758f.call(Unknown Source) at com.yammer.metrics.core.Timer.time(Timer.java:91) at com.puppetlabs.middleware$wrap_with_metrics_STAR_$fn__4253.invoke(middleware.clj:117) at com.puppetlabs.middleware$wrap_with_globals$fn__4236.invoke(middleware.clj:54) at ring.adapter.jetty$proxy_handler$fn__4071.invoke(jetty.clj:18) at ring.adapter.jetty.proxy$org.eclipse.jetty.server.handler.AbstractHandler$0.handle(Unknown Source) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111) at org.eclipse.jetty.server.Server.handle(Server.java:349) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:452) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:894) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:948) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76) at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609) at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534) at java.lang.Thread.run(Thread.java:680) Caused by: javax.jms.JMSException: java.io.EOFException at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49) at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1362) at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1290) at org.apache.activemq.ActiveMQSession.send(ActiveMQSession.java:1785) at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:277) at org.apache.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:212) at org.apache.activemq.ActiveMQMessageProducerSupport.send(ActiveMQMessageProducerSupport.java:269) at org.springframework.jms.connection.CachedMessageProducer.send(CachedMessageProducer.java:117) at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:592) at org.springframework.jms.core.JmsTemplate.doSend(JmsTemplate.java:569) at org.springframework.jms.core.JmsTemplate$4.doInJms(JmsTemplate.java:546) at org.springframework.jms.core.JmsTemplate.execute(JmsTemplate.java:466) ... 55 more Caused by: java.util.concurrent.ExecutionException: java.io.EOFException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:785) at org.apache.activemq.broker.region.Queue.send(Queue.java:707) at org.apache.activemq.broker.region.AbstractRegion.send(AbstractRegion.java:407) at org.apache.activemq.broker.region.RegionBroker.send(RegionBroker.java:503) at org.apache.activemq.broker.jmx.ManagedRegionBroker.send(ManagedRegionBroker.java:305) at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) at org.apache.activemq.broker.scheduler.SchedulerBroker.send(SchedulerBroker.java:189) at org.apache.activemq.broker.BrokerFilter.send(BrokerFilter.java:129) at org.apache.activemq.broker.CompositeDestinationBroker.send(CompositeDestinationBroker.java:96) at org.apache.activemq.broker.TransactionBroker.send(TransactionBroker.java:306) at org.apache.activemq.broker.MutableBrokerFilter.send(MutableBrokerFilter.java:135) at org.apache.activemq.broker.TransportConnection.processMessage(TransportConnection.java:453) at org.apache.activemq.command.ActiveMQMessage.visit(ActiveMQMessage.java:681) at org.apache.activemq.broker.TransportConnection.service(TransportConnection.java:292) at org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConnection.java:150) at org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116) at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50) at org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:231) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) ... 1 more Caused by: java.io.EOFException at java.io.RandomAccessFile.readFully(RandomAccessFile.java:399) at java.io.RandomAccessFile.readFully(RandomAccessFile.java:377) at org.apache.kahadb.page.PageFile.readPage(PageFile.java:867) at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:446) at org.apache.kahadb.page.Transaction$2.(Transaction.java:437) at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:434) at org.apache.kahadb.page.Transaction.load(Transaction.java:410) at org.apache.kahadb.page.Transaction.load(Transaction.java:367) at org.apache.kahadb.index.BTreeIndex.loadNode(BTreeIndex.java:262) at org.apache.kahadb.index.BTreeIndex.getRoot(BTreeIndex.java:174) at org.apache.kahadb.index.BTreeIndex.put(BTreeIndex.java:189) at org.apache.activemq.store.kahadb.MessageDatabase.upadateIndex(MessageDatabase.java:1197) at org.apache.activemq.store.kahadb.MessageDatabase$14.execute(MessageDatabase.java:1027) at org.apache.kahadb.page.Transaction.execute(Transaction.java:769) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:1025) at org.apache.activemq.store.kahadb.MessageDatabase$13.visit(MessageDatabase.java:972) at org.apache.activemq.store.kahadb.data.KahaAddMessageCommand.visit(KahaAddMessageCommand.java:241) at org.apache.activemq.store.kahadb.MessageDatabase.process(MessageDatabase.java:969) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:881) at org.apache.activemq.store.kahadb.MessageDatabase.store(MessageDatabase.java:863) at org.apache.activemq.store.kahadb.KahaDBStore$KahaDBMessageStore.addMessage(KahaDBStore.java:431) at org.apache.activemq.store.kahadb.KahaDBStore$StoreQueueTask.run(KahaDBStore.java:1177) ... 3 more {code} At the moment the only work-around I have is to blank out the db.data and let it get recreated from the journal. I just want to stress that the corruption wasn't caused by KahaDB, at least we haven't seen any cases. The corruptions were either caused by: disk filling up, or bad copies when migrating directories etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira