From: mlange
To: users@activemq.apache.org
Date: Fri, 23 Sep 2016 04:39:33 -0700 (PDT)
Subject: ActiveMQ ReplicatedLevelDB corruption
Message-ID: <1474630773651-4716831.post@n4.nabble.com>

Recently I installed Apache ActiveMQ in a few different ways. One of those uses ReplicatedLevelDB for a master/slave/slave setup.

Yesterday I did a bit of load testing: I sent 100,000 messages with JMeter, using 100 producer threads (so each thread produced 1,000 messages). Another process moved the messages from one broker to another and back again (the queues had the same names on each broker, so moving them was easy), and then I processed the messages, which caused them to flow across various queues.

All seemed fine and everything looked okay... until I stopped the active broker (this was hours after the last message was consumed and processed). Then I noticed a few bouncing brokers: one came up but crashed on an EOFException; a bit later the other broker did the same.
In the log I see many messages like this:

[quote]
2016-09-23 13:38:52,950 | WARN | No reader available for position: 0, log_infos: {11534500540=LogInfo(/data/activemq/broker1-db/00000002af8282bc.log,11534500540,104858130), 12163654223=LogInfo(/data/activemq/broker1-db/00000002d502a24f.log,12163654223,104858162), 12897666570=LogInfo(/data/activemq/broker1-db/0000000300c2c60a.log,12897666570,104859912), 13002526482=LogInfo(/data/activemq/broker1-db/000000030702cf12.log,13002526482,104859038), 18455209795=LogInfo(/data/activemq/broker1-db/000000044c042743.log,18455209795,104859837), 22020442500=LogInfo(/data/activemq/broker1-db/0000000520854984.log,22020442500,104859288), 23173898306=LogInfo(/data/activemq/broker1-db/000000056545a042.log,23173898306,104860684), 24641928389=LogInfo(/data/activemq/broker1-db/00000005bcc5fcc5.log,24641928389,0)} | org.apache.activemq.leveldb.RecordLog | Thread-1039
[/quote]

Then I see messages like this:

[quote]
2016-09-23 13:38:46,324 | WARN | Invalid log position: 11409726550 | org.apache.activemq.leveldb.LevelDBClient | ActiveMQ BrokerService[broker1] Task-3
[/quote]

After that, the broker starts and logs a few messages like this:

[quote]
2016-09-23 13:40:49,041 | WARN | Invalid log position: 0 | org.apache.activemq.leveldb.LevelDBClient | Thread-1040
[/quote]

and then we get an exception:

[quote]
2016-09-23 13:41:09,748 | INFO | Stopping BrokerService[broker1] due to exception, java.io.EOFException: File '/data/activemq/broker1-db/000000030702cf12.log' offset: 110647192 | org.apache.activemq.util.DefaultIOExceptionHandler | LevelDB IOException handler. java.io.EOFException: File '/data/activemq/broker1-db/000000030702cf12.log' offset: 110647192
[/quote]

This is a rinse-and-repeat situation; both remaining brokers now alternate through this sequence. It seems the load I generated corrupted the database, but this should not be possible... What information can I provide to help work out how this situation can be avoided?
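One way to sanity-check the warnings above is to test whether the reported positions fall inside any of the log files listed in log_infos. The following is a minimal Python sketch, not ActiveMQ code. It assumes that each LogInfo(...) entry is (file, start position, length), which is an inference from the file names looking like the start position in hexadecimal (for example 0x2af8282bc = 11534500540). The helper names parse_log_infos and find_log are hypothetical.

```python
import re

# log_infos entries copied from the WARN line above.
LOG_INFOS = """
LogInfo(/data/activemq/broker1-db/00000002af8282bc.log,11534500540,104858130)
LogInfo(/data/activemq/broker1-db/00000002d502a24f.log,12163654223,104858162)
LogInfo(/data/activemq/broker1-db/0000000300c2c60a.log,12897666570,104859912)
LogInfo(/data/activemq/broker1-db/000000030702cf12.log,13002526482,104859038)
LogInfo(/data/activemq/broker1-db/000000044c042743.log,18455209795,104859837)
LogInfo(/data/activemq/broker1-db/0000000520854984.log,22020442500,104859288)
LogInfo(/data/activemq/broker1-db/000000056545a042.log,23173898306,104860684)
LogInfo(/data/activemq/broker1-db/00000005bcc5fcc5.log,24641928389,0)
"""

# Assumed field order: file, start position, length.
LOG_INFO_RE = re.compile(r"LogInfo\(([^,]+),(\d+),(\d+)\)")


def parse_log_infos(text):
    """Return (file, start, length) tuples sorted by start position."""
    entries = (
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in LOG_INFO_RE.finditer(text)
    )
    return sorted(entries, key=lambda entry: entry[1])


def find_log(infos, position):
    """Return the entry whose [start, start + length) range holds position."""
    for file, start, length in infos:
        if start <= position < start + length:
            return (file, start, length)
    return None


infos = parse_log_infos(LOG_INFOS)

# Positions taken from the "No reader available" and "Invalid log position" warnings.
for position in (0, 11409726550):
    print(position, "->", find_log(infos, position))

# The EOFException names a file and an offset; compare the offset
# with the length recorded for that file in log_infos.
eof_file = "/data/activemq/broker1-db/000000030702cf12.log"
eof_offset = 110647192
recorded_length = next(length for f, _, length in infos if f == eof_file)
print("offset beyond recorded length:", eof_offset > recorded_length)
```

Under those assumptions, neither reported position is covered by any listed file (11409726550 is below the lowest start position, 11534500540), and the EOFException offset is larger than the length recorded for that file. That would be consistent with the index pointing at log data that is no longer present, but it does not by itself show how that happened.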
--
View this message in context: http://activemq.2283324.n4.nabble.com/ActiveMQ-ReplicatedLevelDB-corruption-tp4716831.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.