activemq-dev mailing list archives

From "Ed Schmed (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMQ-5181) Replicated LevelDB Corruption on Solaris
Date Mon, 12 May 2014 13:40:14 GMT
Ed Schmed created AMQ-5181:
------------------------------

             Summary: Replicated LevelDB Corruption on Solaris
                 Key: AMQ-5181
                 URL: https://issues.apache.org/jira/browse/AMQ-5181
             Project: ActiveMQ
          Issue Type: Bug
          Components: activemq-leveldb-store
    Affects Versions: 5.9.1, 5.10.0
         Environment: Solaris 5.10 on Sparc
            Reporter: Ed Schmed


Steps to recreate:

Set up a 3-node ActiveMQ 5.9.1 cluster using replicated LevelDB.

Start all three instances.

Using the web console, connect to the master and create a queue named test. Also using the
web console, send 100 persistent messages with priority 4 to the queue.

Issue a kill command against the PID of the master broker.

When another broker tries to become master, it throws CorruptionExceptions:

2014-05-12 09:30:22,910 | INFO  | No IOExceptionHandler registered, ignoring IO exception | org.apache.activemq.broker.BrokerService | LevelDB IOException handler.
java.io.IOException: org.iq80.snappy.CorruptionException: Invalid copy offset for opcode starting at 8
        at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:552)
        at org.apache.activemq.leveldb.LevelDBClient.replay_init(LevelDBClient.scala:657)
        at org.apache.activemq.leveldb.LevelDBClient.start(LevelDBClient.scala:558)
        at org.apache.activemq.leveldb.DBManager.start(DBManager.scala:626)
        at org.apache.activemq.leveldb.LevelDBStore.doStart(LevelDBStore.scala:236)
        at org.apache.activemq.leveldb.replicated.MasterLevelDBStore.doStart(MasterLevelDBStore.scala:110)
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
        at org.apache.activemq.leveldb.replicated.ElectingLevelDBStore$$anonfun$start_master$1.apply$mcV$sp(ElectingLevelDBStore.scala:226)
        at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:357)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: org.iq80.snappy.CorruptionException: Invalid copy offset for opcode starting at 8
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2256)
        at com.google.common.cache.LocalCache.get(LocalCache.java:3980)
        at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3984)
        at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4868)
        at org.iq80.leveldb.impl.TableCache.getTable(TableCache.java:80)
        at org.iq80.leveldb.impl.TableCache.newIterator(TableCache.java:69)
        at org.iq80.leveldb.impl.TableCache.newIterator(TableCache.java:64)
        at org.iq80.leveldb.impl.DbImpl.buildTable(DbImpl.java:983)
        at org.iq80.leveldb.impl.DbImpl.writeLevel0Table(DbImpl.java:932)
        at org.iq80.leveldb.impl.DbImpl.recoverLogFile(DbImpl.java:552)
        at org.iq80.leveldb.impl.DbImpl.<init>(DbImpl.java:209)
        at org.iq80.leveldb.impl.Iq80DBFactory.open(Iq80DBFactory.java:59)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_init$2.apply$mcV$sp(LevelDBClient.scala:677)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_init$2.apply(LevelDBClient.scala:657)
        at org.apache.activemq.leveldb.LevelDBClient$$anonfun$replay_init$2.apply(LevelDBClient.scala:657)
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:549)
        ... 11 more
Caused by: org.iq80.snappy.CorruptionException: Invalid copy offset for opcode starting at 8
        at org.iq80.snappy.SnappyDecompressor.decompressAllTags(SnappyDecompressor.java:165)
        at org.iq80.snappy.SnappyDecompressor.uncompress(SnappyDecompressor.java:76)
        at org.iq80.snappy.Snappy.uncompress(Snappy.java:43)
        at org.iq80.leveldb.util.Snappy$IQ80Snappy.uncompress(Snappy.java:100)
        at org.iq80.leveldb.util.Snappy.uncompress(Snappy.java:160)
        at org.iq80.leveldb.table.FileChannelTable.readBlock(FileChannelTable.java:74)
        at org.iq80.leveldb.table.Table.<init>(Table.java:60)
        at org.iq80.leveldb.table.FileChannelTable.<init>(FileChannelTable.java:34)
        at org.iq80.leveldb.impl.TableCache$TableAndFile.<init>(TableCache.java:117)
        at org.iq80.leveldb.impl.TableCache$TableAndFile.<init>(TableCache.java:102)
        at org.iq80.leveldb.impl.TableCache$1.load(TableCache.java:57)
        at org.iq80.leveldb.impl.TableCache$1.load(TableCache.java:54)
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3579)
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2372)
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2335)
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2250)
        ... 26 more

The only way to recover is to delete the data directory on the brokers and start again.

I then downloaded and set up 5.10.0-SNAPSHOT 20140506.233923-67.

However, I hit a different problem with this version. When I start one instance, it waits for
a second instance so that a master can be elected. When I start the second instance, the first
instance throws this error every time:

2014-05-12 09:35:56,929 | INFO  | No IOExceptionHandler registered, ignoring IO exception | org.apache.activemq.broker.BrokerService | LevelDB IOException handler.
java.io.IOException: com.google.common.base.Objects.firstNonNull(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
        at org.apache.activemq.util.IOExceptionSupport.create(IOExceptionSupport.java:39)[activemq-client-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.LevelDBClient.might_fail(LevelDBClient.scala:552)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.LevelDBClient.replay_init(LevelDBClient.scala:657)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.LevelDBClient.start(LevelDBClient.scala:558)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.DBManager.start(DBManager.scala:648)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.LevelDBStore.doStart(LevelDBStore.scala:235)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.replicated.MasterLevelDBStore.doStart(MasterLevelDBStore.scala:110)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)[activemq-client-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.apache.activemq.leveldb.replicated.ElectingLevelDBStore$$anonfun$start_master$1.apply$mcV$sp(ElectingLevelDBStore.scala:226)[activemq-leveldb-store-5.10-SNAPSHOT.jar:5.10-SNAPSHOT]
        at org.fusesource.hawtdispatch.package$$anon$4.run(hawtdispatch.scala:357)[hawtdispatch-scala-1.20.jar:1.20]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)[:1.6.0_26]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)[:1.6.0_26]
        at java.lang.Thread.run(Thread.java:662)[:1.6.0_26]

Maybe this belongs in a separate ticket?
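
The message of the second exception is a bare method descriptor for com.google.common.base.Objects.firstNonNull. That looks like a wrapped NoSuchMethodError, which would suggest that the Guava jar on the broker's classpath does not provide that method. This is an assumption; the report does not confirm it. A minimal sketch of a check that could be run with the broker's classpath follows. The class name MethodProbe and its probe helper are hypothetical and are not part of ActiveMQ.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of ActiveMQ or Guava.
class MethodProbe {

    // Reports whether the named class exposes a public method with the given
    // name and parameter types, and where the class was loaded from.
    static String probe(String className, String methodName, Class<?>... paramTypes) {
        Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            return "class not on classpath";
        }
        // Classes loaded by the bootstrap loader have no code source.
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        String origin = (src == null || src.getLocation() == null)
                ? "bootstrap/unknown"
                : src.getLocation().toString();
        try {
            cls.getMethod(methodName, paramTypes);
            return "found in " + origin;
        } catch (NoSuchMethodException e) {
            return "missing in " + origin;
        }
    }

    public static void main(String[] args) {
        // The method named in the second exception message above.
        System.out.println(probe("com.google.common.base.Objects",
                "firstNonNull", Object.class, Object.class));
    }
}
```

If this prints "missing in" followed by a jar location, that jar would be the one to compare against the Guava version the snapshot was built with.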




--
This message was sent by Atlassian JIRA
(v6.2#6252)