activemq-issues mailing list archives

From "Gabriel Nieves (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AMQ-5899) Unable to recover after going below viable H/A master/slave (Unknown data type
Date Thu, 23 Jul 2015 17:04:04 GMT
Gabriel Nieves created AMQ-5899:
-----------------------------------

             Summary: Unable to recover after going below viable H/A master/slave (Unknown
data type
                 Key: AMQ-5899
                 URL: https://issues.apache.org/jira/browse/AMQ-5899
             Project: ActiveMQ
          Issue Type: Bug
          Components: activemq-leveldb-store
    Affects Versions: 5.10.0
         Environment: 3 CentOS servers running ActiveMQ (5.10.0 or 5.11.0), each located at a
different site. H/A master/slave setup with SSL enabled, using LevelDB and ZooKeeper.

            Reporter: Gabriel Nieves


I have 3 servers running ActiveMQ in high-availability mode. Let's call these servers A, B and
C, and let's say A is the master. If you stop 2 servers, A and B, while you are sending
messages to server A, everything goes down, which makes sense. If you then start A and B
again, you get an "Unknown data type" error and an IOException or a NullPointerException
after a master has been elected and a slave has attached.
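
For context, the "below viable" condition in the scenario above can be sketched as simple
quorum arithmetic. This is only an illustration, assuming the replicated LevelDB store's
documented rule that at least (replicas/2)+1 nodes must be online, and assuming replicas is
set to 3 to match the three servers. The function names are hypothetical and not part of
ActiveMQ.

```python
def quorum_size(replicas: int) -> int:
    """Minimum number of online nodes needed (assumed rule: replicas/2 + 1)."""
    return replicas // 2 + 1


def has_quorum(replicas: int, online: int) -> bool:
    """True if enough nodes are online for a master to be elected."""
    return online >= quorum_size(replicas)


# Three servers A, B, C (assumed replicas=3).
print(has_quorum(3, 3))  # all three up
print(has_quorum(3, 1))  # A and B stopped, only C left: below viable
print(has_quorum(3, 2))  # A and B started again: quorum is restored
```

Under these assumptions, stopping A and B leaves one node, which is below the quorum of two,
and restarting both should restore quorum. The failure reported here occurs after that point.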

I suspect this happens mainly because replication was in progress when the servers were
stopped, thus "corrupting" the LevelDB store. I say "corrupting" because there have been cases
where I started only one server after going below viable and everything worked fine, so this
could instead be a synchronization issue in LevelDB replication.

After I get this Unknown data type error, whose reported value changes every time I reproduce
the issue, the master server restarts. This happens many times, and eventually the ActiveMQ
process dies.

So far, to get these servers up and running again I need to clear the activemq-data folder,
where all the replication logs are located. This is not an acceptable solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)