activemq-dev mailing list archives

From "SuoNayi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-4122) Lease Database Locker failover broken
Date Sat, 23 Feb 2013 11:14:14 GMT

    [ https://issues.apache.org/jira/browse/AMQ-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585090#comment-13585090 ]

SuoNayi commented on AMQ-4122:
------------------------------

[~gtully] st.h uploaded the query log, not me.
In fact, I already knew that lockAcquireSleepInterval should be greater than lockKeepAlivePeriod.
Their names are indeed not intuitive; the new names seem much better.
I had already tested your suggested configuration before you mentioned it, and found two problems:
1. The duplicate brokerService variable in JDBCPersistenceAdapter, which hides the other declaration,
leads to multiple master brokers (solved by AMQ-4108).
2. With problem 1 fixed, the next problem is that a newly started broker always succeeds in
obtaining the lease lock and becomes master, while the previous master unexpectedly loses the
lock and terminates. No broker is able to keep renewing the lease lock continuously.
The cause is that LeaseDatabaseLocker always succeeds in updating the broker name (the owner of
the lease lock) when it supplies a later lease time than that of the current lease owner.
I have tested my patch and it solves these problems.
Can you review and verify it?
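
To illustrate the cause described above, here is a minimal in-memory sketch. It is not ActiveMQ's LeaseDatabaseLocker code and not the attached patch; the class LeaseLockSketch, the methods naiveAcquire and guardedAcquire, and the broker names are all hypothetical. It assumes the faulty behavior amounts to "a later lease time always wins", and contrasts that with a rule where an update only succeeds if the lease is free, has expired, or is already held by the caller.

```java
// Illustrative in-memory model of a single lease lock row.
// NOT ActiveMQ's LeaseDatabaseLocker; all names here are hypothetical.
final class LeaseLockSketch {
    private String owner;      // broker recorded as lease holder (null = none)
    private long expiryMillis; // time at which the recorded lease expires

    // Assumed faulty rule: any contender offering a later expiry overwrites the owner.
    synchronized boolean naiveAcquire(String broker, long nowMillis, long leaseMillis) {
        long proposed = nowMillis + leaseMillis;
        if (owner == null || proposed > expiryMillis) {
            owner = broker;
            expiryMillis = proposed;
            return true;
        }
        return false;
    }

    // Guarded rule: succeed only if the lease is free, expired, or already ours (renewal).
    synchronized boolean guardedAcquire(String broker, long nowMillis, long leaseMillis) {
        boolean free = owner == null || expiryMillis < nowMillis;
        if (free || broker.equals(owner)) {
            owner = broker;
            expiryMillis = nowMillis + leaseMillis;
            return true;
        }
        return false;
    }

    synchronized String owner() {
        return owner;
    }

    public static void main(String[] args) {
        LeaseLockSketch naive = new LeaseLockSketch();
        naive.naiveAcquire("brokerA", 0L, 10_000L);
        // brokerB starts later, so its proposed expiry is later and it takes over.
        System.out.println("naive, brokerB acquires: "
                + naive.naiveAcquire("brokerB", 1_000L, 10_000L));

        LeaseLockSketch guarded = new LeaseLockSketch();
        guarded.guardedAcquire("brokerA", 0L, 10_000L);
        System.out.println("guarded, brokerB acquires: "
                + guarded.guardedAcquire("brokerB", 1_000L, 10_000L));
        System.out.println("guarded, brokerA renews: "
                + guarded.guardedAcquire("brokerA", 5_000L, 10_000L));
        System.out.println("guarded, brokerB after expiry: "
                + guarded.guardedAcquire("brokerB", 20_000L, 10_000L));
    }
}
```

In a real database-backed locker the guard would have to be part of the UPDATE statement's WHERE clause so that the check and the write happen atomically; the synchronized methods only stand in for that here.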
                
> Lease Database Locker failover broken
> -------------------------------------
>
>                 Key: AMQ-4122
>                 URL: https://issues.apache.org/jira/browse/AMQ-4122
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.7.0
>         Environment: Java 7u9, SUSE 11, Mysql
>            Reporter: st.h
>            Assignee: Gary Tully
>             Fix For: 5.8.0
>
>         Attachments: activemq-kyle.xml, activemq.xml, activemq.xml, AMQ4122.patch, mysql.log
>
>
> We are using ActiveMQ 5.7.0 together with a MySQL database and could not observe correct failover behavior with the lease database locker.
> It seems that there is a race condition which prevents the correct failover procedure.
> We noticed that when starting up two instances, both instances become master.
> We did several tests, including the following, and could not observe the intended functionality:
> - shut down all instances
> - manipulate the database lock so that one node holds the lock, and set the expiry time in the distant future
> - start up both instances. Both instances are unable to acquire the lock, as the lock hasn't expired, which should be the correct behavior.
> - update the expiry time in the database, so that the lock is expired.
> - the first instance notices the expired lock and becomes master
> - when the second instance checks for the lock, it also updates the database and becomes master.
> To my understanding, the second instance should not be able to update the lock, as it is held by the first instance, and should not be able to become master.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
