cloudstack-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-8952) The redundant routers are facing a race condition due to several KeepaliveD/ConntrackD restarts
Date Tue, 20 Oct 2015 06:02:28 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-8952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964607#comment-14964607 ]

ASF subversion and git services commented on CLOUDSTACK-8952:
-------------------------------------------------------------

Commit 6fe5ae0d609592c790848aa4249803904deb49cf in cloudstack's branch refs/heads/master from
[~remibergsma]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=6fe5ae0 ]

Merge pull request #940 from ekholabs/fix/rvr__keepalived_restart

CLOUDSTACK-8952 - The redundant routers are facing a race condition due to several KeepaliveD/ConntrackD
restarts

This PR fixes the following issues:

* KeepAliveD being restarted for each action performed on the routers
* ConntrackD configuration being copied for each action performed on the routers, causing
several restarts
* ACS Management Server relying on the JSON file to report which router is Master/Backup
* Public interface on both routers is in UP state due to several places checking if the interface
is UP/DOWN and trying to do KeepAliveD
* Removing all the sleeps from the test_vpc_redundant.py - those are no longer needed
* When KeepAliveD calls master.py during the election, update the cmdline.json to set the
router in Backup mode: the election will take care of changing it afterwards.
* Add LB stats_rules to iptables INPUT chain
* The RVR public interface is set to eth2 instead of eth1 - as in the rVPC. Make sure the
check works in both cases

Those fixes make all the routers very stable, with ACL, FW, PF and LB working just fine!
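
The "do not replace the config file unless it's needed" and "do not restart unless it's needed" items above can be illustrated with a minimal sketch. It is not the CsFile implementation: `write_if_changed`, the file name, and the placeholder content are hypothetical, chosen only to show the idea of writing a file and restarting a service solely when the content actually differs.

```python
import os
import tempfile


def write_if_changed(path, new_content):
    """Write new_content to path only if it differs; return True when written."""
    try:
        with open(path) as handle:
            if handle.read() == new_content:
                return False  # identical content: nothing to do
    except FileNotFoundError:
        pass  # first run: the file does not exist yet
    with open(path, "w") as handle:
        handle.write(new_content)
    return True


restarts = 0
with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical config path and placeholder content, for illustration only.
    conf = os.path.join(workdir, "conntrackd.conf")
    for _ in range(3):  # three router actions pushing the same configuration
        if write_if_changed(conf, "placeholder config\n"):
            restarts += 1  # stand-in for restarting the service
print(restarts)
```

Under these assumptions only the first action triggers a restart; the two later actions find the file unchanged and leave the service alone.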

* pr/940:
  CLOUDSTACK-8952 - Make the checkrouter.sh compatible with RVR as well
  CLOUDSTACK-8952 - Make the tests rely on the interface state other than the json file
  CLOUDSTACK-8952 - Reduce retried from 20 to 5
  CLOUDSTACK-8952 - Do not rely in the router state on the json file to report back to ACS
  CLOUDSTACK-8952 - Make the check for master more reliable
  CLOUDSTACK-8952 - Restart dnsmasq everytime the configure.py runs
  CLOUDSTACK-8952 - Make sure the calls to CsFile use the new logic of commit/is_changed methods
  CLOUDSTACK-8952 - Make sure we restart dnsmasq if the configuration file changes
  CLOUDSTACK-8952 - The public interface was comming UP in the Backup router
  CLOUDSTACK-8952 - Do not restart conntrackd unless it's needed
  CLOUDSTACK-8952 - Do not replace the conntrackd config file unless it's needed
  CLOUDSTACK-8952 - Remove the '--vrrp' search criteria form the CsProcess constructor call

Signed-off-by: Remi Bergsma <github@remi.nl>


> The redundant routers are facing a race condition due to several KeepaliveD/ConntrackD restarts
> -----------------------------------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-8952
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-8952
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Virtual Router
>    Affects Versions: 4.6.0
>            Reporter: Wilder Rodrigues
>            Assignee: Wilder Rodrigues
>            Priority: Blocker
>             Fix For: 4.6.0
>
>
> In the CsRedundant.py we have a line doing:
> proc = CsProcess(['/usr/sbin/keepalived', '--vrrp'])
> However, the CsProcess cannot find a process with the string search "--vrrp", which makes it always return false and restart keepalived.
> Due to the restart, the routers start a race condition to become master, which makes network features unavailable.
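
The failure mode described in the issue can be sketched as follows. This is not the CsProcess code: it assumes a check that requires every search token to appear in a process's command line, and it uses a hypothetical snapshot of command lines in which keepalived runs without a "--vrrp" argument. The helper names `matches` and `is_running` are illustrative.

```python
def matches(search_tokens, cmdline):
    """Return True only if every search token appears in the command line."""
    return all(token in cmdline for token in search_tokens)


def is_running(search_tokens, cmdlines):
    """Return True if any command line in the snapshot matches all tokens."""
    return any(matches(search_tokens, cmdline) for cmdline in cmdlines)


# Hypothetical process snapshot: keepalived has no '--vrrp' in its command line.
snapshot = ["/usr/sbin/keepalived", "/usr/sbin/conntrackd -d"]

old_search = ["/usr/sbin/keepalived", "--vrrp"]  # search used before the fix
new_search = ["/usr/sbin/keepalived"]            # '--vrrp' criterion removed

old_result = is_running(old_search, snapshot)  # no match, so a restart follows
new_result = is_running(new_search, snapshot)  # the running process is found
print(old_result, new_result)
```

With the extra token the lookup never succeeds, so each check concludes that keepalived is down and restarts it; dropping the token lets the check find the running process.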



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
