cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wei ZHOU <ustcweiz...@gmail.com>
Subject Re: Redundant Router Interfaces
Date Fri, 01 Apr 2016 19:14:19 GMT
Dean,

I just fixed it yesterday.

the commit is

---

diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
index 5f63c06..5256d03 100755
--- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
+++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
@@ -27,7 +27,7 @@ from CsRoute import CsRoute
 from CsRule import CsRule

 VRRP_TYPES = ['guest']
-PUBLIC_INTERFACE = ['eth1']
+VPC_PUBLIC_INTERFACE = ['eth1']

 class CsAddress(CsDataBag):

@@ -323,7 +323,7 @@ class CsIP:
                 # If redundant only bring up public interfaces that are
not eth1.
                 # Reason: private gateways are public interfaces.
                 # master.py and keepalived will deal with eth1 public
interface.
-                if self.cl.is_redundant() and (not self.is_public() or
self.getDevice() not in PUBLIC_INTERFACE):
+                if self.cl.is_redundant() and (not self.is_public() or
(self.config.is_vpc() and self.getDevice() not in VPC_PUBLIC_INTERFACE)):
                     CsHelper.execute(cmd2)
                 # if not redundant bring everything up
                 if not self.cl.is_redundant():

---

- Wei

2016-04-01 20:13 GMT+02:00 Dean Close <dean.close@icloudhosting.com>:

> Hi guys,
>
> I had been investigating a possible bug with the way interfaces are
> managed on virtual routers. The public interfaces are being brought up on
> backup routers and (because they boot second) they arp the IPs away from
> the master. I'd been examining an idea for a fix but whilst doing so I
> found that the system appears to be designed to bring up these interfaces.
>
> I suspect that a few things need to be reworked - but the changes
> necessary go so far against what has been implemented that I wanted to open
> this up before doing the work.
>
> Hopefully if I go through my findings you guys can help me see what I
> might be getting wrong.
>
> The following was correct for pre-4.6 redundant routers:
>
>   1. Both routers get configured with IP addresses, routes and iptables
> rules.
>   2. Public interfaces are initially set as DOWN.
>   3. Keepalived runs a VRRP instance on the private interface (eth0) to
> negotiate MASTER/BACKUP roles.
>   4. Keepalived manages the virtual IP on eth0 used as the public gateway
> for the guest VMs.
>   5. Keepalived uses a master notify script to bring up the public
> interfaces.
>
> The above was true for pre-4.6 routers. Now, however, things appear to
> work differently:
>
>   1. Both routers get configured as before.
>   2. All interfaces apart from eth1 (the Hypervisor-link interface) are
> set as UP.
>   3. Keepalived runs a VRRP instance on the first public interface (eth2)
> to negotiate MASTER/BACKUP roles.
>   4. Keepalived manages the virtual IP as before.
>   5. Keepalived uses a master notify script to bring up the public
> interfaces (unnecessary)
>   6. Keepalived uses a backup notify script to bring down the public
> interfaces (unused)
>
> This is unexpected for the following reasons:
>
>   1. The keepalived notify script brings the public interfaces down when
> transitioning to BACKUP - so how can we expect to run a VRRP instance over
> eth2?
>   2. If interfaces are down when transitioning to BACKUP, why are they not
> expected to be down to begin with? (Before the router has become MASTER)
>   3. Why are we running a VRRP instance over an interface with an IP that
> will clash with another host on the network?
>
> The following method from the CsIP class in /opt/cloud/bin/cs/CsAddress.py
> confuses matters futher:
>
>     def check_is_up(self):
>          """ Ensure device is up """
>          cmd = "ip link show %s | grep 'state DOWN'" % self.getDevice()
>          for i in CsHelper.execute(cmd):
>              if " DOWN " in i:
>                  cmd2 = "ip link set %s up" % self.getDevice()
>                  # If redundant only bring up public interfaces that are
> not eth1.
>                  # Reason: private gateways are public interfaces.
>                  # master.py and keepalived will deal with eth1 public
> interface.
>                  if self.cl.is_redundant() and (not self.is_public() or
> self.getDevice() not in PUBLIC_INTERFACE):
>                      CsHelper.execute(cmd2)
>                  # if not redundant bring everything up
>                  if not self.cl.is_redundant():
>                      CsHelper.execute(cmd2)
>
> The comments refer to eth1 as a public interface when this is the link to
> the hypervisor. Indeed, PUBLIC_INTERFACE is defined on line 31 as ['eth1'].
> But keepalived and master.py don't influence eth1 at all. This looks like a
> mistake.
>
> Lastly, the logic of this line looks flawed:
>
>   if self.cl.is_redundant() and (not self.is_public() or self.getDevice()
> not in PUBLIC_INTERFACE)
>
> As PUBLIC_INTERFACE is limited to eth1, the `not self.is_public()` will be
> ignored. Public IPs will never be assigned to eth1, so this line evaluates
> as:
>
>
>   if self.cl.is_redundant() and (self.getDevice() not in PUBLIC_INTERFACE)
>
> which reduces even further to:
>
>   if self.cs.is_redundant() and self.is_control()
>
>
> What would need doing
> ---------------------
>
>   1. The keepalived.conf template would need to be changed to run the VRRP
> instance over eth0.
>   2. The check_is_up method of the CsIP class should be renamed to
> 'bring_up_interfaces'. For redundant routers it should ignore IPs that pass
> is_public or needs_vrrp.
>   3. The arpPing method should do nothing if the interface is down.
>   4. The PUBLIC_INTERFACE constant should be either renamed or dropped
> altogether.
>   5. Other things that I haven't considered?
>
>
> I'd really appreciate any feedback on this. It's possible that I've got it
> all wrong but I'm suspecting not. I just don't want to tread on anyone's
> toes by submitting a PR that goes against what appears to be an explicit
> design decision.
>
>
> Kind regards,
>
> Dean Close
> iCloudHosting.com
> http://www.icloudhosting.com
> Tel: 01582 227927
>
> Unit 2, Smallmead Road, Reading RG2 0QS
>
> ******************************************************************
> The names iCloudHosting and iCloudHosting.com are trading styles of BBS
> Commerce Ltd which is registered in England and Wales, Company Number
> 04837714. Please use our trading address above for mail. Our registered
> office is 5 Theale Lakes Business Park, Moulden Way, Sulhamstead, Reading,
> Berkshire, RG7 4GB. VAT Registration Number GB 982 8230 94.
>
> This email and any files transmitted with it are confidential and intended
> solely for the use of the individual or entity to whom they are addressed.
> If you are not the intended recipient you are not authorised to and must
> not disclose, copy, distribute, or retain this message or any part of it.
>
> iCloudHosting accepts no responsibility for information, errors or
> omissions in this email.
> ******************************************************************
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message