cloudstack-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dean Close <dean.cl...@icloudhosting.com>
Subject Re: Redundant Router Interfaces
Date Fri, 01 Apr 2016 22:22:04 GMT
Hi Wei,

It looks like your patch will prevent redundant routers from bringing up public interfaces.
We also need to:

1. shift the VRRP instance over to eth0 as eth2 will be down on the backup.
2. Ensure that eth0 and eth1 are up on both redundant virtual routers.

Your patch might already solve point 2. What do you think?

Point 1 will require modifying the keepalived.conf template. Do you want to handle this in
your PR or shall I arrange that?

Kind Regards,

Dean Close
iCloudHosting.com
http://www.icloudhosting.com
Tel: 01582 227927

Spectrum House, Dunstable Road, Redbourn, AL3 7PR


******************************************************************

The names iCloudHosting and iCloudHosting.com are trading styles of BBS Commerce Ltd which
is registered in England and Wales, Company Number 04837714. Please use our trading address
above for mail. Our registered office is 5 Theale Lakes Business Park, Moulden Way, Sulhamstead,
Reading, Berkshire, RG7 4GB. VAT Registration Number GB 982 8230 94.

This email and any files transmitted with it are confidential and intended solely for the
use of the individual or entity to whom they are addressed. If you are not the intended recipient
you are not authorised to and must not disclose, copy, distribute, or retain this message
or any part of it.

iCloudHosting accepts no responsibility for information, errors or omissions in this email.

******************************************************************




On Fri, Apr 1, 2016 at 8:34 pm, Wei ZHOU <ustcweizhou@gmail.com> wrote:
actually not only this issue, but more
(1) restart vpc router will break the network if there are multiple tiers
in the vpc.
(2) check_heartbeat.sh does not work.

I will create tickets for them next Monday, and create PR on github.

-Wei

2016-04-01 21:14 GMT+02:00 Wei ZHOU <ustcweizhou@gmail.com>:

> Dean,
>
> I just fixed it yesterday.
>
> the commit is
>
> ---
>
> diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
> b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
> index 5f63c06..5256d03 100755
> --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
> +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsAddress.py
> @@ -27,7 +27,7 @@ from CsRoute import CsRoute
> from CsRule import CsRule
>
> VRRP_TYPES = ['guest']
> -PUBLIC_INTERFACE = ['eth1']
> +VPC_PUBLIC_INTERFACE = ['eth1']
>
> class CsAddress(CsDataBag):
>
> @@ -323,7 +323,7 @@ class CsIP:
> # If redundant only bring up public interfaces that are
> not eth1.
> # Reason: private gateways are public interfaces.
> # master.py and keepalived will deal with eth1 public
> interface.
> - if self.cl.is_redundant() and (not self.is_public() or
> self.getDevice() not in PUBLIC_INTERFACE):
> + if self.cl.is_redundant() and (not self.is_public() or
> (self.config.is_vpc() and self.getDevice() not in VPC_PUBLIC_INTERFACE)):
> CsHelper.execute(cmd2)
> # if not redundant bring everything up
> if not self.cl.is_redundant():
>
> ---
>
> - Wei
>
> 2016-04-01 20:13 GMT+02:00 Dean Close <dean.close@icloudhosting.com>:
>
>> Hi guys,
>>
>> I had been investigating a possible bug with the way interfaces are
>> managed on virtual routers. The public interfaces are being brought up on
>> backup routers and (because they boot second) they arp the IPs away from
>> the master. I'd been examining an idea for a fix but whilst doing so I
>> found that the system appears to be designed to bring up these interfaces.
>>
>> I suspect that a few things need to be reworked - but the changes
>> necessary go so far against what has been implemented that I wanted to open
>> this up before doing the work.
>>
>> Hopefully if I go through my findings you guys can help me see what I
>> might be getting wrong.
>>
>> The following was correct for pre-4.6 redundant routers:
>>
>> 1. Both routers get configured with IP addresses, routes and iptables
>> rules.
>> 2. Public interfaces are initially set as DOWN.
>> 3. Keepalived runs a VRRP instance on the private interface (eth0) to
>> negotiate MASTER/BACKUP roles.
>> 4. Keepalived manages the virtual IP on eth0 used as the public gateway
>> for the guest VMs.
>> 5. Keepalived uses a master notify script to bring up the public
>> interfaces.
>>
>> The above was true for pre-4.6 routers. Now, however, things appear to
>> work differently:
>>
>> 1. Both routers get configured as before.
>> 2. All interfaces apart from eth1 (the Hypervisor-link interface) are
>> set as UP.
>> 3. Keepalived runs a VRRP instance on the first public interface (eth2)
>> to negotiate MASTER/BACKUP roles.
>> 4. Keepalived manages the virtual IP as before.
>> 5. Keepalived uses a master notify script to bring up the public
>> interfaces (unnecessary)
>> 6. Keepalived uses a backup notify script to bring down the public
>> interfaces (unused)
>>
>> This is unexpected for the following reasons:
>>
>> 1. The keepalived notify script brings the public interfaces down when
>> transitioning to BACKUP - so how can we expect to run a VRRP instance over
>> eth2?
>> 2. If interfaces are down when transitioning to BACKUP, why are they
>> not expected to be down to begin with? (Before the router has become MASTER)
>> 3. Why are we running a VRRP instance over an interface with an IP that
>> will clash with another host on the network?
>>
>> The following method from the CsIP class in
>> /opt/cloud/bin/cs/CsAddress.py confuses matters futher:
>>
>> def check_is_up(self):
>> """ Ensure device is up """
>> cmd = "ip link show %s | grep 'state DOWN'" % self.getDevice()
>> for i in CsHelper.execute(cmd):
>> if " DOWN " in i:
>> cmd2 = "ip link set %s up" % self.getDevice()
>> # If redundant only bring up public interfaces that are
>> not eth1.
>> # Reason: private gateways are public interfaces.
>> # master.py and keepalived will deal with eth1 public
>> interface.
>> if self.cl.is_redundant() and (not self.is_public() or
>> self.getDevice() not in PUBLIC_INTERFACE):
>> CsHelper.execute(cmd2)
>> # if not redundant bring everything up
>> if not self.cl.is_redundant():
>> CsHelper.execute(cmd2)
>>
>> The comments refer to eth1 as a public interface when this is the link to
>> the hypervisor. Indeed, PUBLIC_INTERFACE is defined on line 31 as ['eth1'].
>> But keepalived and master.py don't influence eth1 at all. This looks like a
>> mistake.
>>
>> Lastly, the logic of this line looks flawed:
>>
>> if self.cl.is_redundant() and (not self.is_public() or self.getDevice()
>> not in PUBLIC_INTERFACE)
>>
>> As PUBLIC_INTERFACE is limited to eth1, the `not self.is_public()` will
>> be ignored. Public IPs will never be assigned to eth1, so this line
>> evaluates as:
>>
>>
>> if self.cl.is_redundant() and (self.getDevice() not in PUBLIC_INTERFACE)
>>
>> which reduces even further to:
>>
>> if self.cs.is_redundant() and self.is_control()
>>
>>
>> What would need doing
>> ---------------------
>>
>> 1. The keepalived.conf template would need to be changed to run the
>> VRRP instance over eth0.
>> 2. The check_is_up method of the CsIP class should be renamed to
>> 'bring_up_interfaces'. For redundant routers it should ignore IPs that pass
>> is_public or needs_vrrp.
>> 3. The arpPing method should do nothing if the interface is down.
>> 4. The PUBLIC_INTERFACE constant should be either renamed or dropped
>> altogether.
>> 5. Other things that I haven't considered?
>>
>>
>> I'd really appreciate any feedback on this. It's possible that I've got
>> it all wrong but I'm suspecting not. I just don't want to tread on anyone's
>> toes by submitting a PR that goes against what appears to be an explicit
>> design decision.
>>
>>
>> Kind regards,
>>
>> Dean Close
>> iCloudHosting.com
>> http://www.icloudhosting.com
>> Tel: 01582 227927
>>
>> Unit 2, Smallmead Road, Reading RG2 0QS
>>
>> ******************************************************************
>> The names iCloudHosting and iCloudHosting.com are trading styles of BBS
>> Commerce Ltd which is registered in England and Wales, Company Number
>> 04837714. Please use our trading address above for mail. Our registered
>> office is 5 Theale Lakes Business Park, Moulden Way, Sulhamstead, Reading,
>> Berkshire, RG7 4GB. VAT Registration Number GB 982 8230 94.
>>
>> This email and any files transmitted with it are confidential and
>> intended solely for the use of the individual or entity to whom they are
>> addressed. If you are not the intended recipient you are not authorised to
>> and must not disclose, copy, distribute, or retain this message or any part
>> of it.
>>
>> iCloudHosting accepts no responsibility for information, errors or
>> omissions in this email.
>> ******************************************************************
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message