cloudstack-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CLOUDSTACK-9154) rVPC doesn't recover from cleaning up of network garbage collector
Date Mon, 18 Jan 2016 11:14:41 GMT

    [ https://issues.apache.org/jira/browse/CLOUDSTACK-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105137#comment-15105137 ]

ASF subversion and git services commented on CLOUDSTACK-9154:
-------------------------------------------------------------

Commit ff89587fd119b1cad543d8e96f0c428e41c35840 in cloudstack's branch refs/heads/master from
[~remibergsma]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ff89587 ]

Merge pull request #1277 from ekholabs/fix/4.7-rvpc-net-gc-CLOUDSTACK-9154

[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188

This PR applies the same fixes as PR #1259, but against branch 4.7.

Please refer to PR #1259 for the test results and all the comments already made there.

Issues fixed are:

* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalived.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings

These changes are covered by two new tests added to ```smoke/test_vpc_redundant.py```:

* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers

The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings
network.gc.interval and network.gc.wait. To make the test run quicker, change
the settings (the default is 600 seconds for each) and restart the Management Server before running
the tests. I would suggest setting them to 60 seconds.

In addition, the NetworkGarbageCollector was redefining the settings mentioned above instead of
reading their values through ConfigDao. As a result, the settings were not being applied properly
and the test was waiting too long to check the VPC routers.
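
As a rough illustration of why lowering the two settings shortens the test, the sketch below estimates how long a test would have to wait before checking the routers. The function name `seconds_until_gc_has_run`, the `margin` parameter, and the assumption that a network is collected within one GC interval after the wait period has elapsed are all hypothetical and are not taken from the CloudStack code or from the test itself.

```python
def seconds_until_gc_has_run(gc_interval: int, gc_wait: int, margin: int = 30) -> int:
    """Estimate how long to wait before checking the routers.

    Assumption (not confirmed by the source): a network with no running VMs
    becomes eligible after gc_wait seconds and is collected on a GC run that
    happens at most gc_interval seconds later. margin is an arbitrary buffer.
    """
    if gc_interval <= 0 or gc_wait <= 0:
        raise ValueError("GC settings must be positive numbers of seconds")
    return gc_interval + gc_wait + margin


# Defaults mentioned in the message (600 seconds each) versus the suggested 60.
default_wait = seconds_until_gc_has_run(600, 600)
suggested_wait = seconds_until_gc_has_run(60, 60)
print(default_wait, suggested_wait)
```

Under these assumptions the wait drops from roughly twenty minutes to a few minutes, which is the motivation for the 60 second suggestion.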

* pr/1277:
  CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
  CLOUDSTACK-9187 - Makes code ready for something like ethXXXX, if we ever get that far
  CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
  CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
  CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
  CLOUDSTACK-9154 - Adds test to cover nics state after GC
  CLOUDSTACK-9154 - Returns the guest interface that is marked as added

Signed-off-by: Remi Bergsma <github@remi.nl>


> rVPC doesn't recover from cleaning up of network garbage collector
> ------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9154
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9154
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.) 
>          Components: Virtual Router
>    Affects Versions: 4.6.0, 4.7.0, 4.6.1, 4.6.2
>         Environment: ACS 4.7
>            Reporter: Remi Bergsma
>            Assignee: Wilder Rodrigues
>            Priority: Critical
>             Fix For: 4.7.1
>
>
> - deploy a rVPC
> - deploy VM in it
> - make port forwarding (2nd ip, firewall and such)
> - confirm it works
> - stop the vm
> - after some time the network garbage collector will come and tear down the network since there are no more VMs
> - keepalived will enter FAULT state because of the missing eth2 nic (which was the first network tier)
> - all that is left is eth0 (link local) and lo0
> - then start the vm again
> - the nics get plugged again and keepalived will decide on a new master
> - the nics are screwed up after this:
> ```
> root@r-1021-VM:~# ip a
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 0e:00:a9:fe:02:92 brd ff:ff:ff:ff:ff:ff
>     inet 169.254.2.146/16 brd 169.254.255.255 scope global eth0
> 5: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 02:00:18:34:00:05 brd ff:ff:ff:ff:ff:ff
>     inet x.y.238.24/24 brd x.y.238.255 scope global eth1
>     inet 10.0.0.51/24 brd 10.0.0.255 scope global eth1
>     inet 10.0.0.1/24 brd 10.0.0.255 scope global secondary eth1
> 6: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 06:d5:4e:00:00:1d brd ff:ff:ff:ff:ff:ff
>     inet x.y.238.25/24 brd x.y.238.255 scope global eth2
>     inet 10.0.0.1/24 brd 10.0.0.255 scope global eth2
> root@r-1021-VM:~#
> ```
> Public and tier ip addresses are mixed up.
> /etc/cloudstack/ips.json has the wrong info:
> ```
> {
>     "eth0": [
>         {
>             "add": true,
>             "broadcast": "169.254.255.255",
>             "cidr": "169.254.2.146/16",
>             "device": "eth0",
>             "gateway": "None",
>             "netmask": "255.255.0.0",
>             "network": "169.254.0.0/16",
>             "nic_dev_id": "0",
>             "nw_type": "control",
>             "one_to_one_nat": false,
>             "public_ip": "169.254.2.146",
>             "size": "16",
>             "source_nat": false
>         }
>     ],
>     "eth1": [
>         {
>             "add": true,
>             "broadcast": "x.y.238.255",
>             "cidr": "x.y.238.24/24",
>             "device": "eth1",
>             "first_i_p": true,
>             "gateway": "x.y.238.1",
>             "netmask": "255.255.255.0",
>             "network": "x.y.238.0/24",
>             "new_nic": false,
>             "nic_dev_id": 1,
>             "nw_type": "public",
>             "one_to_one_nat": false,
>             "public_ip": "x.y.238.24",
>             "size": "24",
>             "source_nat": true,
>             "vif_mac_address": "06:fc:da:00:00:1c"
>         },
>         {
>             "add": true,
>             "broadcast": "10.0.0.255",
>             "cidr": "10.0.0.51/24",
>             "device": "eth1",
>             "gateway": "10.0.0.1",
>             "netmask": "255.255.255.0",
>             "network": "10.0.0.0/24",
>             "nic_dev_id": "1",
>             "nw_type": "guest",
>             "one_to_one_nat": false,
>             "public_ip": "10.0.0.51",
>             "size": "24",
>             "source_nat": false
>         }
>     ],
>     "eth2": [
>         {
>             "add": false,
>             "broadcast": "10.0.0.255",
>             "cidr": "10.0.0.173/24",
>             "device": "eth2",
>             "gateway": "10.0.0.1",
>             "netmask": "255.255.255.0",
>             "network": "10.0.0.0/24",
>             "nic_dev_id": "2",
>             "nw_type": "guest",
>             "one_to_one_nat": false,
>             "public_ip": "10.0.0.173",
>             "size": "24",
>             "source_nat": false
>         },
>         {
>             "add": true,
>             "broadcast": "x.y.238.255",
>             "cidr": "x.y.238.25/24",
>             "device": "eth2",
>             "first_i_p": true,
>             "gateway": "x.y.238.1",
>             "netmask": "255.255.255.0",
>             "network": "x.y.238.0/24",
>             "new_nic": false,
>             "nic_dev_id": 2,
>             "nw_type": "public",
>             "one_to_one_nat": false,
>             "public_ip": "x.y.238.25",
>             "size": "24",
>             "source_nat": true,
>             "vif_mac_address": "06:d5:4e:00:00:1d"
>         }
>     ],
>     "id": "ips"
> ```
> Pinging [~wilder.rodrigues]
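
The mix-up visible in the ips.json dump above can be spotted mechanically. The following is a minimal sketch, not part of the fix: it flags any device whose active entries (those with "add": true) span more than one nw_type. The helper name `mixed_devices` is hypothetical, and the sample data is a trimmed copy of the dump from the issue that keeps only the fields the check needs.

```python
import json
from typing import Dict, List


def mixed_devices(ips: Dict[str, object]) -> Dict[str, List[str]]:
    """Return devices whose active entries carry more than one nw_type."""
    mixed = {}
    for device, entries in ips.items():
        if not isinstance(entries, list):
            continue  # skips the "id": "ips" marker
        # Only entries marked as added are considered active on the device.
        types = sorted({e["nw_type"] for e in entries if e.get("add")})
        if len(types) > 1:
            mixed[device] = types
    return mixed


# Trimmed copy of the ips.json content quoted in the issue.
sample = json.loads("""
{
    "eth0": [{"add": true, "nw_type": "control", "public_ip": "169.254.2.146"}],
    "eth1": [{"add": true, "nw_type": "public", "public_ip": "x.y.238.24"},
             {"add": true, "nw_type": "guest", "public_ip": "10.0.0.51"}],
    "eth2": [{"add": false, "nw_type": "guest", "public_ip": "10.0.0.173"},
             {"add": true, "nw_type": "public", "public_ip": "x.y.238.25"}],
    "id": "ips"
}
""")

print(mixed_devices(sample))
```

With the trimmed data, only eth1 is reported, because it carries both an active public entry and an active guest entry; eth2 has a guest entry too, but that one is marked "add": false.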



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
