Date: Sun, 17 Jan 2016 18:14:41 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: cloudstack-issues@incubator.apache.org
Reply-To: dev@cloudstack.apache.org
Subject: [jira] [Commented] (CLOUDSTACK-9154) rVPC doesn't recover from cleaning up of network garbage collector

[ https://issues.apache.org/jira/browse/CLOUDSTACK-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103830#comment-15103830 ]

ASF subversion and git services commented on CLOUDSTACK-9154:
-------------------------------------------------------------

Commit
ff89587fd119b1cad543d8e96f0c428e41c35840 in cloudstack's branch refs/heads/4.7 from [~remibergsma]
[ https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;h=ff89587 ]

Merge pull request #1277 from ekholabs/fix/4.7-rvpc-net-gc-CLOUDSTACK-9154

[4.7] Critical VPCVR issues fixed: CLOUDSTACK-9154; CLOUDSTACK-9187; and CLOUDSTACK-9188

This PR applies the same fixes as PR #1259, but against branch 4.7. Please refer to PR #1259 for the test results and all the comments already made there.

Issues fixed are:

* CLOUDSTACK-9154: rVPC doesn't recover from cleaning up of network garbage collector
* CLOUDSTACK-9187: rVPC routers in Master/Master due to concurrency problem when writing the keepalived.conf
* CLOUDSTACK-9188: NetworkGarbageCollector is not using gc.interval and gc.wait from settings

Those changes are covered by 2 new tests added to ```smoke/test_vpc_redundant.py```:

* test_04_rvpc_network_garbage_collector_nics
* test_05_rvpc_multi_tiers

The test ```test_04_rvpc_network_garbage_collector_nics``` depends on the global settings network.gc.interval and network.gc.wait. To make the test run quicker, change the settings (the default is 600 seconds for each) and restart the Management Server before running the tests. I suggest setting both to 60 seconds.

In addition, the NetworkGarbageCollector was redefining the settings mentioned above instead of reading their values through ConfigDao. As a result, the settings were not being applied properly and the test was waiting too long to check the VPC routers.

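The configuration problem described above (a component that redefines a setting locally, so the operator's value is never read) can be illustrated with a small sketch. This is not CloudStack code: the class names, the plain dict standing in for the settings store, and `DEFAULT_GC_SECONDS` are all hypothetical. Only the setting names and the 600-second default come from the description.

```python
# Hypothetical illustration only; none of these names exist in CloudStack.
DEFAULT_GC_SECONDS = 600  # default for both settings, per the PR description


class HardcodedCollector:
    """Mimics the bug: local values are used and the settings store is ignored."""

    def __init__(self, settings):
        self.interval = DEFAULT_GC_SECONDS
        self.wait = DEFAULT_GC_SECONDS


class ConfigBackedCollector:
    """Mimics the fix: values are read from the store; the default is a fallback."""

    def __init__(self, settings):
        self.interval = int(settings.get("network.gc.interval", DEFAULT_GC_SECONDS))
        self.wait = int(settings.get("network.gc.wait", DEFAULT_GC_SECONDS))


# A dict stands in for the settings store after an operator lowered both values.
settings = {"network.gc.interval": "60", "network.gc.wait": "60"}

broken = HardcodedCollector(settings)
fixed = ConfigBackedCollector(settings)

print("hardcoded:", broken.interval, broken.wait)
print("config-backed:", fixed.interval, fixed.wait)
```

With the hardcoded variant, lowering the settings to 60 seconds has no effect, which matches the symptom of the test waiting far longer than expected.
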
* pr/1277:
  CLOUDSTACK-9154 - Sets the pub interface down when all guest nets are gone
  CLOUDSTACK-9187 - Makes code ready for something like ethXXXX, if we ever get that far
  CLOUDSTACK-9188 - Reads network GC interval and wait from configDao
  CLOUDSTACK-9187 - Fixes interface allocation to VRRP instances
  CLOUDSTACK-9187 - Adds test to cover multiple nics and nic removal
  CLOUDSTACK-9154 - Adds test to cover nics state after GC
  CLOUDSTACK-9154 - Returns the guest interface that is marked as added

Signed-off-by: Remi Bergsma

> rVPC doesn't recover from cleaning up of network garbage collector
> ------------------------------------------------------------------
>
>                 Key: CLOUDSTACK-9154
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-9154
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public(Anyone can view this level - this is the default.)
>          Components: Virtual Router
>    Affects Versions: 4.6.0, 4.7.0, 4.6.1, 4.6.2
>         Environment: ACS 4.7
>            Reporter: Remi Bergsma
>            Assignee: Wilder Rodrigues
>            Priority: Critical
>             Fix For: 4.7.1
>
>
> - deploy a rVPC
> - deploy a VM in it
> - set up port forwarding (2nd ip, firewall and such)
> - confirm it works
> - stop the VM
> - after some time the network garbage collector will come and tear down the network, since there are no more VMs
> - keepalived will enter FAULT state because of the missing eth2 nic (which was the first network tier)
> - all that is left is eth0 (link local) and lo0
> - then start the VM again
> - the nics get plugged again and keepalived will decide on a new master
> - the nics are screwed up after this:
> ```
> root@r-1021-VM:~# ip a
> 1: lo: mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
> 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 0e:00:a9:fe:02:92 brd ff:ff:ff:ff:ff:ff
>     inet 169.254.2.146/16 brd 169.254.255.255 scope global eth0
> 5: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 
02:00:18:34:00:05 brd ff:ff:ff:ff:ff:ff
>     inet x.y.238.24/24 brd x.y.238.255 scope global eth1
>     inet 10.0.0.51/24 brd 10.0.0.255 scope global eth1
>     inet 10.0.0.1/24 brd 10.0.0.255 scope global secondary eth1
> 6: eth2: mtu 1500 qdisc pfifo_fast state UP qlen 1000
>     link/ether 06:d5:4e:00:00:1d brd ff:ff:ff:ff:ff:ff
>     inet x.y.238.25/24 brd x.y.238.255 scope global eth2
>     inet 10.0.0.1/24 brd 10.0.0.255 scope global eth2
> root@r-1021-VM:~#
> ```
> Public and tier ip addresses are mixed up.
> /etc/cloudstack/ips.json has the wrong info:
> ```
> {
>   "eth0": [
>     {
>       "add": true,
>       "broadcast": "169.254.255.255",
>       "cidr": "169.254.2.146/16",
>       "device": "eth0",
>       "gateway": "None",
>       "netmask": "255.255.0.0",
>       "network": "169.254.0.0/16",
>       "nic_dev_id": "0",
>       "nw_type": "control",
>       "one_to_one_nat": false,
>       "public_ip": "169.254.2.146",
>       "size": "16",
>       "source_nat": false
>     }
>   ],
>   "eth1": [
>     {
>       "add": true,
>       "broadcast": "x.y.238.255",
>       "cidr": "x.y.238.24/24",
>       "device": "eth1",
>       "first_i_p": true,
>       "gateway": "x.y.238.1",
>       "netmask": "255.255.255.0",
>       "network": "x.y.238.0/24",
>       "new_nic": false,
>       "nic_dev_id": 1,
>       "nw_type": "public",
>       "one_to_one_nat": false,
>       "public_ip": "x.y.238.24",
>       "size": "24",
>       "source_nat": true,
>       "vif_mac_address": "06:fc:da:00:00:1c"
>     },
>     {
>       "add": true,
>       "broadcast": "10.0.0.255",
>       "cidr": "10.0.0.51/24",
>       "device": "eth1",
>       "gateway": "10.0.0.1",
>       "netmask": "255.255.255.0",
>       "network": "10.0.0.0/24",
>       "nic_dev_id": "1",
>       "nw_type": "guest",
>       "one_to_one_nat": false,
>       "public_ip": "10.0.0.51",
>       "size": "24",
>       "source_nat": false
>     }
>   ],
>   "eth2": [
>     {
>       "add": false,
>       "broadcast": "10.0.0.255",
>       "cidr": "10.0.0.173/24",
>       "device": "eth2",
>       "gateway": "10.0.0.1",
>       "netmask": "255.255.255.0",
>       "network": "10.0.0.0/24",
>       "nic_dev_id": "2",
>       "nw_type": "guest",
>       "one_to_one_nat": false,
>       "public_ip": "10.0.0.173",
>       "size": "24",
>       "source_nat": false
>     },
>     {
>       "add": true,
>       "broadcast": "x.y.238.255",
>       "cidr": "x.y.238.25/24",
>       "device": "eth2",
>       "first_i_p": true,
>       "gateway": "x.y.238.1",
>       "netmask": "255.255.255.0",
>       "network": "x.y.238.0/24",
>       "new_nic": false,
>       "nic_dev_id": 2,
>       "nw_type": "public",
>       "one_to_one_nat": false,
>       "public_ip": "x.y.238.25",
>       "size": "24",
>       "source_nat": true,
>       "vif_mac_address": "06:d5:4e:00:00:1d"
>     }
>   ],
>   "id": "ips"
> }
> ```
> Pinging [~wilder.rodrigues]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
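
The mix-up reported in /etc/cloudstack/ips.json can be spotted mechanically. The following is a minimal sketch, assuming that on a healthy router each device carries entries of a single `nw_type`; that rule is an assumption drawn from the report's remark that public and tier addresses are mixed up. The function name `mixed_devices` is hypothetical, and the sample data is the reported file reduced to the two fields the check reads.

```python
def mixed_devices(ips):
    """Return {device: sorted nw_types} for devices carrying more than one nw_type."""
    mixed = {}
    for device, entries in ips.items():
        if not isinstance(entries, list):
            continue  # skips non-device keys such as "id": "ips"
        types = sorted({entry["nw_type"] for entry in entries})
        if len(types) > 1:
            mixed[device] = types
    return mixed


# The reported ips.json, reduced to "nw_type" and "add" for each entry.
sample = {
    "eth0": [{"nw_type": "control", "add": True}],
    "eth1": [
        {"nw_type": "public", "add": True},
        {"nw_type": "guest", "add": True},
    ],
    "eth2": [
        {"nw_type": "guest", "add": False},
        {"nw_type": "public", "add": True},
    ],
    "id": "ips",
}

print(mixed_devices(sample))
```

To run the same check against a real router, load the file with `json.load` and pass the resulting dict to the function.
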