From: Andrija Panic
Date: Mon, 9 Jul 2018 23:39:06 +0200
Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
To: dev@cloudstack.apache.org

Andrei,

If I'm not mistaken, I saw the same behaviour even on 4.8. In our case, what I vaguely remember is that we configured Port Forwarding instead of Static NAT. That solved our use case (for one customer), but maybe it's not acceptable for you...

Cheers

On Mon, 9 Jul 2018 at 18:27, Andrei Mikhailovsky wrote:

> Hi Rohit,
>
> I would like to send you a quick update on this issue. I have recently
> upgraded to 4.11.1.0 with the new system vm templates. The issue that
> I've described is still present in the latest release. Hasn't the fix
> been included in the latest 4.11 maintenance release? I thought it would
> be, as this breaks a major function of the VPC.
>
> Cheers.
>
> Andrei
>
> ----- Original Message -----
> > From: "Andrei Mikhailovsky"
> > To: "dev"
> > Sent: Friday, 20 April, 2018 11:52:30
> > Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>
> > Thanks
> >
> > ----- Original Message -----
> >> From: "Rohit Yadav"
> >> To: "dev"
> >> Sent: Friday, 20 April, 2018 10:35:55
> >> Subject: Re: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
>
> >> Hi Andrei,
> >>
> >> I've fixed this recently, please see
> >> https://github.com/apache/cloudstack/pull/2579
> >>
> >> As a workaround you can add the routing rules manually. On the PR there
> >> is a link to a comment that explains the issue and suggests a manual
> >> workaround. Let me know if that works for you.
> >>
> >> Regards.
> >>
> >> From: Andrei Mikhailovsky
> >> Sent: Friday, 20 April, 2:21 PM
> >> Subject: Upgrade from ACS 4.9.X to 4.11.0 broke VPC source NAT
> >> To: dev
> >>
> >> Hello,
> >>
> >> I have been posting to the users thread about this issue. Here is a
> >> quick summary, in case the people contributing to the source NAT code
> >> on the VPC side would like to fix it.
> >>
> >> Problem summary: no connectivity between virtual machines behind two
> >> Static NAT networks.
> >>
> >> Problem case: when one virtual machine sends a packet to the external
> >> address of another virtual machine, and both are handled by the same
> >> router and are behind Static NAT, the traffic does not flow:
> >>
> >>   10.1.10.100    10.1.10.1:eth2    eth3:10.1.20.1    10.1.20.100
> >>   virt1          router                              virt2
> >>                  178.248.108.77:eth1:178.248.108.113
> >>
> >> A single packet is sent from virt1 to virt2.
> >>
> >> Stage 1: it arrives at the router on eth2 and enters nat_PREROUTING
> >> (IN=eth2 OUT= SRC=10.1.10.100 DST=178.248.108.113), goes through the
> >>
> >>   10 1K DNAT all -- * * 0.0.0.0/0 178.248.108.113 to:10.1.20.100
> >>
> >> rule, and has the DST DNATed to the internal IP of virt2.
> >>
> >> Stage 2: it enters the FORWARD chain and is dropped by the default
> >> policy:
> >>
> >>   DROPPED: IN=eth2 OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100
> >>
> >> The reason is that the OUT interface is not correctly changed from
> >> eth1 to eth3 during nat_PREROUTING, so the packet is not intercepted
> >> by the FORWARD rule
> >>
> >>   24 14K ACL_INBOUND_eth3 all -- * eth3 0.0.0.0/0 10.1.20.0/24
> >>
> >> and is therefore not accepted.
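> >> (The accept rule used in the next stage isn't shown; a minimal sketch
> >> of the kind of rule that can be inserted, assuming the tier subnets
> >> from the diagram above, would be something like:
> >>
> >>   iptables -I FORWARD -s 10.1.10.0/24 -d 10.1.20.0/24 -j ACCEPT
> >>
> >> i.e. accept forwarded traffic between the two tiers by address alone,
> >> regardless of the miscomputed OUT interface.)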
> >> Stage 3: I manually inserted a rule to accept this packet for
> >> forwarding. The packet then enters the nat_POSTROUTING chain
> >> (IN= OUT=eth1 SRC=10.1.10.100 DST=10.1.20.100), has its SRC changed to
> >> the external IP by
> >>
> >>   16 1320 SNAT all -- * eth1 10.1.10.100 0.0.0.0/0 to:178.248.108.77
> >>
> >> and is sent out to the external network on eth1:
> >>
> >>   13:37:44.834341 IP 178.248.108.77 > 10.1.20.100: ICMP echo request,
> >>   id 2644, seq 2, length 64
> >>
> >> For some reason, during the nat_PREROUTING stage the DST IP is changed,
> >> but the OUT interface still reflects the interface associated with the
> >> old DST IP.
> >>
> >> Here is the routing table:
> >>
> >>   # ip route list
> >>   default via 178.248.108.1 dev eth1
> >>   10.1.10.0/24 dev eth2 proto kernel scope link src 10.1.10.1
> >>   10.1.20.0/24 dev eth3 proto kernel scope link src 10.1.20.1
> >>   169.254.0.0/16 dev eth0 proto kernel scope link src 169.254.0.5
> >>   178.248.108.0/25 dev eth1 proto kernel scope link src 178.248.108.101
> >>
> >>   # ip rule list
> >>   0:     from all lookup local
> >>   32761: from all fwmark 0x3 lookup Table_eth3
> >>   32762: from all fwmark 0x2 lookup Table_eth2
> >>   32763: from all fwmark 0x1 lookup Table_eth1
> >>   32764: from 10.1.0.0/16 lookup static_route_back
> >>   32765: from 10.1.0.0/16 lookup static_route
> >>   32766: from all lookup main
> >>   32767: from all lookup default
> >>
> >> Digging further, the problem was pinned down to these rules. All
> >> traffic from the internal IPs of the static-NATed connections is
> >> forced out of the outside interface (eth1) by setting mark 0x1 in the
> >> mangle table and then using the matching ip rule to direct it:
> >>
> >>   # iptables -t mangle -L PREROUTING -vn
> >>   Chain PREROUTING (policy ACCEPT 97 packets, 11395 bytes)
> >>   pkts bytes target   prot opt in out source      destination
> >>     49  3644 CONNMARK all  --  *  *   10.1.10.100 0.0.0.0/0  state NEW CONNMARK save
> >>     37  2720 MARK     all  --  *  *   10.1.20.100 0.0.0.0/0  state NEW MARK set 0x1
> >>     37  2720 CONNMARK all  --  *  *   10.1.20.100 0.0.0.0/0  state NEW CONNMARK save
> >>    114  8472 MARK     all  --  *  *   10.1.10.100 0.0.0.0/0  state NEW MARK set 0x1
> >>    114  8472 CONNMARK all  --  *  *   10.1.10.100 0.0.0.0/0  state NEW CONNMARK save
> >>
> >>   # ip rule
> >>   0:     from all lookup local
> >>   32761: from all fwmark 0x3 lookup Table_eth3
> >>   32762: from all fwmark 0x2 lookup Table_eth2
> >>   32763: from all fwmark 0x1 lookup Table_eth1
> >>   32764: from 10.1.0.0/16 lookup static_route_back
> >>   32765: from 10.1.0.0/16 lookup static_route
> >>   32766: from all lookup main
> >>   32767: from all lookup default
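> >> (The effect can be demonstrated with ip route get; a diagnostic
> >> sketch, assuming the addresses above, with exact output depending on
> >> the iproute2 version in the system VM:
> >>
> >>   # normal lookup: the connected route on eth3 wins
> >>   ip route get 10.1.20.100 from 10.1.10.100 iif eth2
> >>   # with the fwmark set by the mangle rules: forced into Table_eth1
> >>   ip route get 10.1.20.100 from 10.1.10.100 iif eth2 mark 0x1
> >>
> >> The second lookup resolves via eth1, which is why the FORWARD chain
> >> sees OUT=eth1.)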
> >> The acceptable solution is to delete those rules altogether. The
> >> problem with that approach is that intra-VPC traffic would then use
> >> the internal IP addresses, so packets going from 178.248.108.77 to
> >> 178.248.108.113 would be seen as communication between 10.1.10.100 and
> >> 10.1.20.100. We therefore need to apply two further rules:
> >>
> >>   # iptables -t nat -I POSTROUTING -o eth3 -s 10.1.10.0/24 -d 10.1.20.0/24 -j SNAT --to-source 178.248.108.77
> >>   # iptables -t nat -I POSTROUTING -o eth2 -s 10.1.20.0/24 -d 10.1.10.0/24 -j SNAT --to-source 178.248.108.113
> >>
> >> to make sure that the packets leaving the router have the correct
> >> source IP. This way it is possible to have Static NAT on all of the
> >> IPs within the VPC and ensure successful communication between them.
> >>
> >> So, for a quick and dirty fix, we ran this on the VR:
> >>
> >>   for i in $(iptables -t mangle -L PREROUTING -vn | awk '/0x1/ && !/eth1/ {print $8}'); do
> >>       iptables -t mangle -D PREROUTING -s $i -m state --state NEW -j MARK --set-mark "0x1"
> >>   done
> >>
> >> The issue was introduced around the early 4.9.x releases, I believe.
> >>
> >> Thanks
> >> Andrei
> >>
> >> ----- Original Message -----
> >> > From: "Andrei Mikhailovsky"
> >> > To: "users"
> >> > Sent: Monday, 16 April, 2018 22:32:25
> >> > Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >>
> >> > Hello,
> >> >
> >> > I have done some more testing with the VPC network tiers and it
> >> > seems that Static NAT is indeed causing the connectivity issues.
> >> > Here is what I've done:
> >> >
> >> > Setup 1. I created two test network tiers with one guest vm in each
> >> > tier. Static NAT is NOT enabled. Each vm has a port forwarding rule
> >> > (port 22) from its dedicated public IP address. ACLs have been set
> >> > up to allow traffic on port 22 from the private IP addresses on each
> >> > network tier.
> >> >
> >> > 1. ACLs seem to work just fine. Traffic between the networks flows
> >> >    according to the rules; both vms can see each other's private IPs
> >> >    and can ping/ssh/etc.
> >> > 2. From the Internet, hosts can access the vms on port 22.
> >> > 4. The vms can also access each other and themselves on their public
> >> >    IPs. I don't think this worked before, but I could be wrong.
> >> >
> >> > Setup 2. Everything the same as Setup 1, but one public IP address
> >> > has been set up as Static NAT to one guest vm. The second guest vm
> >> > and second public IP remained unchanged.
> >> >
> >> > 1. ACLs stopped working correctly (see below).
> >> > 2. From the Internet, hosts can access the vms on port 22, including
> >> >    the Static NAT vm.
> >> > 3. Other guest vms can access the Static NAT vm using both private
> >> >    and public IP addresses.
> >> > 4. The Static NAT vm can NOT access other vms, using either public
> >> >    or private IPs.
> >> > 5. The Static NAT vm can access Internet hosts (apart from the
> >> >    public IP range belonging to the CloudStack setup).
> >> >
> >> > The above behaviour of the Setup 2 scenario is very strange,
> >> > especially points 4 & 5.
> >> >
> >> > Any thoughts, anyone?
> >> >
> >> > Cheers
> >> >
> >> > ----- Original Message -----
> >> >> From: "Rohit Yadav"
> >> >> To: "users"
> >> >> Sent: Thursday, 12 April, 2018 12:06:54
> >> >> Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >> >
> >> >> Hi Andrei,
> >> >>
> >> >> Thanks for sharing. Yes, the egress thing is a known issue, caused
> >> >> by a failure to create the egress table during VR setup. By
> >> >> performing a restart of the network (without the cleanup option
> >> >> selected), the egress table gets created and the rules are
> >> >> successfully applied.
> >> >>
> >> >> The issue has been fixed in the VR downtime PR:
> >> >> https://github.com/apache/cloudstack/pull/2508
> >> >>
> >> >> - Rohit
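> >> >> (If you drive this through the API instead of the UI, a restart
> >> >> without cleanup is a single call; a sketch using cloudmonkey, with
> >> >> a placeholder network UUID:
> >> >>
> >> >>   restart network id=<network-uuid> cleanup=false
> >> >>
> >> >> With cleanup=true the VR is rebuilt, which is the heavier fix used
> >> >> further down this thread.)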
> >> >>
> >> >> ________________________________
> >> >> From: Andrei Mikhailovsky
> >> >> Sent: Tuesday, April 3, 2018 3:33:43 PM
> >> >> To: users
> >> >> Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >> >>
> >> >> Rohit,
> >> >>
> >> >> Following the update from 4.9.3 to 4.11.0, I would like to comment
> >> >> on a few things:
> >> >>
> >> >> 1. The upgrade went well, apart from the cloudstack-management
> >> >>    server startup issue that I've described in my previous email.
> >> >> 2. There was an issue with the virtual router template upgrade,
> >> >>    described below.
> >> >>
> >> >> VR template upgrade issue:
> >> >>
> >> >> After updating the systemvm template I went to Infrastructure >
> >> >> Virtual Routers and selected the Update template option for each
> >> >> virtual router. The virtual routers were updated successfully using
> >> >> the new templates. However, this broke ALL Egress rules on all
> >> >> networks; none of the guest vms had working egress traffic. Port
> >> >> forwarding / incoming rules were working just fine. Removing and
> >> >> re-adding Egress rules did not fix the issue. To fix it I had to
> >> >> restart each of the networks with the Clean up option ticked.
> >> >>
> >> >> Cheers
> >> >>
> >> >> Andrei
> >> >>
> >> >> ----- Original Message -----
> >> >>> From: "Andrei Mikhailovsky"
> >> >>> To: "users"
> >> >>> Sent: Monday, 2 April, 2018 21:44:27
> >> >>> Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >> >>
> >> >>> Hi Rohit,
> >> >>>
> >> >>> Following some further investigation, it seems that the
> >> >>> installation packages replaced the following file:
> >> >>>
> >> >>>   /etc/default/cloudstack-management
> >> >>>
> >> >>> with
> >> >>>
> >> >>>   /etc/default/cloudstack-management.dpkg-dist
> >> >>>
> >> >>> Thus the management server couldn't load its environment variables
> >> >>> and was unable to start.
> >> >>>
> >> >>> I've put the file back and the management server is able to start.
> >> >>>
> >> >>> I will let you know if there are any other issues/problems.
> >> >>>
> >> >>> Cheers
> >> >>>
> >> >>> Andrei
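> >> >>> (For anyone hitting the same thing, the restore amounts to the
> >> >>> following; a sketch, assuming you have no local edits that need to
> >> >>> be merged back in first:
> >> >>>
> >> >>>   cp -a /etc/default/cloudstack-management.dpkg-dist /etc/default/cloudstack-management
> >> >>>   systemctl restart cloudstack-management
> >> >>>
> >> >>> dpkg leaves the packaged version as .dpkg-dist when it detects a
> >> >>> conffile conflict, so diff it against any customisations before
> >> >>> copying.)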
> >> >>>
> >> >>> ----- Original Message -----
> >> >>>> From: "Andrei Mikhailovsky"
> >> >>>> To: "users"
> >> >>>> Sent: Monday, 2 April, 2018 20:58:59
> >> >>>> Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >> >>>
> >> >>>> Hi Rohit,
> >> >>>>
> >> >>>> I have just upgraded and am having issues starting the service,
> >> >>>> with the following error:
> >> >>>>
> >> >>>>   Apr 02 20:56:37 ais-cloudhost13 systemd[1]: cloudstack-management.service: Failed to load environment files: No such file or directory
> >> >>>>   Apr 02 20:56:37 ais-cloudhost13 systemd[1]: cloudstack-management.service: Failed to run 'start-pre' task: No such file or directory
> >> >>>>   Apr 02 20:56:37 ais-cloudhost13 systemd[1]: Failed to start CloudStack Management Server.
> >> >>>>   -- Subject: Unit cloudstack-management.service has failed
> >> >>>>   -- Defined-By: systemd
> >> >>>>
> >> >>>> Cheers
> >> >>>>
> >> >>>> Andrei
> >> >>>>
> >> >>>> ----- Original Message -----
> >> >>>>> From: "Rohit Yadav"
> >> >>>>> To: "users"
> >> >>>>> Sent: Friday, 30 March, 2018 19:17:48
> >> >>>>> Subject: Re: Upgrade from ACS 4.9.3 to 4.11.0
> >> >>>>
> >> >>>>> Some of the upgrade and minor issues have been fixed and will
> >> >>>>> make their way into 4.11.1.0. You're welcome to upgrade and share
> >> >>>>> your feedback, but bear in mind that, due to some changes, a
> >> >>>>> new/updated systemvmtemplate needs to be issued for 4.11.1.0 (it
> >> >>>>> will be compatible with both the 4.11.0.0 and 4.11.1.0 releases,
> >> >>>>> but 4.11.0.0 users will have to register that new template).
> >> >>>>>
> >> >>>>> - Rohit
> >> >>>>>
> >> >>>>> ________________________________
> >> >>>>> From: Andrei Mikhailovsky
> >> >>>>> Sent: Friday, March 30, 2018 11:00:34 PM
> >> >>>>> To: users
> >> >>>>> Subject: Upgrade from ACS 4.9.3 to 4.11.0
> >> >>>>>
> >> >>>>> Hello,
> >> >>>>>
> >> >>>>> My current infrastructure is ACS 4.9.3 with KVM, based on Ubuntu
> >> >>>>> 16.04 servers for the KVM hosts and the management server.
> >> >>>>>
> >> >>>>> I am planning to perform an upgrade from ACS 4.9.3 to 4.11.0 and
> >> >>>>> was wondering if anyone had any issues during the upgrades?
> >> >>>>> Anything to watch out for?
> >> >>>>>
> >> >>>>> I have previously seen issues with upgrading to 4.10, which
> >> >>>>> required some manual db updates from what I recall. Has this
> >> >>>>> issue been fixed in the 4.11 upgrade process?
> >> >>>>>
> >> >>>>> thanks
> >> >>>>>
> >> >>>>> Andrei

--
Andrija Panić