Subject: Re: VXLAN and KVm experiences
From: Wido den Hollander
To: dev@cloudstack.apache.org, Ivan Kudryavtsev, Simon Weller
Date: Fri, 28 Dec 2018 11:34:05 +0100

On 10/23/18 2:54 PM, Ivan Kudryavtsev wrote:
> Doesn't a solution like this work seamlessly for large VXLAN networks?
>
> https://vincent.bernat.ch/en/blog/2017-vxlan-bgp-evpn
>

This is what we are looking into right now.
As CloudStack executes *modifyvxlan.sh* prior to starting an Instance, it
would just be a matter of replacing this script with a version which does
the EVPN for us.

Our routers will probably be 36x100G SuperMicro bare metal switches running
Cumulus. Using unnumbered BGP over IPv6 we'll provide network connectivity
to the hypervisors. Using FRR and EVPN we'll be able to enable VXLAN on the
hypervisors and route traffic.

As these things seem to be very use-case specific, I don't see how we can
integrate this into CloudStack in a generic way. The *modifyvxlan.sh* script
gets the VNI as an argument, so anybody can adapt it to their own needs for
their specific environment.

Wido

> Tue, 23 Oct 2018, 8:34 Simon Weller :
>
>> Linux native VXLAN uses multicast and each host has to participate in
>> multicast in order to see the VXLAN networks. We haven't tried using PIM
>> across a L3 boundary with ACS, although it will probably work fine.
>>
>> Another option is to use a L3 VTEP, but right now there is no native
>> support for that in CloudStack's VXLAN implementation, although we've
>> thought about proposing it as a feature.
>>
>> ________________________________
>> From: Wido den Hollander
>> Sent: Tuesday, October 23, 2018 7:17 AM
>> To: dev@cloudstack.apache.org; Simon Weller
>> Subject: Re: VXLAN and KVm experiences
>>
>> On 10/23/18 1:51 PM, Simon Weller wrote:
>>> We've also been using VXLAN on KVM for all of our isolated VPC guest
>>> networks for quite a long time now. As Andrija pointed out, make sure
>>> you increase the igmp_max_memberships param and also put an IP address
>>> on each host's VXLAN interface, in the same subnet for all hosts that
>>> will share networking, or multicast won't work.
>>
>> Thanks! So you are saying that all hypervisors need to be in the same L2
>> network, or are you routing the multicast?
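To make the *modifyvxlan.sh* replacement idea above concrete, a rough sketch of what an EVPN-flavoured version could emit is below. This is only an illustration under assumptions: the argument order, the brvx- bridge naming, and the VTEP_IP placeholder are made up for the example, and FRR/EVPN is assumed to distribute MACs and flood lists (hence no multicast group and no source learning on the VXLAN device).

```shell
#!/bin/sh
# Hedged sketch of an EVPN-style modifyvxlan.sh replacement.
# Assumptions (not CloudStack's real interface): VNI as $1, underlay
# device as $2; ${VTEP_IP} is a placeholder for this host's VTEP address.
build_vxlan_cmds() {
    vni="$1"         # VXLAN Network Identifier handed over by CloudStack
    dev="$2"         # underlay interface, e.g. bond0.950
    br="brvx-${vni}" # bridge naming is illustrative

    # nolearning: BUM handling and MAC reachability come from BGP EVPN
    # routes (FRR) instead of multicast flood-and-learn.
    echo "ip link add vxlan${vni} type vxlan id ${vni} dstport 4789 local \${VTEP_IP} dev ${dev} nolearning"
    echo "ip link add ${br} type bridge"
    echo "ip link set vxlan${vni} master ${br} up"
    echo "ip link set ${br} up"
}

# Only prints the commands it would run; nothing is configured here.
build_vxlan_cmds 867 bond0.950
```

The sketch deliberately echoes the commands rather than executing them, so the wiring can be reviewed before pointing CloudStack at it.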
>>
>> My idea was that each POD would be an isolated Layer 3 domain and that a
>> VNI would span over the different Layer 3 networks.
>>
>> I don't like STP and other Layer 2 loop-prevention systems.
>>
>> Wido
>>
>>> - Si
>>>
>>> ________________________________
>>> From: Wido den Hollander
>>> Sent: Tuesday, October 23, 2018 5:21 AM
>>> To: dev@cloudstack.apache.org
>>> Subject: Re: VXLAN and KVm experiences
>>>
>>> On 10/23/18 11:21 AM, Andrija Panic wrote:
>>>> Hi Wido,
>>>>
>>>> I have "pioneered" this one in production for the last 3 years (and
>>>> suffered a nasty pain of silent packet drops on kernel 3.X back in the
>>>> days because of being unaware of the igmp_max_memberships kernel
>>>> parameter, so I have updated the manual a long time ago).
>>>>
>>>> I never had any issues (besides the nasty one above...) and it works
>>>> very well.
>>>
>>> That's what I want to hear!
>>>
>>>> To avoid the issue I described, you should increase
>>>> igmp_max_memberships (/proc/sys/net/ipv4/igmp_max_memberships) -
>>>> otherwise, with more than 20 VXLAN interfaces, some of them will stay
>>>> in a down state and see a hard traffic drop (with a proper message in
>>>> agent.log) on kernels > 4.0 (or a silent, bitchy random packet drop on
>>>> kernel 3.X...) - and also pay attention to the MTU size - anyway,
>>>> everything is in the manual (I updated everything I thought was
>>>> missing), so please check it.
>>>
>>> Yes, the underlying network will all be 9000 bytes MTU.
>>>
>>>> Our example setup:
>>>>
>>>> We have i.e. bond0.950 as the main VLAN which will carry all VXLAN
>>>> "tunnels" - so this is defined as the KVM traffic label.
>>>> In our case it didn't make sense to use a bridge on top of this
>>>> bond0.950 (as the traffic label) - you can test it on your own - since
>>>> this bridge is used only to extract the child bond0.950 interface
>>>> name; then, based on the VXLAN ID, ACS will provision
>>>> vxlanYYY@bond0.xxx and join this new VXLAN interface to a NEW bridge
>>>> it creates (and then of course the vNIC goes to this new bridge), so
>>>> the original bridge (to which bond0.xxx belonged) is not used for
>>>> anything.
>>>
>>> Clear, I indeed thought something like that would happen.
>>>
>>>> Here is a sample from above for VXLAN 867 used for tenant isolation:
>>>>
>>>> root@hostname:~# brctl show brvx-867
>>>>
>>>> bridge name   bridge id           STP enabled   interfaces
>>>> brvx-867      8000.2215cfce99ce   no            vnet6
>>>>                                                 vxlan867
>>>>
>>>> root@hostname:~# ip -d link show vxlan867
>>>>
>>>> 297: vxlan867: mtu 8142 qdisc noqueue master brvx-867 state UNKNOWN
>>>> mode DEFAULT group default qlen 1000
>>>>     link/ether 22:15:cf:ce:99:ce brd ff:ff:ff:ff:ff:ff promiscuity 1
>>>>     vxlan id 867 group 239.0.3.99 dev bond0.950 port 0 0 ttl 10 ageing 300
>>>>
>>>> root@ix1-c7-2:~# ifconfig bond0.950 | grep MTU
>>>>           UP BROADCAST RUNNING MULTICAST  MTU:8192  Metric:1
>>>>
>>>> So note how the vxlan interface has a 50-byte smaller MTU than the
>>>> bond0.950 parent interface (which could affect traffic inside the VM)
>>>> - so jumbo frames are needed anyway on the parent interface (bond0.950
>>>> in the example above, with a minimum of 1550 MTU).
>>>
>>> Yes, thanks! We will be using 1500 MTU inside the VMs, so all the
>>> networks underneath will be ~9k.
>>>
>>>> Ping me if more details are needed, happy to help.
>>>
>>> Awesome! We'll be doing a PoC rather soon. I'll come back with our
>>> experiences later.
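The 50-byte gap Andrija points out between the two MTUs is exactly the VXLAN-over-IPv4 encapsulation overhead; a quick sketch of the arithmetic (an IPv6 underlay needs 70 bytes instead, since the IPv6 header is 40 bytes):

```shell
#!/bin/sh
# VXLAN over an IPv4 underlay adds 50 bytes per frame:
#   outer Ethernet (14) + outer IPv4 (20) + UDP (8) + VXLAN header (8)
overhead=$((14 + 20 + 8 + 8))
parent_mtu=8192                       # bond0.950 in the example above
vxlan_mtu=$((parent_mtu - overhead))  # matches the 8142 shown by 'ip -d link'
echo "overhead=${overhead} vxlan_mtu=${vxlan_mtu}"
```

The same arithmetic gives the 1550-byte minimum parent MTU mentioned above for a plain 1500-byte MTU inside the VM.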
>>>
>>> Wido
>>>
>>>> Cheers,
>>>> Andrija
>>>>
>>>> On Tue, 23 Oct 2018 at 08:23, Wido den Hollander wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I just wanted to know if there are people out there using KVM with
>>>>> Advanced Networking and using VXLAN for different networks.
>>>>>
>>>>> Our main goal would be to spawn a VM and, based on the network the
>>>>> NIC is in, attach it to a different VXLAN bridge on the KVM host.
>>>>>
>>>>> It seems to me that this should work, but I just wanted to check and
>>>>> see if people have experience with it.
>>>>>
>>>>> Wido
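As a footnote to the igmp_max_memberships advice earlier in the thread: the raised limit can be made persistent with a sysctl drop-in. A minimal sketch (the file name and the value 200 are arbitrary examples; size the limit to the number of multicast VXLAN VNIs you expect per host, and write to /tmp here only so the sketch runs without root):

```shell
#!/bin/sh
# The kernel default is 20 multicast group memberships per host; with
# more multicast-based VXLAN interfaces than that, the extras stay down.
# 200 is an example value, not a recommendation.
conf='net.ipv4.igmp_max_memberships = 200'
echo "$conf" > /tmp/80-vxlan.conf   # in production: /etc/sysctl.d/80-vxlan.conf
cat /tmp/80-vxlan.conf

# To apply as root:
#   sysctl --system                  (re-reads sysctl.d drop-ins)
#   sysctl -w net.ipv4.igmp_max_memberships=200   (immediate, one-off)
# Verify with: cat /proc/sys/net/ipv4/igmp_max_memberships
```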