cloudstack-users mailing list archives

From Andrija Panic <andrija.pa...@gmail.com>
Subject Re: Re:Re: VMs Connection break under two isolate network
Date Sun, 24 Feb 2019 20:22:01 GMT
Hi,

in general, yes, it makes sense. A few years ago I was hitting the same issue
with 10GbE Intel NICs, but with KVM on Ubuntu 14, due to issues with either
the kernel or the Intel driver itself. TSO and LRO should be disabled when
routing or bridging, since they are incompatible with those scenarios (see
https://downloadmirror.intel.com/14687/eng/readme.txt and search for
"incompatible", or
http://ehaselwanter.com/en/blog/2014/11/02/mtu-issue--nope-it-is-lro-with-bridge-and-bond/
).
In fact, LRO/TSO should be turned off automatically when you add a NIC to a
bridge (CentOS 6 handled this fine, but Ubuntu 14 had issues)...
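As a rough sketch of the check described above (the interface name eth0 is a placeholder; the real ethtool commands need an actual NIC on the host, so they are shown as comments, and the runnable part just demonstrates parsing `ethtool -k`-style output so the check can be scripted):

```shell
#!/bin/sh
# Sketch only: "eth0" is a placeholder interface name. On a real
# hypervisor host you would check and disable the offloads like this:
#   ethtool -k eth0 | grep -E 'tcp-segmentation-offload|large-receive-offload'
#   ethtool -K eth0 tso off lro off

# Self-contained demo: parse sample `ethtool -k` output the same way,
# e.g. to warn at boot for every NIC that is a bridge member.
sample_output='tcp-segmentation-offload: on
large-receive-offload: off'

tso_state=$(printf '%s\n' "$sample_output" | awk -F': ' '/tcp-segmentation-offload/ {print $2}')
lro_state=$(printf '%s\n' "$sample_output" | awk -F': ' '/large-receive-offload/ {print $2}')

if [ "$tso_state" = "on" ]; then
    echo "WARNING: TSO is on; consider disabling it on bridged/routed NICs"
fi
echo "tso=$tso_state lro=$lro_state"
```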

Cheers
Andrija

On Sun, 24 Feb 2019 at 08:41, Haijiao <18602198181@163.com> wrote:

> Hi, Dag and All
>
>
> Yes, we are using active-active(mode 7) for bond.
>
> VM A1 ---> VR A (Isolated Network A) ----> VR B (Isolated Network B) ---->
> VM B1
>
>
>
> After rounds of isolation testing, based on packet analysis, it seems to us
>     - the traffic between VM A1 and VR A is normal;
>     - however, between VR A and VM B1, VR A receives acknowledgements
> from VM B1 for packets which VR A thinks have not yet been sent through it;
>     - VR A then resets the session, causing the traffic to drop.
>
>
> For testing purposes, we turned off TSO (tcp-segmentation-offload) on the
> XenServer network adapters with the command 'ethtool -k eth0 tso off', and
> the issue is just gone; we can run iperf without any drop for a couple of
> hours.
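For reference, a sketch of one way to make such a setting persistent on XenServer, rather than re-running ethtool by hand after each reboot: XenServer can apply ethtool settings at PIF plug time via `other-config` keys. The UUID below is a placeholder you would look up with `xe pif-list`; the runnable part only builds and prints the command.

```shell
#!/bin/sh
# Sketch: persist TSO-off across reboots on XenServer via the PIF's
# other-config. The UUID is a placeholder; find the real one with:
#   xe pif-list device=eth0 params=uuid
pif_uuid="00000000-0000-0000-0000-000000000000"   # placeholder

cmd="xe pif-param-set uuid=$pif_uuid other-config:ethtool-tso=off"
echo "$cmd"
# Then re-plug the PIF (or reboot the host) for it to take effect:
#   xe pif-unplug uuid=$pif_uuid && xe pif-plug uuid=$pif_uuid
```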
>
>
> Does that make sense?  Is there any improvement that could be implemented
> on the ACS side?
>
>
> Thanks !
>
>
>
>
> On 2019-02-22 at 23:20, "Haijiao" <18602198181@163.com> wrote:
>
>
> Thanks Dag,  you are always helpful !
>
>
> We will look into your sharing and come back.
>
>
>
>
>
>
>
> On 2019-02-22 at 17:26, "Dag Sonstebo" <Dag.Sonstebo@shapeblue.com> wrote:
>
> Hi Haijiao,
>
> We've come across similar things in the past. In short - what is your
> XenServer bond mode? Is it active-active (mode 7) or LACP (mode 4)? (see
> https://support.citrix.com/article/CTX137599)
>
> In short, if your switches don't keep up with MAC address changes on the XS
> hosts, you will get traffic flapping with intermittent loss of connectivity
> (the root cause is that a MAC address moves to another uplink, but the
> switch only checks for changes every X seconds, so it takes a while for it
> to catch up). LACP (mode 4) has a much more robust mechanism for this, but
> obviously needs to be configured at both the XS and switch end. Normal
> active-active (mode 7) seems to always cause problems.
>
> My general advice would be to simplify and just go active-passive (mode 1);
> unless you really need the bandwidth, this gives you a much more stable
> network backend.
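A sketch of how a bond-mode change like that is done on XenServer with the xe CLI (the UUID is a placeholder; list real bonds with `xe bond-list params=uuid,mode`; the runnable part only builds and prints the command):

```shell
#!/bin/sh
# Sketch: inspect and change the bond mode on a XenServer host.
# List existing bonds and their modes first:
#   xe bond-list params=uuid,mode
bond_uuid="00000000-0000-0000-0000-000000000000"  # placeholder UUID

# Switch to active-passive (what XenServer calls active-backup):
cmd="xe bond-set-mode uuid=$bond_uuid mode=active-backup"
echo "$cmd"
```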
>
> Regards,
> Dag Sonstebo
> Cloud Architect
> ShapeBlue
>
>
> On 22/02/2019, 07:14, "Haijiao" <18602198181@163.com> wrote:
>
>    Hi, Devs and Community Users
>
>
>    To be more specific, our environment is built with
>    * 2 Dell R740XD servers + Dell Compellent storage w/ iSCSI
>    * Each server equipped with two Mellanox ConnectX-4 Lx 25GbE network
> adapters, configured in bond mode (active-active) in XenServer
>    * CloudStack 4.11.2 LTS + XenServer 7.1 CU2 (LTS) Enterprise
>
>
>    Everything goes fine with a shared network, but the weird thing is that
> if we set up 2 isolated networks and use 'iperf', 'wget' or 'SCP' to test
> the network performance between two VMs located in these 2 isolated
> networks, the traffic drops to zero in about 200-300 seconds, even though
> we are still able to ping or SSH VM B1 from VM A1 or vice versa.
>
>
>    VM A1 ---> VR A (Isolated Network A) ----> VR B (Isolated Network B)
> ----> VM B1
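The throughput test described above could be sketched as follows (the IP address is a placeholder for VM B1's address as reachable from VM A1; the runnable part only builds and prints the client command):

```shell
#!/bin/sh
# Sketch of the iperf test between the two isolated networks.
# "10.1.1.10" is a placeholder for VM B1's reachable address.
server_ip="10.1.1.10"

# On VM B1:  iperf -s
# On VM A1, run longer than the ~200-300 s window where the stall
# was observed, reporting every 10 s:
client_cmd="iperf -c $server_ip -t 600 -i 10"
echo "$client_cmd"
```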
>
>  ----------------------------------------------------------------------------------------------------------------------------------------
>    We have checked the configuration on the switches and upgraded the
> Mellanox driver for XenServer, but no luck.
>    Meanwhile, we cannot reproduce this issue in another environment
> (XenServer 7.1 CU2 + ACS 4.11.2 + Intel GbE network).
>
>
>    It seems it might be related to the Mellanox adapter, but we have no
> idea what part we could possibly be missing in this case.
>
>
>    Any advice would be highly appreciated!  Thank you!
>
>
>    On 2019-02-22 at 13:09, "gu haven" <gumingda@hotmail.com> wrote:
>
>
>    Hi all,
>          When I try iperf, wget, or scp, the connection breaks after about
> 200 seconds. Is any optimization needed in the VR?
>
>    Environment information below:
>
>    CloudStack 4.11.2
>
>    XenServer 7.1 CU2 Enterprise
>
>    NIC: Mellanox 25GbE 2P ConnectX-4 Lx
>
>    bond mode in XenServer: active-active
>
>
> Dag.Sonstebo@shapeblue.com
> www.shapeblue.com
> Amadeus House, Floral Street, London  WC2E 9DPUK
> @shapeblue
>



-- 

Andrija Panić
