cloudstack-dev mailing list archives

From Wido den Hollander <w...@widodh.nl>
Subject Re: Very slow Virtual Router provisioning with 4.9.2.0
Date Thu, 04 May 2017 10:12:13 GMT
Hi,

Yes, we are working on a few low-hanging-fruit fixes, such as checking whether the last
restart of dnsmasq was less than 10 seconds ago and, if so, skipping the restart.
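
A minimal sketch of such a check, not the actual fix: it assumes the time of the last restart is recorded as the modification time of a stamp file, because each configuration run is likely a separate process. The path `STAMP_FILE`, the helper name `restart_allowed`, and the use of Python 3 are assumptions made for illustration only.

```python
import os
import time

# Hypothetical location for the stamp file; not taken from the CloudStack code.
STAMP_FILE = "/var/run/dnsmasq.last_restart"


def restart_allowed(stamp_file=STAMP_FILE, min_interval=10.0, now=None):
    """Return True if a restart may proceed, and record the restart time.

    Returns False when the previous recorded restart is less than
    min_interval seconds old, so the caller can skip the restart.
    """
    now = time.time() if now is None else now
    try:
        last = os.path.getmtime(stamp_file)
    except OSError:
        last = None  # no stamp file yet: nothing has been recorded
    if last is not None and 0 <= now - last < min_interval:
        return False
    # Create the stamp file if needed and set its mtime to "now".
    with open(stamp_file, "a"):
        pass
    os.utime(stamp_file, (now, now))
    return True
```

A caller would wrap the existing restart in `if restart_allowed(): ...`. Note that a skipped restart means the most recent configuration change is not picked up until the next restart, so a real fix would have to deal with that case.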

Will report back once we have anything.

Wido

> On 4 May 2017 at 11:11, Wei ZHOU <ustcweizhou@gmail.com> wrote:
> 
> 
> Hi Wido,
> 
> A simple improvement is to not wait while restarting the dnsmasq service in the VR.
> 
> 
> '''
> diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> index 95d2eff..999be8f 100755
> --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsDhcp.py
> @@ -59,7 +59,7 @@ class CsDhcp(CsDataBag):
> 
>          # We restart DNSMASQ every time the configure.py is called in order to avoid lease problems.
>          if not self.cl.is_redundant() or self.cl.is_master():
> -            CsHelper.service("dnsmasq", "restart")
> +            CsHelper.execute3("service dnsmasq restart")
> 
>      def configure_server(self):
>          # self.conf.addeq("dhcp-hostsfile=%s" % DHCP_HOSTS)
> diff --git a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> index a8ccea2..b06bde3 100755
> --- a/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> +++ b/systemvm/patches/debian/config/opt/cloud/bin/cs/CsHelper.py
> @@ -191,6 +191,11 @@ def execute2(command):
>      p.wait()
>      return p
> 
> +def execute3(command):
> +    """ Execute command """
> +    logging.debug("Executing: %s" % command)
> +    p = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
> +    return p
> 
>  def service(name, op):
>      execute("service %s %s" % (name, op))
> '''
> 
> -Wei
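
One caveat with the patch above: `execute3` attaches pipes to stdout and stderr but nothing ever reads them, so a child that produces more output than the pipe buffer holds would block. A hedged variant is sketched below, assuming Python 3 (for `subprocess.DEVNULL`); the name `execute_async` is hypothetical and not part of CsHelper.py.

```python
import logging
import subprocess


def execute_async(command):
    """Start a shell command and return immediately without waiting.

    Output is discarded rather than piped, so the child cannot stall
    on a full pipe that nobody reads. The caller may still call wait()
    or poll() on the returned Popen object later.
    """
    logging.debug("Executing (no wait): %s", command)
    return subprocess.Popen(
        command,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        shell=True,
    )
```

With this variant the call site would read `CsHelper.execute_async("service dnsmasq restart")`. The trade-off is that the output of the restart is lost, and a failed restart goes unnoticed unless the caller checks the return code at some later point.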
> 
> 
> 2017-05-04 10:48 GMT+02:00 Wido den Hollander <wido@widodh.nl>:
> 
> > Thanks Daan, Remi.
> >
> > I found an additional bug where it seems that 'network.dns.basiczone.updates'
> > isn't read when sending DHCP settings in Basic Networking.
> >
> > This means that the VR gets all DHCP settings for the whole zone instead of
> > just for that pod.
> >
> > In this case some of our VRs get ~2k DHCP offerings sent to them, which
> > causes a large slowdown.
> >
> > Wido
> >
> > > On 3 May 2017 at 14:49, Daan Hoogland <daan.hoogland@gmail.com> wrote:
> > >
> > >
> > > Happy to pick this up, Remi. I'm travelling now but will look at both on
> > > Friday.
> > >
> > > Bilingual autocorrect in use. Read at your own risico
> > >
> > > On 3 May 2017 2:25 pm, "Remi Bergsma" <RBergsma@schubergphilis.com> wrote:
> > >
> > > > Always happy to share, but I won’t have time to work on porting this to
> > > > CloudStack any time soon.
> > > >
> > > > Regards, Remi
> > > >
> > > >
> > > > On 03/05/2017, 13:44, "Rohit Yadav" <rohit.yadav@shapeblue.com> wrote:
> > > >
> > > >     Hi Remi, thanks for sharing. We would love to have those changes (for
> > > >     4.9+), looking forward to your pull requests.
> > > >
> > > >
> > > >     Regards.
> > > >
> > > >     ________________________________
> > > >     From: Remi Bergsma <RBergsma@schubergphilis.com>
> > > >     Sent: 03 May 2017 16:58:18
> > > >     To: dev@cloudstack.apache.org
> > > >     Subject: Re: Very slow Virtual Router provisioning with 4.9.2.0
> > > >
> > > >     Hi,
> > > >
> > > >     The patches I talked about:
> > > >
> > > >     1) Iptables speed improvement
> > > >     https://github.com/apache/cloudstack/pull/1482
> > > >     Was reverted due to a licensing issue.
> > > >
> > > >     2) Passwd speed improvement
> > > >     https://github.com/MissionCriticalCloudOldRepos/cosmic-core/pull/138
> > > >
> > > >     By now, these are rather old patches, so they need some work before
> > > >     they apply to CloudStack again.
> > > >
> > > >     Regards, Remi
> > > >
> > > >
> > > >
> > > >     On 03/05/2017, 12:49, "Jeff Hair" <jeff@greenqloud.com> wrote:
> > > >
> > > >         Hi Remi,
> > > >
> > > >         Do you have a link to the PR that was reverted? And also possibly
> > > >         the code that makes the password updating more efficient?
> > > >
> > > >         Jeff
> > > >
> > > >         On Wed, May 3, 2017 at 10:36 AM, Remi Bergsma <RBergsma@schubergphilis.com> wrote:
> > > >
> > > >         > Hi Wido,
> > > >         >
> > > >         > When we had similar issues last year, we found that, for example,
> > > >         > comparing the iptables rules one by one is 1000x slower than simply
> > > >         > loading them all at once. Boris rewrote this part in our Cosmic fork;
> > > >         > it may be worth looking into this again. The PR to CloudStack was
> > > >         > merged but reverted later, can't remember why. We have run it in
> > > >         > production ever since. Also, feeding passwords to the passwd server is
> > > >         > very inefficient (it operates like a snowball and gets slower once you
> > > >         > have more VMs). That we also fixed in Cosmic, not sure if that patch
> > > >         > made it upstream. Wrote it about a year ago already.
> > > >         >
> > > >         > We tested applying 10K iptables rules in just a couple of seconds.
> > > >         > 1000 VMs take a few minutes to deploy.
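
To illustrate the "load them all at once" idea: the sketch below builds a single `iptables-restore` input from a list of rules, so one process applies the whole set instead of one `iptables` call per rule. This is not the code from the reverted PR; the helper names, the default chain list, and the use of Python 3 are assumptions for illustration.

```python
import subprocess


def build_restore_payload(table, rules, chains=("INPUT", "FORWARD", "OUTPUT")):
    """Build iptables-restore input for one table.

    `rules` is a list of rule lines in iptables-save syntax,
    for example "-A INPUT -p tcp --dport 22 -j ACCEPT".
    """
    lines = ["*%s" % table]
    # Declare each built-in chain with an ACCEPT policy and zeroed counters.
    lines.extend(":%s ACCEPT [0:0]" % chain for chain in chains)
    lines.extend(rules)
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


def apply_rules(payload):
    """Feed the payload to iptables-restore in one call (requires root)."""
    subprocess.run(["iptables-restore"], input=payload.encode(), check=True)
```

By default `iptables-restore` flushes the existing contents of every table named in its input, so the payload has to describe the complete desired rule set for that table, not only the changes.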
> > > >         >
> > > >         > Generally speaking I'd suggest looking at the logs to find what takes
> > > >         > long or is executed a lot of times. Iptables and passwd are two to
> > > >         > look at.
> > > >         >
> > > >         > If you want I can look up the patches. Not handy on my phone now ;-)
> > > >         >
> > > >         > Regards, Remi
> > > >         > ________________________________
> > > >         > From: Wido den Hollander <wido@widodh.nl>
> > > >         > Sent: Tuesday, May 2, 2017 7:57:08 PM
> > > >         > To: dev@cloudstack.apache.org
> > > >         > Subject: Very slow Virtual Router provisioning with 4.9.2.0
> > > >         >
> > > >         > Hi,
> > > >         >
> > > >         > Last night I upgraded a CloudStack 4.5.2 setup to 4.9.2.0. All went
> > > >         > well, but the VR provisioning is terribly slow which causes all kinds
> > > >         > of problems.
> > > >         >
> > > >         > The vr_cfg.sh and update_config.py scripts start to run. Restart
> > > >         > dnsmasq, add metadata, etc.
> > > >         >
> > > >         > But for just 1800 hosts this can take up to 2 hours and that causes
> > > >         > timeouts in the management server and other problems.
> > > >         >
> > > >         > 2 hours is just very, very slow. So I am starting to wonder if
> > > >         > something is wrong here.
> > > >         >
> > > >         > Did anybody else see this?
> > > >         >
> > > >         > Running Basic Networking with CloudStack 4.9.2.0
> > > >         >
> > > >         > Wido
> > > >         >
> > > >
> > > >
> > > >
> > > >
> > > >     rohit.yadav@shapeblue.com
> > > >     www.shapeblue.com
> > > >     53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> > > >     @shapeblue
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> >
