cloudstack-dev mailing list archives

From Rafael Weingärtner <rafaelweingart...@gmail.com>
Subject Re: [STABILITY]} Large KVM Infrastructure with ACS
Date Thu, 19 Nov 2015 22:31:44 GMT
How many MS do you have in your environment?
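
The count matters for the database side Paul mentions below. As a rough sketch only: assuming each management server can hold its default pool of 100 connections and MySQL is left at the documented 350, the worst-case demand can be checked as follows. The function names and the `external` parameter are made up for illustration and are not CloudStack settings.

```python
# Rough connection-budget check. All names are illustrative, not CloudStack APIs.
DEFAULT_POOL_PER_MS = 100    # default pool per management server (figure from Paul's mail)
DOCUMENTED_MYSQL_MAX = 350   # documented MySQL connection setting (figure from Paul's mail)


def required_db_connections(num_ms, pool_per_ms=DEFAULT_POOL_PER_MS, external=0):
    """Worst case MySQL must accept: one full pool per MS plus external clients."""
    return num_ms * pool_per_ms + external


def fits_budget(num_ms, mysql_max=DOCUMENTED_MYSQL_MAX, **kwargs):
    """True if the worst-case demand stays within MySQL's connection limit."""
    return required_db_connections(num_ms, **kwargs) <= mysql_max


if __name__ == "__main__":
    # Print demand and verdict for one to four management servers.
    for n in (1, 2, 3, 4):
        print(n, required_db_connections(n), fits_budget(n))
```

Under these assumptions three management servers still fit, while a fourth (or three plus a few dozen external connections) would exceed the documented MySQL setting.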

On Thu, Nov 19, 2015 at 7:56 PM, Paul Angus <paul.angus@shapeblue.com>
wrote:

> Hi,
>
> In the past a couple of our clients have had issues with indirect
> agents (KVM hosts and system VMs) connecting over port 8250, particularly
> if connectivity was lost to the management server(s). They both had 300+
> indirect agents active.
>
> In these circumstances we have found that running a netstat to see
> connections to port 8250 on the mgmt server(s) revealed many open but
> unused connections to port 8250.
>
> I recall that at one time we found the agent connection code had been
> altered to attempt to reconnect if the connection didn't complete within
> 10 seconds. However, the failed connection would take 60 seconds to time out.
>
> Another time we found that the management server and MySQL DB were both
> being starved of enough connections to the MySQL DB to process the
> reconnections fast enough. The default from the mgmt server is 100
> connections and the documented setting for MySQL is 350 connections.
> However, external connections (and additional mgmt servers) require these
> to be adjusted.
>
> -- just some ideas...
>
>
> Regards,
>
> Paul Angus
> VP Technology/Cloud Architect
> S: +44 20 3603 0540 | M: +447711418784 | T: CloudyAngus
> paul.angus@shapeblue.com
>
> -----Original Message-----
> From: ilya [mailto:ilya.mailing.lists@gmail.com]
> Sent: 19 November 2015 20:32
> To: dev@cloudstack.apache.org
> Subject: Re: [STABILITY]} Large KVM Infrastructure with ACS
>
> Rafael,
>
> Please see response in-line:
>
> On 11/18/15 4:16 PM, Rafael Weingärtner wrote:
> > When you say 250+, you mean 250+ hosts spread across lots of clusters,
> > right? If I am not mistaken, ACS limits the number of KVM hosts in a
> > cluster, something like 50? I do not remember now if that value can be
> > configured; maybe it can be.
>
> Yes lots of clusters, way less than 50 per cluster.
>
> > I recall having read something in a Red Hat doc about KVM saying that it
> > has no limit on hosts in a cluster. Actually, it does not seem to have
> > the notion of a cluster at all. That is created solely in ACS, to
> > facilitate management.
> >
> > To debug the problem, I would start with the following questions:
> >
> > Is every single cluster in your environment presenting that problem?
>
> No, a few clusters, with some nodes within the cluster - not all.
>
> > What is the size of physical hosts that you have in your environment?
> > Do all of them have the same configuration?
> Yes, all hosts have the same configuration. Can't go into details, but it's
> rather large.
>
> > Do you know the load (resources allocated and used) that is being
> > imposed on those hosts that have shown those problems?
> > What over-commitment/provisioning factor are you using?
> Servers are not heavily taxed, we don't overcommit memory, other
> components could be overcommitted by a factor of 2 or less. Overall, we
> still have capacity to accommodate more VMs if needed, we just don't max
> it out.
>
> ----
>
> Both Marcus and I are looking through this. It could be just our
> specific implementation - hence, I wanted to see if anyone else in the
> community with heavy KVM usage came across this issue.
>
> Maybe I need to ping LeaseWeb and ExtremePC folks..
>
> Thanks,
> ilya
> >
> > On Wed, Nov 18, 2015 at 8:19 PM, Daan Hoogland
> > <daan.hoogland@gmail.com>
> > wrote:
> >
> >> sounds like a bad limit Ilya, i'll keep an eye out.
> >>
> >> On Wed, Nov 18, 2015 at 10:10 PM, ilya <ilya.mailing.lists@gmail.com>
> >> wrote:
> >>
> >>> I'm curious if anyone runs ACS with at least 250 KVM hosts.
> >>>
> >>> We've been noticing weird issues with KVM where occasionally lots of
> >>> KVM agents get a Nio connection closed issue followed by a barrage of
> >>> alerts.
> >>>
> >>> In some instances the agent reconnects right away and in others it
> >>> attempts to reconnect but never receives an ACK from the MS.
> >>>
> >>> Please let me know if you notice anything like it and if you found a
> >>> solution.
> >>>
> >>> Also, it would help to know what global settings have been tuned to
> >>> make things work better (aside from direct.agent.*) and how the MS are
> >>> running.
> >>>
> >>> Thanks
> >>> ilya
> >>>
> >>
> >>
> >>
> >> --
> >> Daan
> >>
> >
> >
> >

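The timing mismatch Paul describes in the quoted mail (a retry after 10 seconds while a failed attempt lingers for 60) can be put into numbers. This is a simplified model, assuming every failed attempt keeps a socket open on the management server until it times out and that all agents retry at the same fixed interval; the function names are hypothetical.

```python
import math


def pending_per_agent(retry_interval_s=10, timeout_s=60):
    """Upper bound on overlapping attempts from one agent.

    A new attempt starts every retry_interval_s while each failed
    attempt stays open for timeout_s before it is cleaned up.
    """
    return math.ceil(timeout_s / retry_interval_s)


def pending_total(agents, retry_interval_s=10, timeout_s=60):
    """Upper bound on lingering connections across all agents."""
    return agents * pending_per_agent(retry_interval_s, timeout_s)


if __name__ == "__main__":
    # 300 agents, the scale mentioned in the quoted mail.
    print(pending_per_agent(), pending_total(300))
```

With the quoted figures each agent can have up to 6 attempts outstanding, so 300 agents could account for up to 1800 lingering connections in this model, which would be consistent with the many open but unused connections seen on port 8250.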

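For the netstat check mentioned in the quoted mail, a small helper can tally connection states on port 8250. This is a sketch only: it assumes `netstat -tan`-style columns (protocol, Recv-Q, Send-Q, local address, foreign address, state), and the sample text is invented purely to exercise the parser.

```python
from collections import Counter


def count_states(netstat_output, port=8250):
    """Count TCP states for sockets whose local address ends in :port."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_addr, state = fields[3], fields[5]
        # The port is whatever follows the last colon of the local address.
        if local_addr.rsplit(":", 1)[-1] == str(port):
            states[state] += 1
    return states


# Invented sample input, not output captured from a real server.
SAMPLE = """\
tcp        0      0 10.0.0.10:8250      10.0.1.21:40112     ESTABLISHED
tcp        0      0 10.0.0.10:8250      10.0.1.22:40987     ESTABLISHED
tcp        0      0 10.0.0.10:8250      10.0.1.22:41002     CLOSE_WAIT
tcp        0      0 10.0.0.10:3306      10.0.0.11:51514     ESTABLISHED
tcp        0      0 0.0.0.0:8250        0.0.0.0:*           LISTEN
"""

if __name__ == "__main__":
    for state, count in sorted(count_states(SAMPLE).items()):
        print(state, count)
```

Comparing the number of established connections against the number of agents that are actually up gives a quick hint as to whether stale connections are piling up.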

-- 
Rafael Weingärtner
