cloudstack-dev mailing list archives

From Rohit Yadav <rohit.ya...@shapeblue.com>
Subject Re: [DISCUSS] How to fix failing VR-mgmt server links
Date Sat, 04 Apr 2015 08:45:17 GMT
Hi,

Thanks for your comments everyone.

Wido - I was going to share that a little later :) I would also like to avoid a Java daemon
(like the ones in the CPVM/SSVM). Alternatively, to gain more throughput, we could implement a
web service (or a service with a Thrift interface, so that the Java-based mgmt server can call
it using native bindings) written in Python or Go, to keep the process footprint small and make
it reliable.

Earlier, in large deployments, we saw the password server become a bottleneck, and the fix for
this was to upgrade VR memory (to 4-12 GB RAM). If you have a look at the new password server,
it is multi-threaded (instead of fork/process based, so it consumes less memory) and does not
use file-based locks (so operations are faster). After doing this work, I feel there is a lot
more to be done. To me the VR seems to be one of the fragile pieces, and it needs to be
testable and robust.
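
To illustrate the difference between file-based locks and a multi-threaded design, here is a
minimal Python sketch. It is not the actual password server code: the PasswordStore class, its
method names and the sample IPs are assumptions made up for this example. It only shows worker
threads sharing one in-memory table guarded by a lock, with no lock files on disk.

```python
import threading


class PasswordStore:
    """Hypothetical in-memory map of guest IP -> pending password.

    All worker threads share one instance, so a threading.Lock is enough
    and no lock file on disk is needed.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._passwords = {}

    def save(self, ip, password):
        """Record a pending password for a guest IP."""
        with self._lock:
            self._passwords[ip] = password

    def take(self, ip):
        """Return and clear the pending password, or None if there is none."""
        with self._lock:
            return self._passwords.pop(ip, None)


def demo():
    """Save 50 passwords from 50 concurrent threads and return the store."""
    store = PasswordStore()
    threads = [
        threading.Thread(target=store.save, args=("10.1.1.%d" % i, "secret%d" % i))
        for i in range(50)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store
```

Each request costs one thread in a single process rather than a forked process, which is where
the memory saving mentioned above would come from.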

I was thinking of gradually moving all the services we need to control behind a client on the
mgmt server’s side that talks to a VR agent. The agent would be highly available (say,
supervised by circus or supervisord), concurrent (tornado/twisted or Go based), fast
(connection pooling and multiplexing) and fault tolerant (command journaling or retrying, plus
some kind of service/network state sync). We could then even run individual services inside
docker (something Sebastien shared in the past) and, if that’s possible, replace the Debian
base with CoreOS or something else (so that it updates critical packages such as openssl by
itself and the systemvm template is more lightweight), and possibly get more ways to
control/manage VR packages.
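
The "command journaling or retrying" part could look roughly like the Python sketch below. This
is only an illustration under stated assumptions: JournalingClient, its parameters and the
send callable are hypothetical names, not existing CloudStack code. The retry count is a
parameter rather than a hardcoded value, and a command stays in the journal until the VR side
acknowledges it, so it could be replayed later.

```python
import time


class JournalingClient:
    """Hypothetical mgmt-server-side client that journals and retries VR commands."""

    def __init__(self, send, max_attempts=5, base_delay=0.5):
        self._send = send                 # callable that delivers one command to the VR agent
        self.max_attempts = max_attempts  # configurable rather than hardcoded
        self.base_delay = base_delay      # seconds; doubled after every failed attempt
        self.journal = []                 # commands accepted but not yet acknowledged

    def execute(self, command):
        """Send a command, retrying on ConnectionError with exponential backoff."""
        self.journal.append(command)
        last_error = None
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = self._send(command)
            except ConnectionError as exc:
                last_error = exc
                if attempt < self.max_attempts:
                    time.sleep(self.base_delay * 2 ** (attempt - 1))
                continue
            self.journal.remove(command)  # acknowledged, so drop it from the journal
            return result
        # All attempts failed: the command stays in the journal for a later replay.
        raise ConnectionError(
            "gave up after %d attempts" % self.max_attempts
        ) from last_error
```

A caller would pass in whatever transport it uses as send; on final failure the command is
still in client.journal, which is what a state sync could replay once the link is back.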

> On 03-Apr-2015, at 9:39 pm, Suresh Sadhu <Suresh.Sadhu@citrix.com> wrote:
>
> That’s true, Somesh. The recent VR aggregation functionality is yielding better VR
> performance at the customer location, i.e. VR upgrade time is reduced from hours to minutes,
> and it's there in ACS as well (https://issues.apache.org/jira/browse/CLOUDSTACK-5779).
> Earlier, without this feature, the MS had to ssh to the router to execute each and every
> command, but now the MS will ssh to the router only once and run all the commands at once.
>
> Rohit: An agent on the VR is a good idea, but I believe an agent implementation is heavy for
> the router. We need to dig more.
>
>
> Regards
> Sadhu
>
> -----Original Message-----
> From: Somesh Naidu [mailto:Somesh.Naidu@citrix.com]
> Sent: 03 April 2015 20:23
> To: dev@cloudstack.apache.org
> Subject: RE: [DISCUSS] How to fix failing VR-mgmt server links
>
> It is true and I like the idea. I would just want to make sure the agent footprint isn't
> too high. As opposed to CP/SS VMs, we expect a lot more VRs to be running in an environment.
>
> Also, the recent VR aggregation (I believe ACS 4.5 has it) did reduce a lot of that barbaric
> stuff, so we are still better off than we were earlier.
>
> Somesh
> CloudPlatform Escalations
> Citrix Systems, Inc.
>
>
> -----Original Message-----
> From: Rohit Yadav [mailto:rohit.yadav@shapeblue.com]
> Sent: Friday, April 03, 2015 5:53 AM
> To: dev
> Subject: [DISCUSS] How to fix failing VR-mgmt server links
>
> Hi,
>
> In large environments, one cause of VM deployment or network rule failures that I commonly
> find is that the mgmt server is unable to reach the VR via the host because of network lag
> or issues between the host and the mgmt server. The send operation on the link is tried about
> 5 times (hardcoded) before it gives up, and we see something like this in the logs:
> "Unable to reach the peer that the agent is connected".
>
> Should we add a global setting to allow sysadmins to configure the agent link/socket timeout
> (in the various AgentAttaches in engine/orchestration/src/com/cloud/agent/manager ?), or does
> something like this already exist? Please share if it does, or if there is any other solution
> to this problem.
>
> The other issue I see is that, since VRs don’t have an agent running in them, the mgmt server
> SSH-es into the VR to run scripts for every operation. Under high load, the number of open
> FDs (and so also TCP ports) on a VR/systemvm is limited, which again can cause connections to
> fail/time out because of the high number of requests the VR is processing. A long-term
> solution could be to implement an agent (like in ssvm/cpvm) that runs inside the VR and talks
> to the mgmt server over a multiplexed connection, so that we limit the number of connections
> from one mgmt server and can get rid of the SSH code and the execution of barbaric scripts.
> Comments, suggestions, flames?
>
> Regards,
> Rohit Yadav
> Software Architect, ShapeBlue
> M. +91 88 262 30892 | rohit.yadav@shapeblue.com
> Blog: bhaisaab.org | Twitter: @_bhaisaab
>
> Find out more about ShapeBlue and our range of CloudStack related services
>
> IaaS Cloud Design & Build<http://shapeblue.com/iaas-cloud-design-and-build//>
> CSForge – rapid IaaS deployment framework<http://shapeblue.com/csforge/>
> CloudStack Consulting<http://shapeblue.com/cloudstack-consultancy/>
> CloudStack Software Engineering<http://shapeblue.com/cloudstack-software-engineering/>
> CloudStack Infrastructure Support<http://shapeblue.com/cloudstack-infrastructure-support/>
> CloudStack Bootcamp Training Courses<http://shapeblue.com/cloudstack-training/>
>
> This email and any attachments to it may be confidential and are intended solely for
the use of the individual to whom it is addressed. Any views or opinions expressed are solely
those of the author and do not necessarily represent those of Shape Blue Ltd or related companies.
If you are not the intended recipient of this email, you must neither take any action based
upon its contents, nor copy or show it to anyone. Please contact the sender if you believe
you have received this email in error. Shape Blue Ltd is a company incorporated in England
& Wales. ShapeBlue Services India LLP is a company incorporated in India and is operated
under license from Shape Blue Ltd. Shape Blue Brasil Consultoria Ltda is a company incorporated
in Brasil and is operated under license from Shape Blue Ltd. ShapeBlue SA Pty Ltd is a company
registered by The Republic of South Africa and is traded under license from Shape Blue Ltd.
ShapeBlue is a registered trademark.

Regards,
Rohit Yadav
Software Architect, ShapeBlue
M. +91 88 262 30892 | rohit.yadav@shapeblue.com
Blog: bhaisaab.org | Twitter: @_bhaisaab


