cloudstack-users mailing list archives

From John Kinsella <>
Subject Re: [PROPOSAL] Service monitoring tool in virtual router
Date Thu, 07 Nov 2013 00:55:53 GMT
Thx for putting this together, Jayapal. A few comments:

I'd really like to have a config flag to specify if things should be restarted automatically
or not. Worst case, track the restarts - if a service is restarted more than X times in Y
seconds, something's obviously wrong so stop tail-chasing[1]. Personally I'm much more interested
in knowing there's a problem and then taking whatever happens to be the appropriate actions
for our situation.
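
To make the "more than X times in Y seconds" idea concrete, here is a rough sketch in Python. Nothing here is from the FS: the names RestartThrottle, handle_failure and auto_restart, and the restart/alert callbacks, are made up for illustration. It only shows a sliding-window restart counter combined with an on/off config flag.

```python
from collections import deque
import time


class RestartThrottle:
    """Allow at most max_restarts restarts within any window_seconds span."""

    def __init__(self, max_restarts, window_seconds, clock=time.monotonic):
        self.max_restarts = max_restarts      # "X" in the text above
        self.window_seconds = window_seconds  # "Y" in the text above
        self.clock = clock                    # injectable so it can be tested
        self.history = deque()                # timestamps of recent restarts

    def allow_restart(self):
        now = self.clock()
        # Forget restarts that have aged out of the window.
        while self.history and now - self.history[0] >= self.window_seconds:
            self.history.popleft()
        if len(self.history) >= self.max_restarts:
            return False  # too many recent restarts: stop tail-chasing
        self.history.append(now)
        return True


def handle_failure(service, throttle, auto_restart, restart, alert):
    """Always alert; restart only if the flag is on and the throttle permits.

    restart and alert are caller-supplied callables (hypothetical hooks).
    """
    alert(f"{service} is down")
    if not auto_restart:
        return "alert-only"
    if throttle.allow_restart():
        restart(service)
        return "restarted"
    alert(f"{service} restarted too often; giving up")
    return "gave-up"
```

With auto_restart off, the function only raises the alert, which is the behaviour I'd personally want by default.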

Regarding communicating with a monitoring system - what makes more sense to me is setting
up a solid framework that provides folks flexibility to use various monitoring tools, from
sending an email to contacting pager duty or whatever.

So, to me there are 3 parts to that:
1) At VR creation, ACS calls a defined hook script which knows how to contact the monitoring
system and tell it about the new system to monitor
2) At boot, the VR sends an API query, to which the mgmt server responds with a URL for an
install script - the VR runs that to download/set up the appropriate monitoring agent
3) The VR has standardized scripts for the agent to call to find out what should be running,
and then the agent can go check for itself.
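
As a rough illustration of part 3 only, here is a Python sketch of the agent side. The JSON manifest format and the names expected_services, pidof_probe and check_services are assumptions made up for this example, not anything ACS or the VR provides today; the service names in the usage note are placeholders too.

```python
import json
import subprocess


def expected_services(manifest_text):
    """Parse the (hypothetical) JSON manifest a VR script would print."""
    return json.loads(manifest_text)["services"]


def pidof_probe(name):
    """Return True if `pidof` finds at least one process with this name."""
    result = subprocess.run(["pidof", name], capture_output=True)
    return result.returncode == 0


def check_services(manifest_text, probe=pidof_probe):
    """Map each expected service name to True (running) or False (down)."""
    return {name: probe(name) for name in expected_services(manifest_text)}
```

An agent would feed the script's output to check_services, e.g. check_services('{"services": ["dnsmasq", "haproxy"]}'), and report whatever comes back False to its own monitoring system. The probe is pluggable so each tool can check in its own way.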

With a setup like this, you can support SNMP, Opsview/Nagios, Monit, NSA, Zenoss, HPOV, Tivoli,
etc etc etc. I'll happily write the Opsview/Nagios module (I'm thinking module is hosted outside
ACS, but I guess it could be a plugin - see earlier licensing points).


Just my 2c. Happy to tweak wiki if folks lean towards this.

1: Aside - this applies to SSVM creation currently - that hamster[2] keeps trying to spin
that create SSVM wheel..
2: Apache CloudHamster, CloudMonkey's furry monitoring friend?

On Nov 6, 2013, at 7:58 AM, Jayapal Reddy Uradi <> wrote:

> Please find the updated FS below.
> Thanks,
> Jayapal
> On 05-Oct-2013, at 6:54 PM, Santhosh Edukulla <> wrote:
>> A shell script can be used. A few thoughts below:
>> 1. Collect the process IDs of all daemons you want to monitor using the "pidof" command,
and then use the "kill" command to check whether each pid is valid. Using kill we can send
signal 0, then check the exit status using echo $? . To send a notification, use the Linux
syslog call (man 3 syslog) or the "logger" command to write to syslog. If you want to send
email, you may also have to check that the firewall is not blocking outbound SMTP traffic.
The same holds for SNMP (I mean, if firewall rules block it). Using syslog may be good, as
by default it exposes various log levels, including debug, through its API call.
>> Now, to keep the monitor script always up and running: run the monitor script
through cron or at, at regular/scheduled intervals. This way, even if the monitor
script goes down, it is up again at the next interval.
>> There is a catch with this, though: we may get multiple pids for a given daemon if
multiple instances are spawned by the same or different applications. If this scenario is
not common then it's OK; otherwise we may have to track it differently, maintaining the state
of each spawned daemon and checking whether it still exists. If multiple applications launch
the same daemon, you may also want to report which application got killed. Example: A launched
httpd and, during its exit logic, kills all the daemons it launched; then you may want to
report that A is not available, rather than just that httpd is not available.
>> 2. Using the netstat command: check for available, listening and active ports on the local
host, provided all the daemons you want to monitor are running on "standard" ports or
we know the listening ports of the daemons to be monitored. Again, this script can be scheduled
through cron/at to run every x units; if it gets killed, the monitor script is up again
after the next x units.
>> Also, there could be many other approaches as well.
>> Thanks!
>> Santhosh 
>> ________________________________________
>> From: Jayapal Reddy Uradi []
>> Sent: Saturday, October 05, 2013 5:17 AM
>> To: <>
>> Cc: <>
>> Subject: Re: [PROPOSAL] Service monitoring tool in virtual router
>> Hi,
>> +users list
>> If anyone is already using any tools for monitoring, please share your ideas.
>> Also share the cases where you experienced service crashes.
>> Thanks,
>> Jayapal
>> On 05-Oct-2013, at 4:12 AM, Chiradeep Vittal <>
>>> Well just make sure that your script is resilient to its own crashes as
>>> well.
>>> On 10/4/13 1:59 AM, "Jayapal Reddy Uradi" <>
>>> wrote:
>>>> Hi,
>>>> I am planning to write a script utility to monitor processes and restart them in
>>>> the event of failure. It will also log the events.
>>>> Thanks,
>>>> Jayapal
>>>> On 02-Oct-2013, at 3:25 AM, Simon Weller <> wrote:
>>>>> supervisord maybe?
>>>>> ----- Original Message -----
>>>>> From: "Chiradeep Vittal" <>
>>>>> To:
>>>>> Sent: Tuesday, October 1, 2013 4:45:56 PM
>>>>> Subject: Re: [PROPOSAL] Service monitoring tool in virtual router
>>>>> Got it. Any other OSS tool out there similar to monit?
>>>>> On 10/1/13 8:24 AM, "David Nalley" <> wrote:
>>>>>> On Thu, Sep 26, 2013 at 1:27 AM, Chiradeep Vittal
>>>>>> <> wrote:
>>>>>>> SNMP wouldn't restart a failed process nor would it generate [...]. It is
>>>>>>> simply too generic for the requirements outlined here. The proposal does
>>>>>>> not talk about modifying monit, just using it. That wouldn't [...] the
>>>>>>> AGPL.
>>>>>> Let me restate my objection to anything AGPL.
>>>>>> People are largely comfortable with GPLv2 software - Linux is
>>>>>> ubiquitous. Many legal departments routinely prohibit GPLv3 software
>>>>>> (we actually saw this when CS was GPLv3 licensed.) But the Affero
>>>>>> license is anathema in many corporate environments, and by forcing it
>>>>>> on folks in the default System VM I fear it will hurt adoption of
>>>>>> CloudStack.
>>>>>> --David
