incubator-ambari-dev mailing list archives

From "Jeff Sposetti (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-1533) Add Nagios check for ambari-agent process for each host in the cluster
Date Fri, 01 Mar 2013 13:49:12 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Sposetti updated AMBARI-1533:
----------------------------------

    Description: 
Each host in the cluster runs ambari-agent.

There should be a Nagios alert that watches the ambari-agent process. Since the system
does not allow direct communication with an ambari-agent, this check should either a) check
the process or b) ping the Ambari Server REST API to confirm the agent is still sending heartbeats.

This alert should be shown with each Hosts > {host} in Ambari Web.

Service Description: Ambari Agent (ambari-agent) process down
Service Group: AMBARI
Check / Retry Interval: 0.25
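
Option a) above, checking the process locally, could look roughly like the sketch below. It is not the check that was eventually implemented for this issue, only an illustration in the usual Nagios plugin style (exit code 0 for OK, 2 for CRITICAL, 3 for UNKNOWN). The pid file path, the function names and the message wording are all assumptions made for the example.

```python
"""Sketch of a Nagios-style check for the ambari-agent process (option a)."""
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: illustrative pid file location; adjust for the actual install.
PID_FILE = "/var/run/ambari-agent/ambari-agent.pid"


def pid_is_alive(pid):
    """Return True if a process with this pid exists.

    Signal 0 performs the existence/permission check without sending anything.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def check_agent(pid_file=PID_FILE, is_alive=pid_is_alive):
    """Return (exit_code, message) describing the agent process state."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
        if pid <= 0:
            raise ValueError("pid must be positive, got %d" % pid)
    except FileNotFoundError:
        return CRITICAL, "CRITICAL - ambari-agent pid file not found"
    except (OSError, ValueError) as exc:
        return UNKNOWN, "UNKNOWN - cannot read pid file: %s" % exc
    if is_alive(pid):
        return OK, "OK - ambari-agent running (pid %d)" % pid
    return CRITICAL, "CRITICAL - ambari-agent process down"


def main():
    """Print the status line and return the exit code for the plugin wrapper."""
    code, message = check_agent()
    print(message)
    return code
```

A plugin script would finish with `sys.exit(main())` so that Nagios sees the status through the exit code. Option b) would replace `check_agent` with a query against the Ambari Server, which is not sketched here.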


Note: need to add

  was:
Each host in the cluster runs gmond to emit system-level information to the Ganglia Collector.

There should be a Nagios alert that watches the gmond process. This alert should be shown
with each Hosts > {host} in Ambari Web.

Service Description: Ganglia Monitor (gmond) process down
Service Group: GANGLIA
Check / Retry Interval: 0.25


    
> Add Nagios check for ambari-agent process for each host in the cluster
> ----------------------------------------------------------------------
>
>                 Key: AMBARI-1533
>                 URL: https://issues.apache.org/jira/browse/AMBARI-1533
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Jeff Sposetti
>
> Each host in the cluster runs ambari-agent.
> There should be a Nagios alert that watches the ambari-agent process. Since the system
> does not allow direct communication with an ambari-agent, this check should either a) check
> the process or b) ping the Ambari Server REST API to confirm the agent is still sending heartbeats.
> This alert should be shown with each Hosts > {host} in Ambari Web.
> Service Description: Ambari Agent (ambari-agent) process down
> Service Group: AMBARI
> Check / Retry Interval: 0.25
> Note: need to add

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
