ambari-dev mailing list archives

From "Jonathan Hurley" <jhur...@hortonworks.com>
Subject Re: Review Request 32813: Alerts: Generate Server Side Alerts For Agent Health and Alert Staleness
Date Fri, 03 Apr 2015 16:00:44 GMT


> On April 3, 2015, 11:49 a.m., Jeff Sposetti wrote:
> > ambari-server/src/main/resources/alerts.json, line 9
> > <https://reviews.apache.org/r/32813/diff/1/?file=914681#file914681line9>
> >
> >     Looks like a typo...
> >     
> >     "lost contact" (not "lost contain")

Nice catch! Feel free to review the other text that the alerts produce:
```
All alerts have run within their time intervals.
There are 3 stale alerts from 2 host(s): NameNode Process, DataNode Process, Storm Web UI

c6401.ambari.apache.org is initializing
c6401.ambari.apache.org is healthy
c6401.ambari.apache.org is waiting for status updates
c6401.ambari.apache.org is not sending heartbeats
c6401.ambari.apache.org is not healthy
c6401.ambari.apache.org has an unknown state of FOOBAR
```
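
For anyone reviewing the wording, here is a minimal sketch of how the per-host text above might be assembled from a host state. It is not the code in the patch: the class name `HostHealthText`, the `describe` method, and the state names in the enum are assumptions made for illustration only.

```java
/** Illustrative only: a hypothetical helper, not code from the patch under review. */
final class HostHealthText {

  /** Assumed state names; the real enum used by the server may differ. */
  enum HostState { INIT, HEALTHY, WAITING_FOR_HOST_STATUS_UPDATES, HEARTBEAT_LOST, UNHEALTHY }

  /** Builds the alert text for a host from the raw state name reported for it. */
  static String describe(String host, String stateName) {
    HostState state;
    try {
      state = HostState.valueOf(stateName);
    } catch (IllegalArgumentException e) {
      // Any name outside the known set gets the "unknown state" text.
      return host + " has an unknown state of " + stateName;
    }
    switch (state) {
      case INIT:
        return host + " is initializing";
      case HEALTHY:
        return host + " is healthy";
      case WAITING_FOR_HOST_STATUS_UPDATES:
        return host + " is waiting for status updates";
      case HEARTBEAT_LOST:
        return host + " is not sending heartbeats";
      default: // UNHEALTHY is the only remaining constant
        return host + " is not healthy";
    }
  }

  public static void main(String[] args) {
    System.out.println(describe("c6401.ambari.apache.org", "HEARTBEAT_LOST"));
    System.out.println(describe("c6401.ambari.apache.org", "FOOBAR"));
  }
}
```

Keeping the unknown-state branch separate means a state added later still yields readable text instead of an exception.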


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32813/#review78787
-----------------------------------------------------------


On April 3, 2015, 11:29 a.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32813/
> -----------------------------------------------------------
> 
> (Updated April 3, 2015, 11:29 a.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Nate Cole, and Tom Beerbower.
> 
> 
> Bugs: AMBARI-10348
>     https://issues.apache.org/jira/browse/AMBARI-10348
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Because alerts run on distributed hosts, we have a problem: if an alert is scheduled to
> run on a host that goes down, it is not obvious to the customer that alerts are not
> running and that something is wrong. We need to do 2 things:
> 
> 1. Generate an alert when the Ambari Agent has not been heard from for quite some time.
> 2. Fire an alert for alerts that haven't run for quite some time, stating that they
> haven't run because the host is not responding.
> 
> It seems 2 new alerts are required, both of them "Server Side" (which is a new concept).
> 
> - Ambari Server will need to maintain information about the last heartbeat from a host
> and produce alerts when a heartbeat has not been received. A new {{@AmbariService}} can
> handle this.
> 
> - Ambari Server will need to periodically check the last timestamp of all enabled alert
> instances and determine if the alert has not run within a certain period of time. A new
> {{@AmbariService}} can handle this.
> 
> We should use the alerts.json defined outside the stack, since this affects hosts
> and alert instances and is not bound to a cluster.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/alerts/AgentHeartbeatAlertRunnable.java PRE-CREATION
>   ambari-server/src/main/java/org/apache/ambari/server/alerts/StaleAlertRunnable.java PRE-CREATION
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertDefinitionService.java 506f911
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertGroupService.java a1f1ab4
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertHistoryService.java f1855f0
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertNoticeService.java 1922e2e
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertService.java a916c4c
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AlertTargetService.java 2a2ecdf
>   ambari-server/src/main/java/org/apache/ambari/server/api/services/AmbariMetaInfo.java e87cd57
>   ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertHostListener.java d478bf5
>   ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertMaintenanceModeListener.java c54baa2
>   ambari-server/src/main/java/org/apache/ambari/server/metadata/AgentAlertDefinitions.java af70a51
>   ambari-server/src/main/java/org/apache/ambari/server/metadata/AmbariServiceAlertDefinitions.java PRE-CREATION
>   ambari-server/src/main/java/org/apache/ambari/server/orm/dao/AlertsDAO.java fd63166
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/AlertDefinitionFactory.java 43fb450
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/AlertDefinitionHash.java c8b78a0
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/ServerSource.java PRE-CREATION
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/SourceType.java 49119d4
>   ambari-server/src/main/java/org/apache/ambari/server/state/services/AmbariServerAlertService.java PRE-CREATION
>   ambari-server/src/main/resources/alerts.json 753c29c
>   ambari-server/src/test/java/org/apache/ambari/server/alerts/AgentHeartbeatAlertRunnableTest.java PRE-CREATION
>   ambari-server/src/test/java/org/apache/ambari/server/alerts/StaleAlertRunnableTest.java PRE-CREATION
>   ambari-server/src/test/java/org/apache/ambari/server/api/services/AmbariMetaInfoTest.java a9eff8c
>   ambari-server/src/test/java/org/apache/ambari/server/events/MockEventListener.java e9261fa
>   ambari-server/src/test/java/org/apache/ambari/server/metadata/AgentAlertDefinitionsTest.java 22a9830
>   ambari-web/app/controllers/main/alerts/definition_configs_controller.js 82844ae
>   ambari-web/app/mappers/alert_definitions_mapper.js c4679c1
> 
> Diff: https://reviews.apache.org/r/32813/diff/
> 
> 
> Testing
> -------
> 
> New tests written to cover new Runnables.
> 
> mvn clean test
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>
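
As a rough illustration of the two server-side checks proposed in the quoted description (a lost agent heartbeat and stale alert instances), the sketch below shows one way the comparisons could look. It is not taken from the patch: `ServerSideAlertSketch`, `AlertInstance`, both method names, the heartbeat timeout, and the rule "stale means not run within a multiple of the alert's interval" are all assumptions. Only the two result strings are copied from the sample text earlier in this message.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: hypothetical names, not the classes added by the patch. */
final class ServerSideAlertSketch {

  /** Minimal stand-in for an enabled alert instance (assumed fields). */
  static final class AlertInstance {
    final String label;
    final String host;
    final long lastRunMillis;
    final long intervalMillis;

    AlertInstance(String label, String host, long lastRunMillis, long intervalMillis) {
      this.label = label;
      this.host = host;
      this.lastRunMillis = lastRunMillis;
      this.intervalMillis = intervalMillis;
    }
  }

  /** Returns hosts whose last heartbeat is older than the (assumed) timeout. */
  static List<String> hostsWithLostHeartbeat(
      Map<String, Long> lastHeartbeatMillis, long nowMillis, long timeoutMillis) {
    List<String> lost = new ArrayList<>();
    for (Map.Entry<String, Long> entry : lastHeartbeatMillis.entrySet()) {
      if (nowMillis - entry.getValue() > timeoutMillis) {
        lost.add(entry.getKey());
      }
    }
    return lost;
  }

  /**
   * Builds the stale-alert text. An instance counts as stale when it has not
   * run within intervalMillis * multiplier (the multiplier is an assumption).
   */
  static String staleAlertText(List<AlertInstance> enabled, long nowMillis, int multiplier) {
    List<String> staleLabels = new ArrayList<>();
    Set<String> staleHosts = new LinkedHashSet<>();
    for (AlertInstance alert : enabled) {
      if (nowMillis - alert.lastRunMillis > alert.intervalMillis * multiplier) {
        staleLabels.add(alert.label);
        staleHosts.add(alert.host);
      }
    }
    if (staleLabels.isEmpty()) {
      return "All alerts have run within their time intervals.";
    }
    return "There are " + staleLabels.size() + " stale alerts from "
        + staleHosts.size() + " host(s): " + String.join(", ", staleLabels);
  }
}
```

In a real service both checks would presumably run on a schedule inside the server; the sketch keeps them as pure functions of the current time so they are easy to unit test.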

