ambari-dev mailing list archives

From Alejandro Fernandez <afernan...@hortonworks.com>
Subject Re: Review Request 44397: New Alerts Do Not Honor Existing Maintenance Mode Setting
Date Fri, 04 Mar 2016 19:24:31 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44397/#review122136
-----------------------------------------------------------


Ship it!

- Alejandro Fernandez


On March 4, 2016, 6:27 p.m., Jonathan Hurley wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/44397/
> -----------------------------------------------------------
> 
> (Updated March 4, 2016, 6:27 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez, Dmitro Lisnichenko, Jayush Luniya, and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-15303
>     https://issues.apache.org/jira/browse/AMBARI-15303
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Alerts are "suppressed" under maintenance mode (MM) by carrying a {{maintenance_state}} attribute in addition to the actual state being reported:
> 
> {code}
>       "Alert": {
>         "cluster_name": "c1",
>         "component_name": "METRICS_COLLECTOR",
>         "definition_id": 43,
>         "definition_name": "ams_metrics_collector_process",
>         "host_name": "c6401.ambari.apache.org",
>         "id": 28,
>         "instance": null,
>         "label": "Metrics Collector Process",
>         "latest_timestamp": 1457108946118,
>         "maintenance_state": "ON",
>         "original_timestamp": 1457108646099,
>         "scope": "ANY",
>         "service_name": "AMBARI_METRICS",
>         "state": "CRITICAL",
>         "text": "Connection failed: [Errno 111] Connection refused to c6401.ambari.apache.org"
>       }
> {code}
> 
> When a host/service/component is placed into MM, the database is updated so that all affected {{alert_current}} rows have their MM updated as well.
> 
> However, this fails under two scenarios:
> - The alert hasn't been received yet in a brand new cluster
> - The alert definition was disabled, which removed all current alerts. Then, it was re-enabled.
> 
> In both cases, when constructing a new {{AlertCurrentEntity}}, we need to calculate the correct maintenance state.
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/MaintenanceStateHelper.java cd49e76
>   ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java 9bbfe37
>   ambari-server/src/test/java/org/apache/ambari/server/controller/MaintenanceStateHelperTest.java d9c5039
>   ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertReceivedListenerTest.java 6e58876
> 
> Diff: https://reviews.apache.org/r/44397/diff/
> 
> 
> Testing
> -------
> 
> PENDING: Writing UTs and running tests now... 
> 
> Verified fix in an existing cluster by disabling alerts, then re-enabling them on a MM component with an active alert.
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>
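The idea in the quoted description, computing the maintenance state at the moment a new current-alert row is built rather than relying on a later bulk update, can be sketched roughly as follows. This is a minimal illustration only: `MaintenanceSketch`, `CurrentAlert`, `effectiveState`, and `newCurrentAlert` are hypothetical names, not Ambari's actual classes or the code in this review, and the rule that host, service, or component MM each suppress the alert is an assumption made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model; NOT Ambari's MaintenanceStateHelper or AlertCurrentEntity.
class MaintenanceSketch {
    enum MaintenanceState { OFF, ON }

    // Stand-in for a freshly created "current alert" row.
    static final class CurrentAlert {
        final String definitionName;
        final MaintenanceState maintenanceState;

        CurrentAlert(String definitionName, MaintenanceState maintenanceState) {
            this.definitionName = definitionName;
            this.maintenanceState = maintenanceState;
        }
    }

    // What is currently in maintenance mode (assumed bookkeeping for this sketch).
    final Set<String> hostsInMaintenance = new HashSet<>();
    final Set<String> servicesInMaintenance = new HashSet<>();
    final Set<String> componentsInMaintenance = new HashSet<>(); // keyed as "host/component"

    // Assumption: an alert is suppressed if its host, service, or host component is in MM.
    MaintenanceState effectiveState(String host, String service, String component) {
        boolean suppressed = hostsInMaintenance.contains(host)
                || servicesInMaintenance.contains(service)
                || componentsInMaintenance.contains(host + "/" + component);
        return suppressed ? MaintenanceState.ON : MaintenanceState.OFF;
    }

    // A new row gets its maintenance state at construction time, so alerts that
    // arrive after MM was enabled (new cluster, re-enabled definition) still honor it.
    CurrentAlert newCurrentAlert(String definitionName, String host, String service, String component) {
        return new CurrentAlert(definitionName, effectiveState(host, service, component));
    }

    public static void main(String[] args) {
        MaintenanceSketch cluster = new MaintenanceSketch();
        cluster.servicesInMaintenance.add("AMBARI_METRICS");

        CurrentAlert alert = cluster.newCurrentAlert(
                "ams_metrics_collector_process",
                "c6401.ambari.apache.org",
                "AMBARI_METRICS",
                "METRICS_COLLECTOR");
        System.out.println(alert.definitionName + " maintenance_state=" + alert.maintenanceState);
    }
}
```

In this sketch the first alert received for a definition is already tagged correctly, which is the behavior both failing scenarios lack; how the real patch resolves host, service, and component state may differ.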

