ambari-user mailing list archives

From Jonathan Hurley
Subject Re: Ambari Server Alerts
Date Mon, 26 Oct 2015 12:11:51 GMT
This alert fires when there are alerts which haven’t reported in within 2x their interval
value. The most common reason that this alert would misfire is that the scheduler on the agents
isn’t able to run the scheduled alert jobs.
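To make the staleness rule concrete, here is a minimal sketch of the check described above. It is not Ambari's actual implementation: the function name `is_stale`, its parameters, and the sample timestamps are all hypothetical, and only the "2x the interval" threshold comes from the explanation above.

```python
from datetime import datetime, timedelta


def is_stale(last_reported, interval, now, multiplier=2):
    """Return True if an alert has not reported within multiplier x its interval.

    Hypothetical helper for illustration; only the 2x threshold is taken
    from the description of the stale-alerts check.
    """
    return now - last_reported > interval * multiplier


# Illustrative values only: an alert that is expected to report once a minute.
now = datetime(2015, 10, 26, 12, 0, 0)
interval = timedelta(minutes=1)

# Reported 90 seconds ago: still inside the 2x window, so not stale.
print(is_stale(now - timedelta(seconds=90), interval, now))

# Reported 5 minutes ago: outside the 2x window, so it counts as stale.
print(is_stale(now - timedelta(minutes=5), interval, now))
```

If the agent's scheduler skips a run or two, the last report time stops advancing and the alert eventually crosses this threshold, even though the underlying service is healthy.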

We’ve been seeing a problem where the scheduler may occasionally miss the window and the
alert won’t run. This is being fixed for Ambari 2.1.3. In the meantime, you can change
these lines of code on the agents where the problem is occurring to:

      'apscheduler.threadpool.core_threads': 3,
      'apscheduler.coalesce': True,
      'apscheduler.standalone': False,
      'apscheduler.misfire_grace_time': 5

After restarting the agents, the misfire grace time should be higher and allow for the alert
job to run, even if it misses its window.
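As a rough illustration of why a larger grace time helps, the sketch below models the decision a scheduler makes when it gets to a job late. It is a simplified stand-in, not APScheduler's or the agent's code: `job_may_run` and the timestamps are hypothetical, and the 1-second value is only an assumed smaller setting used for comparison with the 5 seconds shown above.

```python
from datetime import datetime, timedelta


def job_may_run(scheduled_at, now, misfire_grace_time):
    """Return True if a late job is still allowed to run.

    Simplified model: the job runs as long as the scheduler reaches it
    no more than misfire_grace_time seconds after its scheduled time.
    """
    lateness = now - scheduled_at
    return lateness <= timedelta(seconds=misfire_grace_time)


# Illustrative values only: the scheduler reaches the job 3 seconds late.
scheduled_at = datetime(2015, 10, 26, 12, 0, 0)
picked_up_at = scheduled_at + timedelta(seconds=3)

# With an assumed smaller grace time of 1 second, the run is skipped.
print(job_may_run(scheduled_at, picked_up_at, misfire_grace_time=1))

# With the 5-second grace time from the settings above, the run goes ahead.
print(job_may_run(scheduled_at, picked_up_at, misfire_grace_time=5))
```

A skipped run means no report for that interval, which is what eventually shows up on the server side as a stale alert.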

On Oct 26, 2015, at 6:31 AM, Vijaya Narayana Reddy Bhoomi Reddy wrote:


In my cluster, I often see “Ambari Server Alerts” with the message “There are 7 stale
alerts from 1 host(s). Resource ManagerRPC Latency, Zookeeper etc”

Can anyone please throw light on the root cause of this alert? I am not able to trace the
correct cause for this. It occurs occasionally and then disappears after a while. However,
it comes with a CRITICAL flag. Hence I wanted to understand the root cause behind it as
well as the severity.

