Subject: Re: Review Request 40448: Enable auto-start with alerting for AMS
From: "Jonathan Hurley"
To: "Sumit Mohanty", "Sid Wagle", "Jonathan Hurley"
Cc: "Ambari", "Dmytro Sen"
Date: Thu, 19 Nov 2015 19:30:26 -0000
Message-ID: <20151119193026.6715.2904@reviews.apache.org>
References: <20151118191007.6715.87228@reviews.apache.org>
In-Reply-To: <20151118191007.6715.87228@reviews.apache.org>
Reply-To: dev@ambari.apache.org
Mailing-List: contact dev-help@ambari.apache.org; run by ezmlm
X-ReviewBoard-URL: https://reviews.apache.org/
X-ReviewRequest-URL: https://reviews.apache.org/r/40448/
X-ReviewRequest-Repository: ambari
X-ReviewGroup: Ambari
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============2014392596293410362=="

--===============2014392596293410362==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


> On Nov. 18, 2015, 2:10 p.m., Jonathan Hurley wrote:
> > ambari-agent/src/main/python/ambari_agent/FileCache.py, lines 42-43
> >
> > I'm not so sure that these belong in the file cache - they are directories that contain information pushed to the agents, not data that the agents request.
>
> Dmytro Sen wrote:
>     It's convenient to have all the agent tmp directory paths defined in the same place, but I can revert it if you insist.

Oh, OK - if you're only putting them here to manage them, then I'm fine with that. I didn't want them included in any caching.


> On Nov. 18, 2015, 2:10 p.m., Jonathan Hurley wrote:
> > ambari-agent/src/main/python/ambari_agent/RecoveryManager.py, line 840
> >
> > Always use the temp directory configured in the agent, not hard coded to /tmp.
>
> Dmytro Sen wrote:
>     def main(argv=None):
>       cmd_mgr = RecoveryManager('/tmp')
>       pass
>
>     That's the main method and it isn't called by the agent. It's called only during deployment, when the agent configuration might not exist.

Thanks for explaining. In such cases, shouldn't we use `tempfile.gettempdir()` instead?


> On Nov. 18, 2015, 2:10 p.m., Jonathan Hurley wrote:
> > ambari-agent/src/main/python/ambari_agent/alerts/recovery_alert.py, line 82
> >
> > Should `warned_threshold_reached` be a WARNING here? Or is it truly CRITICAL?
>
> Dmytro Sen wrote:
>     warned_threshold_reached means that RecoveryManager won't make any more attempts to recover the component's state. That requires close attention from the user once the maximum recovery attempt count is reached.

Sounds good - was just making sure it wasn't a mistake.


> On Nov. 18, 2015, 2:10 p.m., Jonathan Hurley wrote:
> > ambari-server/src/main/java/org/apache/ambari/server/state/alert/RecoverySource.java, lines 28-30
> >
> > No need for the empty constructor.
>
> Dmytro Sen wrote:
>     I've added it intentionally as a best practice.

I won't nitpick you to death :)


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40448/#review107067
-----------------------------------------------------------


On Nov. 19, 2015, 10:26 a.m., Dmytro Sen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40448/
> -----------------------------------------------------------
>
> (Updated Nov. 19, 2015, 10:26 a.m.)
>
>
> Review request for Ambari, Jonathan Hurley, Sumit Mohanty, and Sid Wagle.
>
>
> Bugs: AMBARI-13954
>     https://issues.apache.org/jira/browse/AMBARI-13954
>
>
> Repository: ambari
>
>
> Description
> -------
>
> - In 2.1.3 we have the watchdog script that will shut down the API if HBase is unresponsive for some time.
> - We also have the ability to auto-start per service / component.
> - The two should work in conjunction for AMS.
> - The user needs to be alerted if restarts are too frequent.
>
> An alternative approach is for the watchdog to act as a monitor and be responsible for restarting HBase.
> This should still be an alert hook, but in that case the alert can be customized for AMS only.
>
> To turn on AMS auto-start, append the following to ambari.properties:
>
> recovery.type=AUTO_START
> recovery.enabled_components=METRICS_COLLECTOR
>
>
> Diffs
> -----
>
>   ambari-agent/src/main/python/ambari_agent/AlertSchedulerHandler.py d3aab87
>   ambari-agent/src/main/python/ambari_agent/Controller.py 520d78d
>   ambari-agent/src/main/python/ambari_agent/FileCache.py 4869e51
>   ambari-agent/src/main/python/ambari_agent/RecoveryManager.py cab81f5
>   ambari-agent/src/main/python/ambari_agent/alerts/recovery_alert.py PRE-CREATION
>   ambari-agent/src/test/python/ambari_agent/TestActionQueue.py df8278b
>   ambari-agent/src/test/python/ambari_agent/TestAlertSchedulerHandler.py a08e4bc
>   ambari-agent/src/test/python/ambari_agent/TestAlerts.py 1e6da64
>   ambari-agent/src/test/python/ambari_agent/TestHeartbeat.py 1f3609d
>   ambari-agent/src/test/python/ambari_agent/TestRecoveryManager.py e6115e3
>   ambari-server/conf/unix/ambari.properties 7f0a464
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/AlertDefinitionFactory.java 4bc25f8
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/RecoverySource.java PRE-CREATION
>   ambari-server/src/main/java/org/apache/ambari/server/state/alert/SourceType.java 6c1aa9a
>   ambari-server/src/main/resources/common-services/AMBARI_METRICS/0.1.0/alerts.json 319427d
>
> Diff: https://reviews.apache.org/r/40448/diff/
>
>
> Testing
> -------
>
> Unit tests passed
>
>
> Thanks,
>
> Dmytro Sen
>
>

--===============2014392596293410362==--
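
For reference, the `tempfile.gettempdir()` suggestion raised against RecoveryManager.py in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `resolve_tmp_dir` is a hypothetical helper that does not exist in the Ambari agent, and how the agent actually exposes its configured temp directory is not shown in this review.

```python
import os
import tempfile


def resolve_tmp_dir(configured_dir=None):
    """Pick a temp directory without hard-coding /tmp.

    Hypothetical helper for illustration only; it is not part of
    the Ambari agent code under review.
    """
    # Prefer the directory from the agent configuration when it is
    # available and actually exists on this host.
    if configured_dir and os.path.isdir(configured_dir):
        return configured_dir
    # Otherwise fall back to the platform default, which honours
    # TMPDIR/TEMP/TMP instead of assuming /tmp.
    return tempfile.gettempdir()
```

In a standalone entry point such as the `main` quoted in the thread, where the agent configuration might not exist, the call would simply be `resolve_tmp_dir()` with no argument.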