Date: Fri, 3 Aug 2012 22:07:02 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Message-ID: <1682854336.12287.1344031622685.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <1486340662.12239.1344030782478.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (HADOOP-8650) /bin/hadoop-daemon.sh to add "-f " arg for forced shutdowns

    [ https://issues.apache.org/jira/browse/HADOOP-8650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428411#comment-13428411 ]

Steve Loughran commented on HADOOP-8650:
----------------------------------------

In HA environments, and in other situations, you may want to forcibly shut down a Hadoop service, even if it is hung.

Currently, hadoop-daemon.sh sends a normal SIGTERM signal, one that the process picks up and reacts to. If the process is completely hung, the signal may never be acted on, so the process stays up.

The only way to deal with this is to wait a while, find the pid and kill -9 it. This must be done by hand or in an external script. An external script is brittle to changes in HADOOP_PID_DIR values, and everyone who writes one has to code and test it themselves.

To replicate this:
# start a daemon: {{hadoop-daemon.sh start namenode}}
# issue a {{kill -STOP }} to its PID
# try to stop the daemon via the {{hadoop-daemon.sh stop namenode}} command
# observe that the NN process remains present

We could extend hadoop-daemon to support a "-f timeout" argument, which gives the process a time limit in which to terminate after the SIGTERM; if it is still running when the limit expires, a kill -9 is issued.

> /bin/hadoop-daemon.sh to add "-f " arg for forced shutdowns
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-8650
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8650
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.3, 2.2.0-alpha
>            Reporter: Steve Loughran
>
> Add a timeout for the daemon script to trigger a kill -9 if the clean shutdown fails.