Date: Mon, 3 Nov 2014 18:58:34 +0000 (UTC)
From: "Billie Rinaldi (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-1922) Process group remains alive after container process is killed externally

     [ https://issues.apache.org/jira/browse/YARN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi updated YARN-1922:
---------------------------------

    Attachment: YARN-1922.6.patch

Attaching a new patch. Instead of using do/while (!completed.get()), this patch simply uses while(true), so that it always loops until the pid file appears or the maxKillWaitTime elapses. [~vinodkv], does this address your concerns?

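For readers following the thread, the loop shape described above might look roughly like the sketch below. This is not the code from YARN-1922.6.patch. The class name PidFileWait, the method waitForPid, and the pollIntervalMs parameter are hypothetical names chosen for illustration; only the while(true) structure and the two exit conditions (pid file appears, or maxKillWaitTime elapses) come from the comment.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical helper, not part of the NodeManager code base.
public class PidFileWait {

    /**
     * Polls for the pid file until it contains a pid or the wait time elapses.
     * The loop does not consult any "completed" flag, so the pid is looked up
     * even when the container process has already exited.
     *
     * @return the trimmed pid string, or null if the wait timed out
     */
    static String waitForPid(Path pidFile, long maxKillWaitTimeMs, long pollIntervalMs)
            throws IOException, InterruptedException {
        final long deadline = System.nanoTime() + maxKillWaitTimeMs * 1_000_000L;
        while (true) {
            try {
                String pid = new String(Files.readAllBytes(pidFile),
                        StandardCharsets.UTF_8).trim();
                if (!pid.isEmpty()) {
                    return pid; // pid file appeared and is non-empty
                }
            } catch (NoSuchFileException e) {
                // Pid file has not been written yet; keep polling.
            }
            if (System.nanoTime() - deadline >= 0) {
                return null; // maxKillWaitTime elapsed
            }
            Thread.sleep(pollIntervalMs);
        }
    }
}
```

A caller would treat a null return as "no pid available, nothing to signal" and otherwise pass the pid on to whatever sends the kill signal.
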
> Process group remains alive after container process is killed externally
> ------------------------------------------------------------------------
>
>                 Key: YARN-1922
>                 URL: https://issues.apache.org/jira/browse/YARN-1922
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>         Environment: CentOS 6.4
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>         Attachments: YARN-1922.1.patch, YARN-1922.2.patch, YARN-1922.3.patch, YARN-1922.4.patch, YARN-1922.5.patch, YARN-1922.6.patch
>
>
> If the main container process is killed externally, ContainerLaunch does not kill the rest of the process group. Before sending the event that results in the ContainerLaunch.containerCleanup method being called, ContainerLaunch sets the "completed" flag to true. Then when cleaning up, it doesn't try to read the pid file if the completed flag is true. If it read the pid file, it would proceed to send the container a kill signal. In the case of the DefaultContainerExecutor, this would kill the process group.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
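
The control flow in the issue description can be modeled with a toy class. The sketch below is an illustration only and is not the ContainerLaunch source: CleanupSketch, cleanupAsReported, cleanupAlwaysReadingPid, and the signalled list are hypothetical names, and the "kill signal" is merely recorded in a list instead of being sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of the reported behaviour; all names are hypothetical.
public class CleanupSketch {
    final AtomicBoolean completed = new AtomicBoolean(false);
    // Pids that would have received a kill signal.
    final List<String> signalled = new ArrayList<>();
    // Stand-in for the pid file contents; null means the file never appeared.
    private final String pidFromFile;

    CleanupSketch(String pidFromFile) {
        this.pidFromFile = pidFromFile;
    }

    /** As reported: the pid file is not read once "completed" is true. */
    void cleanupAsReported() {
        if (!completed.get() && pidFromFile != null) {
            signalled.add(pidFromFile);
        }
    }

    /** Alternative: look up the pid regardless of the "completed" flag. */
    void cleanupAlwaysReadingPid() {
        if (pidFromFile != null) {
            signalled.add(pidFromFile);
        }
    }
}
```

With completed set to true before cleanup, as happens when the main process is killed externally, cleanupAsReported records nothing, which corresponds to the rest of the process group being left alive. cleanupAlwaysReadingPid still records the pid.
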