Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Reply-To: mapreduce-issues@hadoop.apache.org
Date: Sun, 23 Oct 2011 02:19:32 +0000 (UTC)
From: "Hitesh Shah (Updated) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <82306664.6132.1319336372185.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1238250715.1192.1319207912891.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (MAPREDUCE-3240) NM should send a SIGKILL for completed containers also

[
https://issues.apache.org/jira/browse/MAPREDUCE-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated MAPREDUCE-3240:
-----------------------------------

    Attachment: MR-3240.wip.patch

The patch does the following:
  - Sends a SIGTERM followed by a SIGKILL when cleaning up a container.
  - Introduces new config settings for the delay between the SIGTERM and the SIGKILL.
  - Introduces activeContainers within the ContainerExecutor. The launcher uses it to record whether a container should be launched. If cleanup is called before the process starts, this flag ensures that the process is never started. This addresses the race-kill issue in MR-3084.
  - Getting the pid after the shell executor has completed is unreliable, so task.sh now writes the pid into a local file, which the containerlauncher can read and use to kill the process.

> NM should send a SIGKILL for completed containers also
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-3240
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3240
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Hitesh Shah
>         Attachments: MR-3240.wip.patch
>
>
> This is to address containers that exit properly after spawning sub-processes themselves. We don't want to leave these sub-process trees behind, or else they can pillage the NM's resources.
> Today, we already have code to send SIGKILL to whole process trees (possible because setsid gives each tree a single session ID) when the container is alive. We need to obtain the PID of each container when it starts and use that PID to send the signal in the completed-container case as well.

--
This message is automatically generated by JIRA.
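For readers following the patch notes above, here is a minimal sketch of the three mechanisms they describe: a per-container "may launch" flag, a pid written to a local file by a wrapper shell, and a SIGTERM followed by a SIGKILL after a configurable delay. It is not the attached patch, which is Hadoop Java code and is not reproduced in this message. The class name ContainerCleanupSketch, its methods, and the kill_delay_s parameter are all hypothetical, and the sketch assumes a POSIX system with sh available.

```python
import os
import signal
import subprocess
import threading
import time


class ContainerCleanupSketch:
    """Illustrative only; names and structure are assumptions, not Hadoop code."""

    def __init__(self, kill_delay_s=0.25):
        self.kill_delay_s = kill_delay_s   # delay between SIGTERM and SIGKILL
        self._lock = threading.Lock()
        self._active = {}                  # container id -> allowed to launch?
        self._launched = set()             # container ids that were started

    def register(self, cid):
        with self._lock:
            self._active[cid] = True

    def launch(self, cid, argv, pid_file):
        """Start argv in its own session unless cleanup already ran."""
        with self._lock:
            if not self._active.get(cid, False):
                return None                # cleaned up first: never start
            # The wrapper shell records its own pid, then execs the command,
            # so the recorded pid is also the session and process-group id.
            proc = subprocess.Popen(
                ["sh", "-c", 'echo $$ > "$0"; exec "$@"', pid_file, *argv],
                start_new_session=True,
            )
            self._launched.add(cid)
            return proc

    def cleanup(self, cid, pid_file, wait_s=5.0):
        """Mark the container inactive, then SIGTERM and SIGKILL its group."""
        with self._lock:
            self._active[cid] = False
            if cid not in self._launched:
                return []
        pgid = self._read_pid(pid_file, wait_s)
        sent = []
        steps = ((signal.SIGTERM, self.kill_delay_s), (signal.SIGKILL, 0))
        for sig, pause in steps:
            try:
                os.killpg(pgid, sig)       # reaches every process in the group
                sent.append(sig)
            except ProcessLookupError:
                break                      # nothing left in the group
            time.sleep(pause)
        return sent

    @staticmethod
    def _read_pid(pid_file, wait_s):
        """Poll until the wrapper shell has written a complete pid line."""
        deadline = time.monotonic() + wait_s
        while True:
            try:
                with open(pid_file) as f:
                    text = f.read()
                if text.endswith("\n"):
                    return int(text)
            except FileNotFoundError:
                pass
            if time.monotonic() > deadline:
                raise TimeoutError(f"no pid in {pid_file}")
            time.sleep(0.01)
```

Because the signal is addressed to the process group rather than to a single pid, it still reaches sub-processes after the container's own process has exited, which is the completed-container case the issue describes. Reading the pid from a file rather than from the launcher's process handle mirrors the patch note that the pid is unreliable once the shell executor has completed.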