hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Varun Vasudev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1922) Process group remains alive after container process is killed externally
Date Sat, 01 Nov 2014 19:10:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193368#comment-14193368
] 

Varun Vasudev commented on YARN-1922:
-------------------------------------

Thanks for the patch Billie. Some comments -
{noformat}
+    if (Shell.WINDOWS) {
+      return;
+    }
{noformat}
Is it possible to change this to -
{noformat}
+ Assume.assumeTrue(Shell.isSetsidAvailable);
{noformat}
As I understand it, the functionality to kill child processes on non-Windows platforms requires
the setsid command.

Also, you check if the parent process is killed here - 
{noformat}
+    Assert.assertFalse("Process is still alive!",
+        DefaultContainerExecutor.containerIsAlive(pid));
{noformat}
Shouldn't we also check if the child process has been killed? On my Mac for example(after
applying your patch) -
{noformat}
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ setsid
-bash: setsid: command not found
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ mvn clean test -Dtest=TestContainerLaunch#testKillProcessGroup
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.215 sec - in org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.908s
[INFO] Finished at: Sun Nov 02 00:37:18 IST 2014
[INFO] Final Memory: 25M/306M
[INFO] ------------------------------------------------------------------------
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ ps aux | grep bash | grep testscript
vvasudev        37129   0.0  0.0  2436436    536 s002  S    12:37AM   0:00.01 /bin/bash /Users/vvasudev/Hortonworks/apache/hadoop/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/TestContainerLaunch-tmpDir/testscript.sh
{noformat}

> Process group remains alive after container process is killed externally
> ------------------------------------------------------------------------
>
>                 Key: YARN-1922
>                 URL: https://issues.apache.org/jira/browse/YARN-1922
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>         Environment: CentOS 6.4
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>         Attachments: YARN-1922.1.patch, YARN-1922.2.patch, YARN-1922.3.patch, YARN-1922.4.patch
>
>
> If the main container process is killed externally, ContainerLaunch does not kill the
rest of the process group.  Before sending the event that results in the ContainerLaunch.containerCleanup
method being called, ContainerLaunch sets the "completed" flag to true.  Then when cleaning
up, it doesn't try to read the pid file if the completed flag is true.  If it read the pid
file, it would proceed to send the container a kill signal.  In the case of the DefaultContainerExecutor,
this would kill the process group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message