hadoop-common-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12441) Fix kill command execution under Ubuntu 12
Date Mon, 28 Sep 2015 04:51:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14910032#comment-14910032 ]

Wangda Tan commented on HADOOP-12441:

Some proposals for this issue:
Proposal#1. Use "bash -c" to execute the kill command so that it picks up the bash built-in kill.
One concern is whether any Linux distribution we need to support lacks bash. Since we already use
bash in ContainerLaunch, this shouldn't be a big problem. Another thing we may need to double-check:
some versions of the bash built-in kill could also violate the POSIX "--" recommendation, just like /bin/kill.

Proposal#2. Choose different kill arguments depending on the version of the OS/kill.

Proposal#3. Run a check command, similar to what we did for "setsid" (for example: "kill -0 --
-1"), and choose the kill arguments depending on its exit code.
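
To make Proposal#1 and Proposal#3 more concrete, here is a rough Java sketch. It is not the actual Hadoop code: the class KillArgsProbe and all of its method names are hypothetical. The probe reuses the "kill -0 -- -1" example from Proposal#3 and assumes that exit code 0 means the kill found on the PATH accepts "--"; any other exit code, or a failure to start the command, is treated as "not supported".

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of Hadoop.
final class KillArgsProbe {

  /** Runs a command, discards its output, and returns its exit code. */
  static int exitCodeOf(String... cmd) throws IOException, InterruptedException {
    Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
    try (InputStream in = p.getInputStream()) {
      byte[] buf = new byte[4096];
      while (in.read(buf) != -1) {
        // Drain the output so the child cannot block on a full pipe.
      }
    }
    return p.waitFor();
  }

  /** Proposal#3 probe (assumption: exit code 0 means "--" is accepted). */
  static boolean killAcceptsDoubleDash() {
    try {
      // Signal 0 only checks whether processes can be signalled.
      return exitCodeOf("kill", "-0", "--", "-1") == 0;
    } catch (IOException e) {
      return false; // kill could not be started at all
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Builds the kill argv; a negative pid addresses a process group. */
  static String[] killCommand(boolean acceptsDoubleDash, int signal, long pid) {
    if (acceptsDoubleDash) {
      return new String[] {"kill", "-" + signal, "--", Long.toString(pid)};
    }
    return new String[] {"kill", "-" + signal, Long.toString(pid)};
  }

  /** Proposal#1 variant: let bash run its built-in kill. */
  static String[] bashKillCommand(int signal, long pid) {
    return new String[] {"bash", "-c", "kill -" + signal + " -- " + pid};
  }

  public static void main(String[] args) {
    boolean ok = killAcceptsDoubleDash();
    System.out.println("kill accepts \"--\": " + ok);
    System.out.println(String.join(" ", killCommand(ok, 0, -12345L)));
    System.out.println(String.join(" ", bashKillCommand(0, -12345L)));
  }
}
```

In a real implementation the probe would presumably run once and its result be cached, much like the existing "setsid" check, rather than being repeated before every kill.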

Any suggestions?


> Fix kill command execution under Ubuntu 12
> ------------------------------------------
>                 Key: HADOOP-12441
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12441
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Priority: Critical
> After HADOOP-12317, execution of the kill command fails under Ubuntu 12. After the NM restarts, it cannot determine from a container's pid whether the process is alive, and it cannot kill the process correctly when the RM/AM tells the NM to kill a container.
> Logs from NM (customized logs):
> {code}
> 2015-09-25 21:58:59,348 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(431)) -  ================== check alive cmd:[[Ljava.lang.String;@496e442d]
> 2015-09-25 21:58:59,349 INFO  nodemanager.NMAuditLogger (NMAuditLogger.java:logSuccess(89)) - USER=hrt_qa       IP=    OPERATION=Stop Container Request        TARGET=ContainerManageImpl     RESULT=SUCCESS  APPID=application_1443218269460_0001    CONTAINERID=container_1443218269460_0001_01_000001
> 2015-09-25 21:58:59,363 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:containerIsAlive(438)) -  ===========================
> ExitCodeException exitCode=1: ERROR: garbage process ID "--".
> Usage:
>   kill pid ...              Send SIGTERM to every process listed.
>   kill signal pid ...       Send a signal to every process listed.
>   kill -s signal pid ...    Send a signal to every process listed.
>   kill -l                   List all signal names.
>   kill -L                   List all signal names in a nice table.
>   kill -l signal            Convert between signal numbers and names.
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:550)
>         at org.apache.hadoop.util.Shell.run(Shell.java:461)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:727)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.containerIsAlive(DefaultContainerExecutor.java:432)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.signalContainer(DefaultContainerExecutor.java:401)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:419)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:175)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:108)
>         at java.lang.Thread.run(Thread.java:745)
> {code}

This message was sent by Atlassian JIRA
