hadoop-yarn-issues mailing list archives

From "Billie Rinaldi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1944) Application Container commands fail to stop when application is killed
Date Tue, 15 Apr 2014 18:29:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-1944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13969861#comment-13969861 ]

Billie Rinaldi commented on YARN-1944:
--------------------------------------

Having looked into YARN-1922 recently, I wouldn't have expected to see processes staying around
after containers were killed by YARN. What OS are you using? Does the "setsid" command exist?
I think YARN only kills the entire process group if setsid is available.
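
To illustrate the point about setsid, here is a minimal Java sketch of the general technique, not YARN's actual code. If the container's launch shell was started under setsid, it leads its own process group, so signalling the negative PID reaches the shell and everything it spawned (such as a ping). Without setsid, only the shell's own PID can be signalled safely, and its children may survive. The class ProcessGroupKill and its methods killCommand and isSetsidAvailable are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the YARN code base.
final class ProcessGroupKill {

    // Builds the argv for signalling a container's launch shell.
    // kill(1) treats a negative PID as a process group ID, so "-<pid>" reaches
    // every process in the group. "--" stops option parsing so the negative
    // PID is not mistaken for a signal flag.
    static List<String> killCommand(long pid, int signal, boolean setsidAvailable) {
        String target = setsidAvailable ? "-" + pid : Long.toString(pid);
        return Arrays.asList("kill", "-" + signal, "--", target);
    }

    // Probes for the setsid command by trying to run it (assumed probe;
    // any equivalent check would do).
    static boolean isSetsidAvailable() {
        try {
            Process p = new ProcessBuilder("setsid", "true")
                    .redirectErrorStream(true)
                    .start();
            return p.waitFor() == 0;
        } catch (IOException e) {
            return false; // setsid is not installed or not on the PATH
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        boolean setsid = isSetsidAvailable();
        System.out.println("setsid available: " + setsid);
        // 15 is SIGTERM; 12345 is a placeholder PID.
        System.out.println(String.join(" ", killCommand(12345L, 15, setsid)));
    }
}
```

On a system without setsid, the sketch falls back to signalling the single PID, which would match the symptom in this issue: the container's shell exits while the commands it started keep running.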

> Application Container commands fail to stop when application is killed
> ----------------------------------------------------------------------
>
>                 Key: YARN-1944
>                 URL: https://issues.apache.org/jira/browse/YARN-1944
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Oleg Zhurakousky
>
> When launching a YARN application with a command that never exits (e.g., ping google.com), the application container stops while the command(s) continue to run.
> For example:
> Command: ping google.com; 4 containers
> Submit app:
> {code}
> ApplicationId appId = this.yarnClient.submitApplication(appContext);
> {code}
> Kill app:
> {code}
> this.yarnClient.killApplication(appId);
> {code}
> Produces the following output:
> {code}
> 13:10:22,017 ERROR IPC Server handler 48 on 8035 resourcemanager.ApplicationMasterService:328 - Application doesn't exist in cache appattempt_1397581697363_0002_000001
> {code}
> Why is it telling me that it doesn't exist when I am using the same AppId that was returned by the YarnClient?
> Also, I can see that after the kill the actual application containers stopped:
> {code}
> 13:10:22,128  WARN ContainersLauncher #6 nodemanager.DefaultContainerExecutor:207 - Exit code from container container_1397581697363_0002_01_000002 is : 143
> 13:10:22,151  WARN ContainersLauncher #7 nodemanager.DefaultContainerExecutor:207 - Exit code from container container_1397581697363_0002_01_000003 is : 143
> 13:10:22,175  WARN ContainersLauncher #8 nodemanager.DefaultContainerExecutor:207 - Exit code from container container_1397581697363_0002_01_000004 is : 143
> 13:10:22,198  WARN ContainersLauncher #9 nodemanager.DefaultContainerExecutor:207 - Exit code from container container_1397581697363_0002_01_000005 is : 143
> {code}
> Meanwhile I have 4 pings running.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
