hadoop-yarn-issues mailing list archives

From "Shane Kumpf (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7818) Remove privileged operation warnings during container launch for DefaultLinuxContainerRuntime
Date Thu, 03 May 2018 22:27:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463145#comment-16463145 ]

Shane Kumpf commented on YARN-7818:
-----------------------------------

Thanks for the review, [~billie.rinaldi]. I agree that these should be consistent. I've attached
a new patch to address that change.

> Remove privileged operation warnings during container launch for DefaultLinuxContainerRuntime
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-7818
>                 URL: https://issues.apache.org/jira/browse/YARN-7818
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Shane Kumpf
>            Priority: Major
>         Attachments: YARN-7818.001.patch, YARN-7818.002.patch
>
>
> Steps:
>  1) Run the distributed shell (Dshell) application:
> {code:java}
> yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/hdp/3.0.0.0-751/hadoop-yarn/hadoop-yarn-applications-distributedshell-*.jar -keep_containers_across_application_attempts -timeout 900000 -shell_command "sleep 110" -num_containers 4{code}
>  2) Find the host where the AM is running.
>  3) Find the containers launched by the application.
>  4) Restart the NM on the host where the AM is running.
>  5) Validate that a new attempt is not started and that the containers launched before the restart are in RUNNING state.
> In this test, step 5 fails because the containers fail to launch with exit code 143:
> {code:java}
> 2018-01-24 09:48:30,547 INFO  container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1516787230461_0001_01_000003 transitioned from RUNNING to KILLING
> 2018-01-24 09:48:30,547 INFO  launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(668)) - Cleaning up container container_e04_1516787230461_0001_01_000003
> 2018-01-24 09:48:30,552 WARN  privileged.PrivilegedOperationExecutor (PrivilegedOperationExecutor.java:executePrivilegedOperation(174)) - Shell execution returned exit code: 143. Privileged Execution Operation Stderr:
> Stdout: main : command provided 1
> main : run as user is hrt_qa
> main : requested yarn user is hrt_qa
> Getting exit code file...
> Creating script paths...
> Writing pid file...
> Writing to tmp file /grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.pid.tmp
> Writing to cgroup task files...
> Creating local dirs...
> Launching container...
> Getting exit code file...
> Creating script paths...
> Full command array for failed execution:
> [/usr/hdp/3.0.0.0-751/hadoop-yarn/bin/container-executor, hrt_qa, hrt_qa, 1, application_1516787230461_0001,
container_e04_1516787230461_0001_01_000003, /grid/0/hadoop/yarn/local/usercache/hrt_qa/appcache/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003,
/grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/launch_container.sh,
/grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.tokens,
/grid/0/hadoop/yarn/local/nmPrivate/application_1516787230461_0001/container_e04_1516787230461_0001_01_000003/container_e04_1516787230461_0001_01_000003.pid,
/grid/0/hadoop/yarn/local, /grid/0/hadoop/yarn/log, cgroups=none]
> 2018-01-24 09:48:30,553 WARN  runtime.DefaultLinuxContainerRuntime (DefaultLinuxContainerRuntime.java:launchContainer(127)) - Launch container failed. Exception:
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=143:
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:180)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.launchContainer(DefaultLinuxContainerRuntime.java:124)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.launchContainer(DelegatingLinuxContainerRuntime.java:152)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:549)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.launchContainer(ContainerLaunch.java:465)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:285)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:95)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: ExitCodeException exitCode=143:
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:1009)
>         at org.apache.hadoop.util.Shell.run(Shell.java:902)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:152)
>         ... 10 more
> 2018-01-24 09:48:30,553 WARN  nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:launchContainer(557)) - Exit code from container container_e04_1516787230461_0001_01_000003 is : 143
> 2018-01-24 09:48:30,582 INFO  containermanager.ContainerManagerImpl (ContainerManagerImpl.java:stopContainerInternal(1365)) - Stopping container with container Id: container_e04_1516787230461_0001_01_000005
> 2018-01-24 09:48:31,093 INFO  impl.TimelineV2ClientImpl (TimelineV2ClientImpl.java:setTimelineCollectorInfo(172)) - Updated timeline service address to xxxxxx:40757
> 2018-01-24 09:48:32,675 INFO  container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1516787230461_0001_01_000003 transitioned from KILLING to CONTAINER_CLEANEDUP_AFTER_KILL{code}
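
For readers of the log above: by shell convention, exit code 143 means the process was terminated by signal 15 (SIGTERM), since 128 + 15 = 143. That matches the RUNNING to KILLING transition logged just before the warning. The following is a minimal sketch of that decoding, together with one possible way to pick a quieter log level for such exits. The class and method names (ExitCodeSketch, signalFromExitCode, logLevelFor) and the level policy are hypothetical illustrations; they are not the code in the attached patches and not part of the YARN API.

```java
import java.util.OptionalInt;

// Hypothetical helper, for illustration only; not taken from the YARN-7818 patches.
class ExitCodeSketch {
    // Shell convention: a process terminated by signal N reports exit code 128 + N.
    static final int SIGNAL_BASE = 128;
    static final int SIGKILL = 9;
    static final int SIGTERM = 15;

    /** Returns the signal number implied by an exit code, or empty if none is implied. */
    static OptionalInt signalFromExitCode(int exitCode) {
        if (exitCode > SIGNAL_BASE && exitCode <= SIGNAL_BASE + 64) {
            return OptionalInt.of(exitCode - SIGNAL_BASE);
        }
        return OptionalInt.empty();
    }

    /**
     * Assumed policy: exits caused by SIGTERM or SIGKILL are treated as expected
     * (the container was asked to stop) and logged at INFO; other non-zero exits
     * stay at WARN.
     */
    static String logLevelFor(int exitCode) {
        if (exitCode == 0) {
            return "DEBUG";
        }
        OptionalInt signal = signalFromExitCode(exitCode);
        boolean killedOnRequest = signal.isPresent()
            && (signal.getAsInt() == SIGTERM || signal.getAsInt() == SIGKILL);
        return killedOnRequest ? "INFO" : "WARN";
    }

    public static void main(String[] args) {
        int exitCode = 143; // value reported in the log above
        System.out.println("signal=" + signalFromExitCode(exitCode).getAsInt()
            + " level=" + logLevelFor(exitCode));
    }
}
```

Under this assumed policy, the exit code from the log decodes to signal 15 and would be logged at INFO rather than WARN, while an ordinary failure such as exit code 1 would still be logged at WARN.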



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

