hadoop-yarn-issues mailing list archives

From "Abin Shahab (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3080) The DockerContainerExecutor could not write the right pid to container pidFile
Date Sat, 28 Feb 2015 20:10:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341761#comment-14341761
] 

Abin Shahab commented on YARN-3080:
-----------------------------------

[~chenchun], also, other threads in the NM may depend on the actual PID of the launched JVM,
and I do not want to preclude any future processes from depending on it. The pidfile is
supposed to be populated right after the container is launched, not after the process has
finished. The DockerContainerExecutor will not alter these fundamental APIs or the expectations
the NM has of a container executor. Therefore, getting the actual PID from docker inspect
is the most accurate and recommended way.
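
A minimal sketch of that ordering, assuming the container has already been started under a known name: poll docker inspect until it reports a nonzero PID, then publish the value through a temp file and a rename. The function name write_container_pid, the DOCKER override variable, and the retry count are hypothetical and are not taken from the YARN-3080 patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the YARN-3080 patch.
# DOCKER may be overridden (e.g. for testing); the default is the path used in the report.
DOCKER="${DOCKER:-/usr/bin/docker}"

# write_container_pid NAME PIDFILE [TRIES]
# Polls docker inspect until the container reports a nonzero PID, then
# writes that PID to PIDFILE via a temp file and a rename.
write_container_pid() {
  local name="$1" pidfile="$2" tries="${3:-10}" pid="" i=0
  while [ "$i" -lt "$tries" ]; do
    # Before the container exists, inspect fails; a container that is not
    # running reports PID 0. Both cases mean "not started yet".
    pid="$("$DOCKER" inspect --format '{{.State.Pid}}' "$name" 2>/dev/null)" || pid=""
    case "$pid" in
      ''|0|*[!0-9]*) ;;  # empty, zero, or not a number: keep polling
      *)
        echo "$pid" > "$pidfile.tmp" && mv -f "$pidfile.tmp" "$pidfile"
        return $?
        ;;
    esac
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

A launcher would call this only after issuing docker run (for example, with the container started in the background), so the pidfile is populated shortly after launch rather than before it or after the process exits.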


> The DockerContainerExecutor could not write the right pid to container pidFile
> ------------------------------------------------------------------------------
>
>                 Key: YARN-3080
>                 URL: https://issues.apache.org/jira/browse/YARN-3080
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Beckham007
>            Assignee: Abin Shahab
>         Attachments: YARN-3080.patch, YARN-3080.patch, YARN-3080.patch, YARN-3080.patch
>
>
> The docker_container_executor_session.sh is like this:
> {quote}
> #!/usr/bin/env bash
> echo `/usr/bin/docker inspect --format {{.State.Pid}} container_1421723685222_0008_01_000002` \
>   > /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp
> /bin/mv -f /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid.tmp \
>   /data/nm_restart/hadoop-2.4.1/data/yarn/local/nmPrivate/application_1421723685222_0008/container_1421723685222_0008_01_000002/container_1421723685222_0008_01_000002.pid
> /usr/bin/docker run --rm --name container_1421723685222_0008_01_000002 \
>   -e GAIA_HOST_IP=c162 -e GAIA_API_SERVER=10.6.207.226:8080 -e GAIA_CLUSTER_ID=shpc-nm_restart \
>   -e GAIA_QUEUE=root.tdwadmin -e GAIA_APP_NAME=test_nm_docker -e GAIA_INSTANCE_ID=1 \
>   -e GAIA_CONTAINER_ID=container_1421723685222_0008_01_000002 \
>   --memory=32M --cpu-shares=1024 \
>   -v /data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/container-logs/application_1421723685222_0008/container_1421723685222_0008_01_000002 \
>   -v /data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002:/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002 \
>   -P -e A=B --privileged=true docker.oa.com:8080/library/centos7 \
>   bash "/data/nm_restart/hadoop-2.4.1/data/yarn/local/usercache/tdwadmin/appcache/application_1421723685222_0008/container_1421723685222_0008_01_000002/launch_container.sh"
> {quote}
> The DockerContainerExecutor runs docker inspect before docker run, so docker inspect
> cannot get the right pid for the container, and signalContainer() and NM restart fail.
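
The failure mode described above can also be caught on the reading side. The following is a minimal sketch, assuming a hypothetical helper named pidfile_is_live that is not part of the NodeManager: a pidfile written before docker run would typically be empty or hold something other than a valid PID, so it is rejected before any signal is sent.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the NodeManager.
# pidfile_is_live PIDFILE
# Succeeds only if PIDFILE holds a positive integer naming a process that
# the caller is able to signal (kill -0 delivers no signal, it only checks).
pidfile_is_live() {
  local pidfile="$1" pid=""
  [ -r "$pidfile" ] || return 1
  # read fails on an empty file or a missing trailing newline; tolerate both.
  read -r pid < "$pidfile" || true
  case "$pid" in
    ''|0|*[!0-9]*) return 1 ;;  # empty, zero, or not a number
  esac
  kill -0 "$pid" 2>/dev/null
}
```

A caller could run pidfile_is_live on the container's .pid file before attempting to signal the process, and treat a failure as "PID not yet known" rather than signalling a wrong or nonexistent process.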



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
