hadoop-yarn-issues mailing list archives

From "Shane Kumpf (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-5366) Improve handling of the Docker container life cycle
Date Tue, 26 Dec 2017 19:26:02 GMT

     [ https://issues.apache.org/jira/browse/YARN-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shane Kumpf updated YARN-5366:
------------------------------
    Attachment: YARN-5366.010.patch

While working on YARN-6305, I found that the required changes were minimal and that implementing
them in a separate patch would only cause merge conflicts until this issue is committed. As a result,
I've added that handling to this latest patch.

> Improve handling of the Docker container life cycle
> ---------------------------------------------------
>
>                 Key: YARN-5366
>                 URL: https://issues.apache.org/jira/browse/YARN-5366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: yarn
>            Reporter: Shane Kumpf
>            Assignee: Shane Kumpf
>              Labels: oct16-medium
>         Attachments: YARN-5366.001.patch, YARN-5366.002.patch, YARN-5366.003.patch, YARN-5366.004.patch,
>                      YARN-5366.005.patch, YARN-5366.006.patch, YARN-5366.007.patch, YARN-5366.008.patch,
>                      YARN-5366.009.patch, YARN-5366.010.patch
>
>
> There are several paths that need to be improved with regard to the Docker container lifecycle when running Docker containers on YARN.
> 1) Provide the ability to keep a container on the NodeManager for a set period of time for debugging purposes.
> 2) Support sending signals to the process in the container to allow for triggering stack traces, heap dumps, etc.
> 3) Support Docker's live restore, which means moving away from the use of {{docker wait}}. (YARN-5818)
> 4) Improve the resiliency of liveness checks (kill -0) by adding retries.
> 5) Improve the resiliency of container removal by adding retries.
> 6) Only attempt to stop, kill, and remove containers if the current container state allows for it.
> 7) Better handling of short-lived containers when the container is stopped before the PID can be retrieved. (YARN-6305)
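
For illustration only, the retry idea behind items 4 and 5 can be sketched as below. This is not code from the attached patches. It is a minimal Python sketch assuming a POSIX host, where sending signal 0 with os.kill is the equivalent of `kill -0`. The names pid_is_alive and check_with_retries, and the default attempt count and delay, are hypothetical.

```python
import os
import time
from typing import Callable


def pid_is_alive(pid: int) -> bool:
    """Equivalent of `kill -0 <pid>` on POSIX: signal 0 tests for existence only."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process with this PID
    except PermissionError:
        return True  # the process exists but is owned by another user
    return True


def check_with_retries(probe: Callable[[], bool],
                       attempts: int = 3,
                       delay_s: float = 0.1) -> bool:
    """Return True as soon as one probe succeeds, False only if every attempt fails.

    The attempt count and delay are illustrative defaults, not values from the patch.
    """
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # brief pause before the next attempt
    return False


if __name__ == "__main__":
    # The current process is certainly alive, so the first probe succeeds.
    print(check_with_retries(lambda: pid_is_alive(os.getpid())))
```

The same wrapper could in principle be applied to container removal (item 5) by passing a probe that attempts the removal and reports whether it succeeded. A single failed probe is treated as inconclusive rather than as proof that the container is gone.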



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

