hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-3784) Indicate preemption timout along with the list of containers to AM (preemption message)
Date Wed, 17 Jun 2015 13:58:01 GMT

     [ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-3784:
    Attachment: 0001-YARN-3784.patch

Uploading an initial version.

Under the existing preemption framework, the AM fetches the containers that are to be preempted
during each allocate call. Along with this, the AM can also fetch a proposed time duration
after which the RM will forcefully kill those containers.
This patch does not set the preemption timeout per container; it sets it per
application. So all containers marked for preemption within one heartbeat interval share a
common timeout. If two types of containers are marked for preemption with different timeouts,
the lowest one is used.
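
The "lowest one wins" rule above can be sketched as follows. This is only an illustration under assumptions, not code from the attached patch: the class name, the method name, and the use of string container IDs with millisecond timeouts are all hypothetical.

```java
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper for illustration; not part of 0001-YARN-3784.patch.
final class PreemptionTimeoutSketch {

    // Collapses per-container timeouts (milliseconds, assumed unit) into the
    // single per-application timeout by taking the lowest value.
    // Returns an empty OptionalLong when no container is marked for preemption.
    static OptionalLong applicationTimeout(Map<String, Long> timeoutByContainerId) {
        return timeoutByContainerId.values().stream()
                .mapToLong(Long::longValue)
                .min();
    }
}
```

With two containers marked in the same heartbeat, one with 30000 ms and one with 15000 ms, this sketch reports 15000 ms for the whole application.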

If we provide the timeout per container, we need to change the interface from a list of
containers to a map <containerId, timeOut>. Please share your thoughts on this point.
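
To make the trade-off concrete, here is a rough sketch of the two message shapes being compared. The types below are hypothetical stand-ins written for this illustration (string container IDs, millisecond timeouts); they are not the YARN protocol records and do not reflect the actual patch.

```java
import java.util.List;
import java.util.Map;

// Hypothetical message shapes for illustration; these are not YARN classes.
final class PreemptionMessageShapes {

    // Per-application shape: a list of containers plus one shared timeout.
    static final class PerApplication {
        final List<String> containerIds;
        final long timeoutMillis;

        PerApplication(List<String> containerIds, long timeoutMillis) {
            this.containerIds = containerIds;
            this.timeoutMillis = timeoutMillis;
        }

        // Every marked container gets the same timeout.
        long timeoutFor(String containerId) {
            if (!containerIds.contains(containerId)) {
                throw new IllegalArgumentException("not marked for preemption: " + containerId);
            }
            return timeoutMillis;
        }
    }

    // Per-container shape: the list becomes a map <containerId, timeOut>.
    static final class PerContainer {
        final Map<String, Long> timeoutByContainerId;

        PerContainer(Map<String, Long> timeoutByContainerId) {
            this.timeoutByContainerId = timeoutByContainerId;
        }

        // Each marked container carries its own timeout.
        long timeoutFor(String containerId) {
            Long timeout = timeoutByContainerId.get(containerId);
            if (timeout == null) {
                throw new IllegalArgumentException("not marked for preemption: " + containerId);
            }
            return timeout;
        }
    }
}
```

The per-container shape gives the AM more precise information, but it changes what the AM receives from a list to a map, which is the interface change in question.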

> Indicate preemption timout along with the list of containers to AM (preemption message)
> ---------------------------------------------------------------------------------------
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
> Currently during preemption, the AM is notified with a list of containers which are marked
> for preemption. Introduce a timeout duration along with this container list so that the
> AM knows how much time it has to gracefully shut down its containers (assuming
> one of the preemption policies is loaded in the AM).
> This will also help in NM decommissioning scenarios, where the NM is decommissioned after
> a timeout (also killing the containers on it). The timeout indicates to the AM that
> those containers can be forcefully killed by the RM after the timeout.

