hadoop-yarn-issues mailing list archives

From "Naganarasimha G R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3784) Indicate preemption timout along with the list of containers to AM (preemption message)
Date Thu, 19 Nov 2015 11:07:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013343#comment-15013343 ]

Naganarasimha G R commented on YARN-3784:
-----------------------------------------

Hi [~sunilg], one thought: IIUC the default value is 15 seconds, but I am not sure the same
value will be used in a real cluster, or what heartbeat interval apps will be configured with.
Suppose {{WAIT_TIME_BEFORE_KILL}} is very low and the {{AM heartbeat interval}} is high: by the
time the AM is informed of the preemption timeout, it may no longer be able to act on it
effectively. How about supporting a long timestamp that indicates when the containers will be
preempted? Thoughts?
Sorry to pitch in at the last moment!
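
To make the concern concrete, here is a minimal sketch in plain Java. It is not taken from the
YARN-3784 patch: the class and method names are hypothetical, the 10-second heartbeat delay is
an assumed value, and the absolute-deadline variant assumes the RM and AM clocks are roughly in
sync. It compares what the AM can infer from a relative timeout with what it can infer from a
timestamp.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch; none of these names come from the YARN-3784 patch. */
final class PreemptionDeadlineSketch {

    /**
     * Relative timeout: the AM only sees the duration. It cannot tell how much of it
     * already elapsed before the heartbeat delivered the message, so it ends up
     * assuming the full duration is still available.
     */
    static Duration remainingFromRelative(Duration waitBeforeKill) {
        return waitBeforeKill;
    }

    /**
     * Absolute deadline: the AM compares the timestamp with its own clock.
     * Assumes the RM and AM clocks are roughly synchronized.
     */
    static Duration remainingFromDeadline(Instant killDeadline, Instant amNow) {
        Duration left = Duration.between(amNow, killDeadline);
        return left.isNegative() ? Duration.ZERO : left;
    }

    public static void main(String[] args) {
        Instant markedAt = Instant.parse("2015-11-19T11:00:00Z"); // RM marks the containers
        Duration waitBeforeKill = Duration.ofSeconds(15);         // default mentioned above
        Duration heartbeatDelay = Duration.ofSeconds(10);         // assumed AM heartbeat delay

        Instant killDeadline = markedAt.plus(waitBeforeKill);
        Instant amNow = markedAt.plus(heartbeatDelay);            // when the AM learns of it

        System.out.println("relative view: "
                + remainingFromRelative(waitBeforeKill).getSeconds() + "s");
        System.out.println("deadline view: "
                + remainingFromDeadline(killDeadline, amNow).getSeconds() + "s");
    }
}
```

With these assumed numbers the AM believes it has 15 seconds in the relative case but only 5
seconds actually remain, which is the gap a timestamp would expose.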

> Indicate preemption timout along with the list of containers to AM (preemption message)
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch, 0002-YARN-3784.patch, 0003-YARN-3784.patch, 0004-YARN-3784.patch
>
>
> Currently, during preemption, the AM is notified with a list of containers that are marked
> for preemption. This sub-task introduces a timeout duration along with that container list,
> so that the AM knows how much time it has to gracefully shut down its containers (assuming
> a preemption policy is loaded in the AM).
> This will also help in NM decommissioning scenarios, where an NM is decommissioned after
> a timeout (which also kills the containers on it). The timeout tells the AM that the RM can
> forcefully kill those containers once the timeout expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
