hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3784) Indicate preemption timeout along with the list of containers to AM (preemption message)
Date Tue, 30 Jun 2015 15:42:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608543#comment-14608543 ]

Sunil G commented on YARN-3784:
-------------------------------

Thank you [~chris.douglas] for the comments.
I will update the patch to correct these problems. Regarding the point below,
bq. If containers are preempted for multiple causes (e.g., over-capacity, NM decommission), then the time to preempt could vary widely
I had the same concern. Currently the preemption message looks like this:
{noformat}
 message PreemptionContractProto {
   repeated PreemptionResourceRequestProto resource = 1;
   repeated PreemptionContainerProto container = 2;
+  optional int64 timeout = 3;
 }
 message PreemptionContainerProto {
   optional ContainerIdProto id = 1;
 }
{noformat}

I have added {{timeout}} at the message level. I can try attaching it at the container level
as an optional parameter instead. One potential problem is that different preemption events
(ProportionalCPP, decommission, etc.) can reach the application at different times, and the
{{allocate}} call in ApplicationMasterService may arrive some seconds later to fetch the
"to be preempted" containers. Hence part of the timeout may already have elapsed for a few
containers. We can subtract the elapsed time before sending the timeout to the AM, but that
means storing the last update time per container. Will that overload the scheduler if many
containers are marked for preemption?
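
To make the "subtract and then send" idea concrete, here is a minimal sketch. It is not taken from the patch: {{PreemptionDeadlineTracker}} and its methods are hypothetical names, container ids are plain strings, and time is passed in as milliseconds. The assumption is that the extra state is one absolute deadline per marked container, from which the remaining time is computed when {{allocate}} is served.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper, not part of YARN: remembers when each container was
 * marked for preemption and reports how much of its timeout is left.
 */
class PreemptionDeadlineTracker {
  // containerId -> absolute deadline in milliseconds (mark time + timeout)
  private final Map<String, Long> deadlines = new ConcurrentHashMap<>();

  /** Record that a container was marked at nowMs with the given timeout. */
  void mark(String containerId, long nowMs, long timeoutMs) {
    // If several causes mark the same container, keep the earliest deadline.
    deadlines.merge(containerId, nowMs + timeoutMs, Long::min);
  }

  /** Remaining time per marked container at nowMs, never negative. */
  Map<String, Long> remaining(long nowMs) {
    Map<String, Long> out = new HashMap<>();
    for (Map.Entry<String, Long> e : deadlines.entrySet()) {
      out.put(e.getKey(), Math.max(0L, e.getValue() - nowMs));
    }
    return out;
  }

  /** Forget a container once it has been released or killed. */
  void clear(String containerId) {
    deadlines.remove(containerId);
  }
}
```

In this sketch the cost is one map entry per marked container plus one subtraction per entry on each {{allocate}}. Whether that is acceptable inside the scheduler is exactly the open question above.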

> Indicate preemption timeout along with the list of containers to AM (preemption message)
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
>
>
> Currently during preemption, the AM is notified with a list of containers that are marked
for preemption. The proposal is to send a timeout duration along with this container list so
that the AM knows how much time it has to gracefully shut down its containers (assuming
a preemption policy is loaded in the AM).
> This will help in NM decommissioning scenarios, where the NM is decommissioned after
a timeout (also killing the containers on it). The timeout tells the AM that
those containers can be forcefully killed by the RM once the timeout expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
