hadoop-yarn-issues mailing list archives

From "Yufei Gu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-6793) Duplicated reservation in Fair Scheduler preemption
Date Mon, 10 Jul 2017 17:47:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yufei Gu updated YARN-6793:
---------------------------
    Description: 
There is a delay between the time preemption is triggered and the time the containers are
actually killed. If the resources released on a node before the containers are killed are not
enough to satisfy the resource request that preemption is serving, a reservation is made again
on that node.
E.g. the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1. By default it takes
15s to kill the containers on node 1 to fulfill that resource request. If <memory 1024,
vcore 1> is released from node 1 before the kill, the scheduler reserves <memory 2048,
vcore 2> again on node 1 for app 1. The second reservation may never be unreserved.


  was:
There is a delay between the time preemption is triggered and the time the containers are
actually killed. If the resources released in that interval from nodes that are due to be
preempted are not enough for the resource request, a reservation is made again on that node.
E.g. the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1. By default it takes
15s to kill the containers on node 1 to fulfill that resource request. If <memory 1024,
vcore 1> is released from node 1 before the kill, the scheduler reserves <memory 2048,
vcore 2> again on node 1 for app 1. The second reservation may never be unreserved.



> Duplicated reservation in Fair Scheduler preemption 
> ----------------------------------------------------
>
>                 Key: YARN-6793
>                 URL: https://issues.apache.org/jira/browse/YARN-6793
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.8.1, 3.0.0-alpha3
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>            Priority: Critical
>
> There is a delay between the time preemption is triggered and the time the containers are
> actually killed. If the resources released on a node before the containers are killed are not
> enough to satisfy the resource request that preemption is serving, a reservation is made again
> on that node.
> E.g. the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1. By default it takes
> 15s to kill the containers on node 1 to fulfill that resource request. If <memory 1024,
> vcore 1> is released from node 1 before the kill, the scheduler reserves <memory 2048,
> vcore 2> again on node 1 for app 1. The second reservation may never be unreserved.
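
The scenario in this issue can be sketched as a toy model in Python. This is
not the Fair Scheduler's code: Resource, Node, try_assign and the dedupe flag
are hypothetical names introduced only for illustration, and the dedupe guard
is just one conceivable mitigation, not necessarily the fix adopted for
YARN-6793.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    memory: int  # MB
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory <= other.memory and self.vcores <= other.vcores

    def __add__(self, other: "Resource") -> "Resource":
        return Resource(self.memory + other.memory, self.vcores + other.vcores)


@dataclass
class Node:
    available: Resource
    # (app, Resource) pairs currently reserved on this node (hypothetical model)
    reservations: list = field(default_factory=list)


def try_assign(node: Node, app: str, request: Resource, dedupe: bool = False) -> str:
    """One scheduling attempt for `app` on `node`; returns what happened."""
    if request.fits_in(node.available):
        return "allocated"
    # Hypothetical guard: do not reserve again if an identical reservation exists.
    if dedupe and (app, request) in node.reservations:
        return "already reserved"
    node.reservations.append((app, request))
    return "reserved"


def simulate(dedupe: bool) -> int:
    """Replay the example from the description; return the reservation count."""
    node = Node(available=Resource(0, 0))
    request = Resource(2048, 2)
    # Preemption selects containers on node 1 and reserves space for app1.
    try_assign(node, "app1", request, dedupe)
    # Before the kill delay elapses, <memory 1024, vcore 1> is released.
    node.available = node.available + Resource(1024, 1)
    # The next scheduling attempt still cannot fit <memory 2048, vcore 2>.
    try_assign(node, "app1", request, dedupe)
    return len(node.reservations)


print(simulate(dedupe=False))  # 2: the duplicated reservation
print(simulate(dedupe=True))   # 1: the guard suppresses the duplicate
```

Without the guard the model ends up with two reservations for the same
request, which mirrors the second reservation that may never be unreserved.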




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

