hadoop-yarn-issues mailing list archives

From Íñigo Goiri (JIRA) <j...@apache.org>
Subject [jira] [Commented] (YARN-999) In case of long running tasks, reduce node resource should balloon out resource quickly by calling preemption API and suspending running task.
Date Sat, 23 Feb 2019 17:42:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775957#comment-16775957 ]

Íñigo Goiri commented on YARN-999:
----------------------------------

[^YARN-999.005.patch] includes full coverage in the tests.
To summarize, the functionality is:
* If no timeout is given, it just reduces the resources.
* If the change of resources comes with a timeout, it triggers preemption (notifies the AM).
* Once the timeout has passed, it triggers killing in the heartbeats.
* It tries to preempt/kill OPPORTUNISTIC containers first, then GUARANTEED, and finally AMs (this is in creation time order).
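
The behaviour summarized above can be sketched roughly as follows. This is a minimal Python illustration, not code from the patch: the names `Kind`, `Container`, `action_for_update` and `select_victims` are hypothetical, resources are reduced to memory only, and the sketch assumes that "creation time order" means oldest first within each container type, which the patch may do differently.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Kind(IntEnum):
    # Hypothetical encoding: a lower value is preempted/killed earlier.
    OPPORTUNISTIC = 0
    GUARANTEED = 1
    AM = 2


@dataclass
class Container:
    container_id: str
    kind: Kind
    created_at: int  # creation timestamp (assumed to be in milliseconds)
    memory_mb: int


def action_for_update(timeout_ms: Optional[int], elapsed_ms: int) -> str:
    """Pick the behaviour for a node resource decrease."""
    if timeout_ms is None:
        return "reduce_only"  # no timeout: only the node resources change
    if elapsed_ms < timeout_ms:
        return "preempt"      # timeout still running: notify the AM
    return "kill"             # timeout has passed: kill in the heartbeat


def select_victims(containers: List[Container], overcommit_mb: int) -> List[Container]:
    """Choose containers until the overcommitted memory is covered."""
    # Assumption: order by type first, then oldest first within a type.
    ordered = sorted(containers, key=lambda c: (c.kind, c.created_at))
    victims: List[Container] = []
    freed = 0
    for c in ordered:
        if freed >= overcommit_mb:
            break
        victims.append(c)
        freed += c.memory_mb
    return victims
```

With one AM, one GUARANTEED and two OPPORTUNISTIC containers on a node that is overcommitted by 3000 MB, this sketch picks both OPPORTUNISTIC containers before touching the GUARANTEED one and leaves the AM alone.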

[~djp], can you take a look and see if there is anything else left?

> In case of long running tasks, reduce node resource should balloon out resource quickly
by calling preemption API and suspending running task. 
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-999
>                 URL: https://issues.apache.org/jira/browse/YARN-999
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: graceful, nodemanager, scheduler
>            Reporter: Junping Du
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: YARN-291.000.patch, YARN-999.001.patch, YARN-999.002.patch, YARN-999.003.patch,
YARN-999.004.patch, YARN-999.005.patch
>
>
> In the current design and implementation, when we decrease the resources on a node to less than
the resource consumption of the currently running tasks, those tasks can still run until the end.
However, no new tasks get assigned to this node (because AvailableResource < 0) until some tasks
finish and AvailableResource > 0 again. This is good for most cases, but in the case of
long running tasks it could be too slow for the resource setting to actually take effect, so preemption
could be used here.
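
As a rough illustration of the arithmetic in the description, the following Python sketch treats AvailableResource as the node total minus what running tasks consume. The function names and the memory figures are made up for the example and resources are simplified to memory only.

```python
def available_mb(node_total_mb: int, used_mb: int) -> int:
    """AvailableResource, simplified to memory: node total minus usage."""
    return node_total_mb - used_mb


def can_assign(node_total_mb: int, used_mb: int, request_mb: int) -> bool:
    """A new task fits only if enough resource is available on the node."""
    return available_mb(node_total_mb, used_mb) >= request_mb


# Illustrative numbers: running tasks use 6144 MB and the node is
# reduced from 8192 MB to 4096 MB.
after_decrease = available_mb(4096, 6144)    # negative: node is overcommitted
blocked = can_assign(4096, 6144, 1024)       # no new task while overcommitted
# Only after 3072 MB worth of tasks finish does a 1024 MB request fit.
unblocked = can_assign(4096, 6144 - 3072, 1024)
```

Without preemption, the wait for `unblocked` to become true is bounded only by how long the running tasks take to finish, which is the problem for long running tasks.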



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

