hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
Date Wed, 12 Nov 2014 16:19:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208207#comment-14208207 ]

Sunil G commented on YARN-2009:

I agree with your thoughts, [~curino]. As you said, policy making based on locality constraints
is hypothetical, and a given set of test experiments will show how much value it adds.
I am devising and working on some useful tests to see the advantage. I also felt
that this added consideration might help the cluster work better. But it now seems more complicated,
because weighing which container to choose is not balanced or straightforward once
all scenarios are considered.

a. A higher priority application needs 7 containers.
b. 2 apps at Lower priority have 4 containers (2 each), and 2 apps at Very Low priority have
4 containers (2 each).

Possible behavior of the preemption policy:
1. Spare AM containers (based on config).
2. At Very Low priority, choose the application that was submitted last and claim its 2 containers.
Then move to the next app at the same level.

This may be the direct output we expect.
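
The ordering described above can be sketched roughly as follows. This is only an illustration, not
the ProportionalCapacityPreemptionPolicy code: the `Container` and `App` classes, the
`select_for_preemption` function, the "larger number = higher priority" convention, and the
assumption that each app holds one AM container in addition to its 2 task containers are all
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    container_id: str
    is_am: bool = False


@dataclass
class App:
    app_id: str
    priority: int      # assumption: larger number = higher priority
    submit_time: int   # larger number = submitted later
    containers: List[Container] = field(default_factory=list)


def select_for_preemption(apps, needed, spare_am=True):
    """Pick up to `needed` containers: lowest priority first, latest submitted first."""
    victims = []
    for app in sorted(apps, key=lambda a: (a.priority, -a.submit_time)):
        for c in app.containers:
            if len(victims) == needed:
                return victims
            if spare_am and c.is_am:
                continue  # spare the AM container when the (hypothetical) config says so
            victims.append(c.container_id)
    return victims


def make_app(app_id, priority, submit_time):
    # Assumption for this example: one AM container plus two task containers per app.
    containers = [
        Container(f"{app_id}-am", is_am=True),
        Container(f"{app_id}-c1"),
        Container(f"{app_id}-c2"),
    ]
    return App(app_id, priority, submit_time, containers)


apps = [
    make_app("low1", 1, 10),
    make_app("low2", 1, 20),
    make_app("vlow1", 0, 30),
    make_app("vlow2", 0, 40),
]
victims = select_for_preemption(apps, needed=7)
print(victims)
```

With a demand of 7 containers, the sketch first takes both task containers from the last-submitted
Very Low priority app, then from the other Very Low priority app, and only then moves up to the
Lower priority apps.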

However, a few thoughts:
1. The higher priority app may need containers on certain nodes (locality), but the preemption
happened on other nodes, forcing it to settle for rack-local or even any placement.
2. With node labels, it's even possible that the preempted containers fall under a different
label, on which the demand can't be satisfied.
3. The user limit factor has to be respected during preemption (queue preemption already considers
this via a config).
4. A different example: a higher priority application needs 2 containers of 6GB each. One lower
priority application has 12 containers of 1GB each, and another lower priority application has
2 containers of 6GB each. Going by submission time, if we choose the 1st lower priority app, we
may kill more containers. Sometimes a wiser choice is to select the 2nd one. This is debatable :)
5. Taking the first example again: we have 2 lower priority apps to choose from, and based on
submission time the 1st app is selected for preemption. It's possible that this app is more
I/O bound and has finished a larger % of its work than the 2nd one, which was submitted earlier.
So submission time alone may not be a good criterion; % of job completion can also be considered.
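
To make point 4 concrete, here is a small sketch that counts how many containers each choice would
kill to free the 12GB the higher priority app needs. The `Candidate` class and the helper functions
are hypothetical, and I am assuming the app with 12 x 1GB containers is the one submitted last. The
sketch compares only the total memory freed and ignores where the containers are placed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    app_id: str
    submit_time: int               # larger number = submitted later
    container_sizes_gb: List[int]


def containers_killed(candidate, demand_gb):
    """Containers to kill (largest first) to free demand_gb, or None if not enough."""
    freed, killed = 0, 0
    for size in sorted(candidate.container_sizes_gb, reverse=True):
        if freed >= demand_gb:
            break
        freed += size
        killed += 1
    return killed if freed >= demand_gb else None


def pick_by_submission_time(candidates):
    # Baseline from the discussion: preempt the last-submitted app first.
    return max(candidates, key=lambda c: c.submit_time)


def pick_fewest_kills(candidates, demand_gb):
    # Alternative: prefer the app that satisfies the demand with the fewest kills.
    scored = [(containers_killed(c, demand_gb), c) for c in candidates]
    feasible = [(k, c) for k, c in scored if k is not None]
    return min(feasible, key=lambda kc: kc[0])[1] if feasible else None


demand_gb = 2 * 6  # higher priority app needs 2 containers of 6GB each
app1 = Candidate("app1", submit_time=20, container_sizes_gb=[1] * 12)
app2 = Candidate("app2", submit_time=10, container_sizes_gb=[6, 6])

by_time = pick_by_submission_time([app1, app2])
by_kills = pick_fewest_kills([app1, app2], demand_gb)
print(by_time.app_id, containers_killed(by_time, demand_gb))
print(by_kills.app_id, containers_killed(by_kills, demand_gb))
```

Under these assumptions, submission-time ordering picks app1 and kills 12 containers, while the
size-aware choice picks app2 and kills 2.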

My point is that there is no single line along which a decision can be made sequentially; a few
customers may prefer option 2 over option 1. Hence a policy with a lot of config may come in if we
accept these as features. I would like to get your thoughts on this. As you said, this may not
give a big gain and may only work as an enhancement.
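
One way such a configurable policy could look is an ordered list of criteria that decides which app
is preempted first. This is a sketch only: the criterion names, the `AppInfo` class, and
`preemption_order` are hypothetical and do not correspond to any existing YARN configuration.

```python
from dataclasses import dataclass


@dataclass
class AppInfo:
    app_id: str
    priority: int      # assumption: larger number = higher priority
    submit_time: int   # larger number = submitted later
    progress: float    # fraction of work completed, 0.0 to 1.0


# Each criterion maps an app to a sort key; a smaller key means "preempt earlier".
CRITERIA = {
    "lowest_priority": lambda a: a.priority,
    "latest_submitted": lambda a: -a.submit_time,
    "least_progress": lambda a: a.progress,
}


def preemption_order(apps, criteria_names):
    """Order apps for preemption using the configured criteria, in the given order."""
    keys = [CRITERIA[name] for name in criteria_names]
    return sorted(apps, key=lambda a: tuple(k(a) for k in keys))


# Two apps at the same priority: app_a was submitted later but is 80% done,
# app_b was submitted earlier and is only 20% done.
app_a = AppInfo("app_a", priority=1, submit_time=20, progress=0.8)
app_b = AppInfo("app_b", priority=1, submit_time=10, progress=0.2)

by_time = preemption_order([app_a, app_b], ["lowest_priority", "latest_submitted"])
by_progress = preemption_order([app_a, app_b], ["lowest_priority", "least_progress"])
print([a.app_id for a in by_time])
print([a.app_id for a in by_progress])
```

With submission time as the tie-breaker, app_a is preempted first even though it is close to done.
With progress as the tie-breaker, app_b goes first. Which ordering is right is exactly the kind of
choice that would have to be exposed as config.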

> Priority support for preemption in ProportionalCapacityPreemptionPolicy
> -----------------------------------------------------------------------
>                 Key: YARN-2009
>                 URL: https://issues.apache.org/jira/browse/YARN-2009
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Devaraj K
>            Assignee: Sunil G
> While preempting containers based on the queue ideal assignment, we may need to consider preempting the low priority application containers first.

This message was sent by Atlassian JIRA
