hadoop-yarn-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3769) Preemption occurring unnecessarily because preemption doesn't consider user limit
Date Thu, 04 Jun 2015 21:29:38 GMT

    [ https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573619#comment-14573619 ]

Eric Payne commented on YARN-3769:
----------------------------------

The following configuration will cause this:

|| queue || capacity || max || pending || used || user limit ||
| root | 100 | 100 | 40 | 90 | N/A |
| A | 10 | 100 | 20 | 70 | 70 |
| B | 10 | 100 | 20 | 20 | 20 |

One app is running in each queue. Both apps are asking for more resources, but each has already
reached its user limit. So even though resources are available in the cluster, no more
resources are allocated to either app.
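
The stuck state in the table can be checked with a little arithmetic. The following is a minimal sketch, not CapacityScheduler code: the `Queue` class and the `headroom` function are hypothetical names, and headroom is assumed to be simply the user limit minus the used amount.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    pending: int
    used: int
    user_limit: int


def headroom(q: Queue) -> int:
    """Resources the queue's single app may still receive (assumed model)."""
    return max(0, q.user_limit - q.used)


# Values copied from the first table in this comment.
a = Queue("A", pending=20, used=70, user_limit=70)
b = Queue("B", pending=20, used=20, user_limit=20)

for q in (a, b):
    print(q.name, "pending:", q.pending, "headroom:", headroom(q))
```

Under this model both queues have pending requests but zero headroom, which is why neither app receives anything.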

The preemption monitor sees that {{B}} is asking for a lot more resources and is more
underserved than {{A}}, so it tries to balance the queues by preempting resources
(10, for example) from {{A}}. After the preemption, the queues look like this:

|| queue || capacity || max || pending || used || user limit ||
| root | 100 | 100 | 50 | 80 | N/A |
| A | 10 | 100 | 30 | 60 | 70 |
| B | 10 | 100 | 20 | 20 | 20 |

However, when the capacity scheduler tries to give that container to the app in {{B}}, the
app will recognize that it has no headroom, and refuse the container. So the capacity scheduler
offers the container again to the app in {{A}}, which accepts it because it has headroom now,
and the process starts over again.
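
The cycle described above can be reproduced with a toy simulation. This is only a sketch under stated assumptions: `simulate` and `headroom` are hypothetical names, the preemption monitor is modeled as always taking a fixed `step` from {{A}}, and the scheduler is modeled as offering the freed resources to {{B}} first and then to {{A}}. None of this is the actual preemption or scheduler code.

```python
def headroom(used: int, user_limit: int) -> int:
    """Assumed model: headroom is user limit minus used, never negative."""
    return max(0, user_limit - used)


def simulate(rounds: int, step: int = 10):
    # Per-queue state copied from the first table in this comment.
    queues = {
        "A": {"used": 70, "pending": 20, "user_limit": 70},
        "B": {"used": 20, "pending": 20, "user_limit": 20},
    }
    history = []
    for _ in range(rounds):
        # 1. Preemption monitor: B looks more underserved, so take `step` from A.
        queues["A"]["used"] -= step
        queues["A"]["pending"] += step
        history.append(("preempt", "A", queues["A"]["used"]))

        # 2. Scheduler offers the freed resources to B first, then to A.
        for name in ("B", "A"):
            q = queues[name]
            grant = min(step, q["pending"], headroom(q["used"], q["user_limit"]))
            if grant > 0:
                q["used"] += grant
                q["pending"] -= grant
                history.append(("allocate", name, q["used"]))
                break
            # No headroom: the app refuses the container.
            history.append(("refuse", name, q["used"]))
    return queues, history


final_state, events = simulate(rounds=3)
for event in events:
    print(event)
```

In this model every round ends in the same state it started from: {{A}} is back at 70 used, {{B}} is still at 20, and the only effect is a preempted and reallocated container.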

Note that this happens even when used cluster resources are below 100%, because used plus
pending for the cluster (90 + 40 = 130 in the first table) would put it above 100%.

> Preemption occurring unnecessarily because preemption doesn't consider user limit
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>
> We are seeing the preemption monitor preempting containers from queue A and then seeing
> the capacity scheduler giving them immediately back to queue A. This happens quite often and
> causes a lot of churn.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
