hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3769) Preemption occurring unnecessarily because preemption doesn't consider user limit
Date Thu, 04 Jun 2015 21:44:38 GMT

    [ https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573638#comment-14573638 ]

Wangda Tan commented on YARN-3769:
----------------------------------

[~eepayne],
This is a very interesting problem; actually, user-limit is not the only thing that causes it.

For example, fair ordering (YARN-3306), hard locality requirements (I want resources from
rackA and nodeX only), and the AM resource limit; in the near future we may also have constraints
(YARN-3409). All of these can lead to resources being preempted from one queue that the other
queue cannot use because of its specific resource requirements and limits.

One thing I've been thinking about for a while is adding a "lazy preemption" mechanism: when
a container has been marked preempted and has waited for max_wait_before_time, it becomes a
"can_be_killed" container. If another queue can allocate on a node with a "can_be_killed"
container, that container will be killed immediately to make room for the new containers.

With this mechanism, the preemption policy doesn't need to consider complex resource requirements
and limits inside a queue, and it also avoids killing containers unnecessarily.
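
To make the idea concrete, here is a rough sketch of the bookkeeping the lazy preemption described above might need. Everything in it is hypothetical: the class LazyPreemptionSketch, its methods, and the memory-only resource model are illustration only and are not existing CapacityScheduler APIs. It assumes the scheduler calls killForAllocation only when another queue can actually allocate on that node, and it ignores any free resources already on the node.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of "lazy preemption"; none of these names are YARN APIs. */
public class LazyPreemptionSketch {

    /** Bookkeeping for one container that the preemption policy has marked. */
    private static final class Marked {
        final String node;
        final int memoryMb;
        final long markedAtMs;

        Marked(String node, int memoryMb, long markedAtMs) {
            this.node = node;
            this.memoryMb = memoryMb;
            this.markedAtMs = markedAtMs;
        }
    }

    // Insertion-ordered so the oldest marks are considered first.
    private final Map<String, Marked> marked = new LinkedHashMap<>();
    // Stands in for max_wait_before_time from the comment above.
    private final long maxWaitMs;

    public LazyPreemptionSketch(long maxWaitMs) {
        this.maxWaitMs = maxWaitMs;
    }

    /** The preemption policy marks a container; nothing is killed yet. */
    public void markPreempted(String containerId, String node, int memoryMb, long nowMs) {
        marked.putIfAbsent(containerId, new Marked(node, memoryMb, nowMs));
    }

    /** A marked container becomes "can_be_killed" once it has waited maxWaitMs. */
    public boolean canBeKilled(String containerId, long nowMs) {
        Marked m = marked.get(containerId);
        return m != null && nowMs - m.markedAtMs >= maxWaitMs;
    }

    /**
     * Called when another queue can allocate neededMb on the given node.
     * Returns the containers to kill, or an empty list when the
     * "can_be_killed" containers on that node cannot cover the request,
     * in which case nothing is killed.
     */
    public List<String> killForAllocation(String node, int neededMb, long nowMs) {
        List<String> victims = new ArrayList<>();
        int freedMb = 0;
        for (Map.Entry<String, Marked> e : marked.entrySet()) {
            if (freedMb >= neededMb) {
                break; // enough room already, do not kill more
            }
            Marked m = e.getValue();
            if (m.node.equals(node) && canBeKilled(e.getKey(), nowMs)) {
                victims.add(e.getKey());
                freedMb += m.memoryMb;
            }
        }
        if (freedMb < neededMb) {
            return new ArrayList<>(); // killing would not help, so kill nothing
        }
        for (String id : victims) {
            marked.remove(id);
        }
        return victims;
    }
}
```

The point of the sketch is that a kill happens only at allocation time, on the node where the other queue can really use the space, so a marked container that nobody can use simply keeps running.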

If you think it's fine, could I take a shot at it?

Thoughts? [~vinodkv].

> Preemption occurring unnecessarily because preemption doesn't consider user limit
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>
> We are seeing the preemption monitor preempting containers from queue A and then seeing
> the capacity scheduler giving them immediately back to queue A. This happens quite often and
> causes a lot of churn.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
