hadoop-yarn-issues mailing list archives

From "Tao Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-8771) CapacityScheduler fails to unreserve when cluster resource contains empty resource type
Date Thu, 13 Sep 2018 10:10:00 GMT
Tao Yang created YARN-8771:
------------------------------

             Summary: CapacityScheduler fails to unreserve when cluster resource contains empty resource type
                 Key: YARN-8771
                 URL: https://issues.apache.org/jira/browse/YARN-8771
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 3.2.0
            Reporter: Tao Yang
            Assignee: Tao Yang


We found this problem when the cluster was almost, but not completely, exhausted (93% used):
the scheduler kept allocating for an app but always failed to commit. This can block requests
from other apps and leave part of the cluster's resources unusable.

To reproduce this problem:
(1) use DominantResourceCalculator
(2) the cluster resource has an empty resource type, for example: gpu=0
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue
limit or user limit has been reached (used + required > limit).

Relevant code in RegularContainerAllocator#assignContainer:
{code:java}
    boolean needToUnreserve =
        Resources.greaterThan(rc, clusterResource,
            resourceNeedToUnReserve, Resources.none());
{code}
The value of resourceNeedToUnReserve can be <8GB, -6 cores, 0 gpu>, in which case
{{Resources#greaterThan}} returns false when DominantResourceCalculator is used, so
needToUnreserve is false even though memory still needs to be unreserved.
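The comparison result can be illustrated with a small standalone sketch. It does not use Hadoop's classes. It assumes, without confirmation from this report, that a calculator facing a zero-valued cluster resource type falls back to a component-wise comparison, in which a vector with both larger and smaller components compares as equal. The class name, method names, and the cluster sizes of 100 are hypothetical, and the dominant-share path is simplified to a single maximum share.

```java
public class UnreserveCompareSketch {

    // Component-wise comparison: 1 if lhs is larger in some type and smaller
    // in none, -1 for the reverse, 0 if the vectors are equal or mixed.
    static int compareComponentWise(long[] lhs, long[] rhs) {
        boolean lhsGreater = false;
        boolean rhsGreater = false;
        for (int i = 0; i < lhs.length; i++) {
            if (lhs[i] > rhs[i]) {
                lhsGreater = true;
            } else if (lhs[i] < rhs[i]) {
                rhsGreater = true;
            }
        }
        if (lhsGreater && rhsGreater) {
            return 0;
        }
        return lhsGreater ? 1 : (rhsGreater ? -1 : 0);
    }

    // True if any resource type in the cluster total is zero (e.g. gpu=0).
    static boolean hasZeroType(long[] cluster) {
        for (long v : cluster) {
            if (v == 0) {
                return true;
            }
        }
        return false;
    }

    // Simplified dominant share: the largest ratio of a component to the
    // cluster total. Only called when no cluster component is zero.
    static double dominantShare(long[] cluster, long[] r) {
        double max = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < r.length; i++) {
            max = Math.max(max, (double) r[i] / cluster[i]);
        }
        return max;
    }

    // Assumed behavior (hypothetical): fall back to the component-wise
    // comparison when the cluster total contains a zero-valued type.
    static boolean greaterThan(long[] cluster, long[] lhs, long[] rhs) {
        int c = hasZeroType(cluster)
            ? compareComponentWise(lhs, rhs)
            : Double.compare(dominantShare(cluster, lhs), dominantShare(cluster, rhs));
        return c > 0;
    }

    public static void main(String[] args) {
        long[] need = {8, -6, 0};   // <8GB, -6 cores, 0 gpu> from the report
        long[] none = {0, 0, 0};    // stand-in for Resources.none()

        // Cluster with an empty resource type (gpu=0): prints false
        System.out.println(greaterThan(new long[] {100, 100, 0}, need, none));
        // Cluster with a non-empty gpu total: prints true
        System.out.println(greaterThan(new long[] {100, 100, 4}, need, none));
    }
}
```

Under these assumptions, the positive memory component and the negative core component cancel out in the component-wise path, which would explain why the check reports nothing to unreserve only when the cluster has an empty resource type.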


