hadoop-yarn-issues mailing list archives

From "Ram Venkatesh (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-2362) Capacity Scheduler apps with requests that exceed capacity can starve pending apps
Date Sat, 26 Jul 2014 16:40:38 GMT
Ram Venkatesh created YARN-2362:
-----------------------------------

             Summary: Capacity Scheduler apps with requests that exceed capacity can starve pending apps
                 Key: YARN-2362
                 URL: https://issues.apache.org/jira/browse/YARN-2362
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 2.4.1
            Reporter: Ram Venkatesh


Cluster configuration:
Total memory: 8GB
yarn.scheduler.minimum-allocation-mb 256
yarn.scheduler.capacity.maximum-am-resource-percent 1 (100%, test only config)

App 1 requests 4.6 GB, the request is granted, and the app transitions to the RUNNING state.
It then requests another 4.6 GB, which cannot be granted, so it waits.

App 2 requests 1 GB but never receives it, so the app stays in the ACCEPTED state forever,
even though about 3.4 GB of the cluster's 8 GB is still free.

I think this can happen in leaf queues that are near capacity.

The fix is likely in assignContainers in LeafQueue.java, near line 861. The method returns as
soon as an assignment would exceed queue capacity, instead of checking whether requests from
other active applications can be met.

           // Check queue max-capacity limit
           if (!assignToQueue(clusterResource, required)) {
-            return NULL_ASSIGNMENT;
+            break;
           }

With this change, the scenario above allows App 2 to start and finish while App 1 continues
to wait.
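
To make the difference between the two behaviors concrete, here is a minimal, self-contained
sketch of the scenario. It is not the actual LeafQueue code: the class StarvationSketch, the
App type, and the assign method are hypothetical stand-ins, the queue is assumed to span the
whole 8 GB cluster, and 4.6 GB is approximated as 4710 MB without any rounding to the minimum
allocation.

```java
import java.util.Arrays;
import java.util.List;

public class StarvationSketch {

    /** Hypothetical stand-in for an application and its pending requests (MB, one per priority). */
    static final class App {
        final String name;
        final int[] pendingMb;

        App(String name, int... pendingMb) {
            this.name = name;
            this.pendingMb = pendingMb;
        }
    }

    /**
     * Simulates one scheduling pass over the active applications.
     * Returns the name of the app that is assigned a container, or null if none is.
     *
     * skipOnLimit == false models the current behavior (return on the first request
     * that exceeds the queue limit); skipOnLimit == true models the proposed change
     * (break out of the per-app loop and try the next application).
     */
    static String assign(List<App> apps, int queueMaxMb, int usedMb, boolean skipOnLimit) {
        for (App app : apps) {
            for (int required : app.pendingMb) {
                // Simplified version of the queue max-capacity check.
                if (usedMb + required > queueMaxMb) {
                    if (!skipOnLimit) {
                        return null; // current behavior: the whole pass ends here
                    }
                    break; // proposed behavior: give up on this app only
                }
                return app.name; // the request fits, so this app gets the container
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int queueMaxMb = 8 * 1024; // assumption: queue capacity equals the 8 GB cluster
        int usedMb = 4710;         // App 1's first container, roughly 4.6 GB

        List<App> apps = Arrays.asList(
                new App("app1", 4710),  // second 4.6 GB request, cannot fit
                new App("app2", 1024)); // 1 GB request, would fit

        System.out.println("return: " + assign(apps, queueMaxMb, usedMb, false)); // prints "return: null"
        System.out.println("break:  " + assign(apps, queueMaxMb, usedMb, true));  // prints "break:  app2"
    }
}
```

In this sketch, App 1's oversized request sits at the head of the list, so returning early
means App 2's request is never examined, while breaking lets the pass move on to App 2.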

I have a patch available, but I am wondering whether the current behavior is by design.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
