hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4691) Cache resource usage at FSLeafQueue level
Date Sun, 14 Feb 2016 00:26:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146282#comment-15146282 ]

Karthik Kambatla commented on YARN-4691:

Thanks for filing this, [~mingma]. 

Is the suggestion to update the entire queue hierarchy on every FSAppAttempt update? That would
make each update cost O(log(number_of_queues_in_hierarchy)). Do you think we should check
the ratio of app-attempt resource updates to scheduling resource aggregations? Maybe
add more counters to get this info out?
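
A minimal sketch of the two ideas above, assuming a simplified queue tree: each usage change is pushed from a leaf up through its ancestors, and two counters record how often updates and reads happen so their ratio can be inspected. QueueNode, applyDelta, updateSteps and usageReads are hypothetical names for illustration, not FairScheduler classes or metrics, and only memory is modeled.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simplified queue node; not the real FSQueue/FSParentQueue.
final class QueueNode {
    // Assumed counters: queue visits caused by updates, and usage lookups.
    static final AtomicLong updateSteps = new AtomicLong();
    static final AtomicLong usageReads = new AtomicLong();

    private final QueueNode parent; // null for the root queue
    private final AtomicLong cachedMemoryMb = new AtomicLong();

    QueueNode(QueueNode parent) {
        this.parent = parent;
    }

    // Push a usage change from this queue up to the root.
    // The walk visits one queue per level, so the cost is the depth of the hierarchy.
    void applyDelta(long deltaMemoryMb) {
        for (QueueNode q = this; q != null; q = q.parent) {
            q.cachedMemoryMb.addAndGet(deltaMemoryMb);
            updateSteps.incrementAndGet();
        }
    }

    // O(1) read that a sorting comparator could call instead of aggregating.
    long getUsageMemoryMb() {
        usageReads.incrementAndGet();
        return cachedMemoryMb.get();
    }
}
```

If updateSteps grows much faster than usageReads on a real workload, eager propagation through the whole hierarchy may cost more than it saves, and caching only at the leaf level could be the better trade-off.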

> Cache resource usage at FSLeafQueue level
> -----------------------------------------
>                 Key: YARN-4691
>                 URL: https://issues.apache.org/jira/browse/YARN-4691
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ming Ma
> As part of the fair share assignment, the fair scheduler needs to sort queues to decide which
> queue is furthest away from its fair share. During the sorting, the comparator needs to get
> the Resource usage of each queue.
> The parent queue will aggregate the resource usage from leaf queues. The leaf queue will
> aggregate the resource usage from all apps in the queue.
> {noformat}
> FSLeafQueue.java
>   @Override
>   public Resource getResourceUsage() {
>     Resource usage = Resources.createResource(0);
>     readLock.lock();
>     try {
>       for (FSAppAttempt app : runnableApps) {
>         Resources.addTo(usage, app.getResourceUsage());
>       }
>       for (FSAppAttempt app : nonRunnableApps) {
>         Resources.addTo(usage, app.getResourceUsage());
>       }
>     } finally {
>       readLock.unlock();
>     }
>     return usage;
>   }
> {noformat}
> Each time the fair scheduler tries to assign a container, it needs to sort all queues. Thus
> the number of Resources.addTo operations will be (number_of_queues) * lg(number_of_queues)
> * number_of_apps_per_queue, or number_of_apps_on_the_cluster * lg(number_of_queues).
> One way to solve this is to cache the resource usage at the FSLeafQueue level. Each time
> the fair scheduler updates an FSAppAttempt's resource usage, it will also update the FSLeafQueue
> resource usage. This will greatly reduce the overall number of Resources.addTo operations.
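
The caching proposal could look roughly like the following sketch, under simplifying assumptions: the leaf queue keeps a running total that each app adjusts whenever a container is allocated or released, so reading the queue's usage no longer loops over apps. LeafQueueSketch and AppSketch are hypothetical stand-ins for FSLeafQueue and FSAppAttempt, and plain long values replace YARN's Resource type.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simplified leaf queue that caches the sum of its apps' usage.
final class LeafQueueSketch {
    private final List<AppSketch> apps = new CopyOnWriteArrayList<>();
    // Memory and vcores are updated separately here; a real implementation
    // would likely guard both with the queue's existing lock.
    private final AtomicLong cachedMemoryMb = new AtomicLong();
    private final AtomicLong cachedVcores = new AtomicLong();

    void addApp(AppSketch app) {
        apps.add(app);
    }

    // Called by an app whenever its own usage changes; O(1) per update.
    void onAppUsageChanged(long deltaMemoryMb, long deltaVcores) {
        cachedMemoryMb.addAndGet(deltaMemoryMb);
        cachedVcores.addAndGet(deltaVcores);
    }

    // O(1) reads that replace the per-call loop over all apps.
    long getCachedMemoryMb() { return cachedMemoryMb.get(); }
    long getCachedVcores() { return cachedVcores.get(); }

    // The per-call aggregation, kept only to cross-check the cache.
    long recomputeMemoryMb() {
        long total = 0;
        for (AppSketch app : apps) {
            total += app.getMemoryMb();
        }
        return total;
    }
}

// Hypothetical simplified app attempt that reports every change to its queue.
final class AppSketch {
    private final LeafQueueSketch queue;
    private final AtomicLong memoryMb = new AtomicLong();
    private final AtomicLong vcores = new AtomicLong();

    AppSketch(LeafQueueSketch queue) {
        this.queue = queue;
        queue.addApp(this);
    }

    void containerAllocated(long mem, long vc) { change(mem, vc); }
    void containerReleased(long mem, long vc) { change(-mem, -vc); }

    private void change(long deltaMemoryMb, long deltaVcores) {
        memoryMb.addAndGet(deltaMemoryMb);
        vcores.addAndGet(deltaVcores);
        queue.onAppUsageChanged(deltaMemoryMb, deltaVcores);
    }

    long getMemoryMb() { return memoryMb.get(); }
}
```

In this sketch an app that is removed from the queue would also need to subtract its remaining usage from the cached totals; that step is omitted here.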

This message was sent by Atlassian JIRA
