hadoop-yarn-issues mailing list archives

From "YunFan Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
Date Fri, 04 Aug 2017 11:18:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114258#comment-16114258 ]

YunFan Zhou commented on YARN-6361:

[~yufeigu] Thanks, Yufei.
On this question, and on optimizing the scheduling performance of the FairScheduler in general,
I have the following ideas, which I have already applied in our production environment. The
scheduling performance there is good: container assignment reaches 5000 ~ 10000 containers
per second when the aggregate resource demand on the cluster is high.

Here's what I do:
* Avoid frequent sorting. It is pointless and a waste of time to sort before each container
assignment, because after each assignment the child nodes of the queue are still basically
in order.
We also cannot really guarantee every fair share even if we sort before each container
assignment, because *FSQueue#demand* is only as fresh as the last *FairScheduler#update* cycle.
So the value of demand is not real time, which means the sharing is not strictly fair
in any case.
So we can sort all the queues in the *FairScheduler#update* cycle instead. The default
is 0.5 s per update cycle, so this is worth doing.
Since we cannot achieve a strictly fair share anyway, why not sacrifice some of the
fair scheduler semantics in exchange for better performance?
* Improve the performance of the *Schedulable#getResourceUsage* calculation, reducing its
complexity to O(1).
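
The two points above can be sketched together as follows. This is only a minimal illustration
under stated assumptions, not FairScheduler code: *SketchQueue*, *SketchApp*, and all of their
methods are hypothetical names, a resource is simplified to a single long (for example memory
in MB), and the ordering is a plain usage-to-weight ratio rather than the real fair share policy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a schedulable application; not a real YARN class. */
final class SketchApp {
    final String name;
    final double weight;
    long usage; // simplified: a single resource dimension

    SketchApp(String name, double weight) {
        this.name = name;
        this.weight = weight;
    }
}

/** Hypothetical leaf queue: sorts only in update(), tracks usage incrementally. */
final class SketchQueue {
    private final List<SketchApp> apps = new ArrayList<>();
    private long cachedUsage = 0; // kept up to date on every change, so reads are O(1)

    void addApp(SketchApp app) {
        apps.add(app);
        cachedUsage += app.usage;
    }

    /** Meant to be called from the periodic update cycle; the only place that sorts. */
    void update() {
        apps.sort(Comparator.comparingDouble((SketchApp a) -> a.usage / a.weight));
    }

    /** Per-container path: reuses the order left by the last update(), no sorting. */
    SketchApp assignContainer(long size) {
        if (apps.isEmpty()) {
            return null;
        }
        SketchApp app = apps.get(0); // may be stale until the next update()
        app.usage += size;
        cachedUsage += size;
        return app;
    }

    /** O(1): returns the running total instead of summing over all apps. */
    long getResourceUsage() {
        return cachedUsage;
    }
}
```

The trade-off is visible in *assignContainer*: between two *update()* calls the same app can keep
receiving containers even after it is no longer the neediest one, which is the loss of strict
fairness described above.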

In addition, there are several related optimization points that are smaller but especially useful.
But I don't know whether you would accept this approach.
If you can accept it, I will list a few more detailed points later.

> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
> --------------------------------------------------------------------------------
>                 Key: YARN-6361
>                 URL: https://issues.apache.org/jira/browse/YARN-6361
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Miklos Szegedi
>            Assignee: Yufei Gu
>            Priority: Minor
>         Attachments: dispatcherthread.png, threads.png
> FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. Most of
> the time is spent in FairShareComparator.compare. We could improve this by doing the calculations
> outside the sort loop ({{O(n)}}) and sorting by a fixed number inside instead ({{O(n*log(n))}}).
> This could be a performance issue when there is a huge number of applications in a single
> queue. The attachments show the performance impact when there are 10k applications in one
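
The improvement proposed in the description (compute once per application, then sort by a fixed
number) might look roughly like the sketch below. It assumes the ordering can be reduced to a
single double per item, which may not hold for the real FairShareComparator. *KeyedSort* and
*sortByPrecomputedKey* are hypothetical names, not part of YARN.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class KeyedSort {
    /** Pairs an item with a sort key that was computed once, before sorting. */
    private static final class Keyed<T> {
        final T item;
        final double key;

        Keyed(T item, double key) {
            this.item = item;
            this.key = key;
        }
    }

    /**
     * Calls expensiveKey exactly once per item (O(n)), then sorts on the cached
     * number, so each of the O(n log n) comparisons is a cheap double compare.
     */
    static <T> List<T> sortByPrecomputedKey(List<T> items, ToDoubleFunction<T> expensiveKey) {
        List<Keyed<T>> keyed = new ArrayList<>(items.size());
        for (T item : items) {
            keyed.add(new Keyed<>(item, expensiveKey.applyAsDouble(item)));
        }
        keyed.sort(Comparator.comparingDouble((Keyed<T> k) -> k.key));

        List<T> sorted = new ArrayList<>(items.size());
        for (Keyed<T> k : keyed) {
            sorted.add(k.item);
        }
        return sorted;
    }
}
```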

This message was sent by Atlassian JIRA
