hadoop-yarn-issues mailing list archives

From "YunFan Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
Date Fri, 04 Aug 2017 11:22:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16114258#comment-16114258 ]

YunFan Zhou edited comment on YARN-6361 at 8/4/17 11:21 AM:
------------------------------------------------------------

[~yufeigu] Thanks, Yufei.
For this issue, and for the scheduling performance of the FairScheduler in general, I have the
following ideas, which I have already applied in our production environment. Scheduling
performance there is good: the container assignment rate reaches 5000 ~ 10000 per second
when the aggregate resource demand on the cluster is high.

Here's what I do:
* Avoid frequent sorting. Sorting before every container assignment is pointless and a waste
of time, because after each assignment the child nodes of the queue basically remain in order.
We also cannot guarantee exact fair shares anyway: even if we sort before every container
assignment, *FSQueue#demand* is only as fresh as the last *FairScheduler#update* cycle.
So the value of demand is not real time, which means the shares are not strictly fair in
the first place.
So we can sort all the queues in the *FairScheduler#update* cycle instead. The default is
0.5 s per update cycle, so this is worth doing.
Since we cannot achieve a strict fair share anyway, why don't we sacrifice some of the
fair scheduler semantics in exchange for better performance?
* Improve the performance of the *Schedulable#getResourceUsage* calculation, making it
O(1).
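
The first point above (sort in the update cycle, not before each assignment) could look
roughly like the sketch below. It is only an illustration under assumptions: SketchChild,
SketchQueue and all of their fields are hypothetical stand-ins, not the real FSQueue or
FairScheduler classes, and the usage/fairShare ratio is a simplified ordering key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a schedulable child (queue or app); not a YARN class.
final class SketchChild {
    final String name;
    final double fairShare;
    final double demand;   // as computed at the last update cycle
    double usage;          // resources currently allocated
    double sortKey;        // usage/fairShare ratio frozen at the last update cycle

    SketchChild(String name, double fairShare, double demand, double usage) {
        this.name = name;
        this.fairShare = fairShare;
        this.demand = demand;
        this.usage = usage;
    }
}

// Hypothetical queue that sorts its children only in update().
final class SketchQueue {
    private final List<SketchChild> children = new ArrayList<>();
    private int sortCount = 0;

    void add(SketchChild child) {
        children.add(child);
    }

    // Called from the periodic update cycle: recompute keys and sort once.
    void update() {
        for (SketchChild c : children) {
            c.sortKey = c.usage / c.fairShare;
        }
        children.sort(Comparator.comparingDouble((SketchChild c) -> c.sortKey));
        sortCount++;
    }

    // Called for every container assignment: walks the cached order, never sorts.
    // Returns the child that received the container, or null if no child has demand left.
    SketchChild assignContainer(double size) {
        for (SketchChild c : children) {
            if (c.usage < c.demand) {
                c.usage += size;
                return c;
            }
        }
        return null;
    }

    int getSortCount() {
        return sortCount;
    }
}
```

Between two update() calls the order is stale, so the child at the front keeps receiving
containers until its demand is met. That is the loss of strict fairness discussed above.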

In addition, there are several smaller but especially useful related optimizations.
With them, we can guarantee that the cost of assigning a container is O(1).
But I don't know whether you can accept this trade-off. If you can, I will list more
detailed points later.
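
One possible way to get an O(1) *Schedulable#getResourceUsage* is to maintain a running
total that is adjusted on every allocation and release, instead of summing over children
on each call. The sketch below only illustrates that idea: UsageNode and its methods are
hypothetical names, memory in MB stands in for a full resource vector, and the locking a
real scheduler would need is omitted.

```java
// Hypothetical node in the queue tree (root, queue or app); not a YARN class.
final class UsageNode {
    private final UsageNode parent;   // null for the root
    private long usedMemoryMb = 0;    // running total for this node's subtree

    UsageNode(UsageNode parent) {
        this.parent = parent;
    }

    // Called when a container is allocated (positive delta) or released (negative delta).
    // The cost is proportional to the depth of the tree and is paid once per change.
    void adjustUsage(long deltaMb) {
        for (UsageNode n = this; n != null; n = n.parent) {
            n.usedMemoryMb += deltaMb;
        }
    }

    // O(1): returns the maintained total without iterating over children.
    long getResourceUsage() {
        return usedMemoryMb;
    }
}
```

The design choice is to move the cost from the read path, which comparators hit many
times per sort, to the write path, which runs once per container event.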



was (Author: daemon):
[~yufeigu] Thanks, Yufei.
For this issue, and for the scheduling performance of the FairScheduler in general, I have the
following ideas, which I have already applied in our production environment. Scheduling
performance there is good: the container assignment rate reaches 5000 ~ 10000 per second
when the aggregate resource demand on the cluster is high.

Here's what I do:
* Avoid frequent sorting. Sorting before every container assignment is pointless and a waste
of time, because after each assignment the child nodes of the queue basically remain in order.
We also cannot guarantee exact fair shares anyway: even if we sort before every container
assignment, *FSQueue#demand* is only as fresh as the last *FairScheduler#update* cycle.
So the value of demand is not real time, which means the shares are not strictly fair in
the first place.
So we can sort all the queues in the *FairScheduler#update* cycle instead. The default is
0.5 s per update cycle, so this is worth doing.
Since we cannot achieve a strict fair share anyway, why don't we sacrifice some of the
fair scheduler semantics in exchange for better performance?
* Improve the performance of the *Schedulable#getResourceUsage* calculation, making it
O(1).

In addition, there are several smaller but especially useful related optimizations.
But I don't know whether you can accept this trade-off.
If you can, I will list more detailed points later.


> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
> --------------------------------------------------------------------------------
>
>                 Key: YARN-6361
>                 URL: https://issues.apache.org/jira/browse/YARN-6361
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Miklos Szegedi
>            Assignee: YunFan Zhou
>            Priority: Minor
>         Attachments: dispatcherthread.png, threads.png
>
>
> FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. Most of
the time is spent in FairShareComparator.compare. We could improve this by doing the calculations
once outside the sort loop {{(O\(n\))}} and sorting by a fixed, precomputed number, instead of
repeating them inside the sort {{O(n*log\(n\))}}.
This could be a performance issue when there is a huge number of applications in a single
queue. The attachments show the performance impact when there are 10k applications in one
queue.
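
The improvement proposed in the description (compute each key once, then sort on the
fixed value) might be sketched as follows. This is a sketch under assumptions, not the
patch for this issue: KeyedSort, App and the usage/fairShare key are hypothetical and
much simpler than what FairShareComparator.compare actually evaluates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class KeyedSort {
    // Hypothetical application record; not a YARN class.
    static final class App {
        final String id;
        final double usage;
        final double fairShare;

        App(String id, double usage, double fairShare) {
            this.id = id;
            this.usage = usage;
            this.fairShare = fairShare;
        }
    }

    // Pairs an app with its key so the key is computed exactly once per app.
    private static final class Keyed {
        final App app;
        final double key;

        Keyed(App app, double key) {
            this.app = app;
            this.key = key;
        }
    }

    // Returns a new list ordered by ascending usage/fairShare ratio.
    static List<App> sortByPrecomputedKey(List<App> apps) {
        // O(n): the potentially expensive calculation runs once per app.
        List<Keyed> keyed = new ArrayList<>(apps.size());
        for (App a : apps) {
            keyed.add(new Keyed(a, a.usage / a.fairShare));
        }
        // O(n log n) comparisons, each one a cheap comparison of two doubles.
        keyed.sort(Comparator.comparingDouble((Keyed k) -> k.key));

        List<App> sorted = new ArrayList<>(keyed.size());
        for (Keyed k : keyed) {
            sorted.add(k.app);
        }
        return sorted;
    }
}
```

The keys are a snapshot taken before the sort starts, so the ordering stays consistent
even if usage changes while the sort is running.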



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

