hadoop-yarn-issues mailing list archives

From "Yufei Gu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6361) FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
Date Wed, 16 Aug 2017 23:10:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129592#comment-16129592 ]

Yufei Gu commented on YARN-6361:
--------------------------------

Hi [~daemon], thanks for the patch. Let's talk about the idea first. 
{quote}
Avoid frequent sorting: it is pointless and a waste of time to sort before each container
assignment, because after each assignment the child nodes of the queue are basically still
in order.
{quote}
Basically agreed, but it's the right thing to do until we find a solution that both avoids
sorting every time and guarantees fairness between queues.
{quote}
And we don't really need to ensure that every fair share is guaranteed. After all, even
though we sort before each container assignment, FSQueue#demand is only updated in the
last FairScheduler#update cycle.
So the value of demand is not real-time, which also means that the sharing is not strictly
fair.
So we can sort all the queues in the FairScheduler#update cycle, which by default runs
every 0.5 s; that is worth doing.
{quote}
We should ensure fairness as much as we can. Demand doesn't play a big role in {{FairShareComparator}}.
It only matters while resource usage is less than min share. In addition, applications like
MR don't change their demands frequently. In most cases, a job's demand changes only twice:
when it asks for resources for all Mappers and when it asks for resources for all Reducers.
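
To make the point about demand concrete, here is a simplified, memory-only sketch of a fair-share
style comparison. It is not the Hadoop source: the {{Sched}} class, its fields, and the name-based
tie-break are assumptions made for illustration, and weights are assumed to be positive. Demand
enters only through min(minShare, demand), so it can change the ordering only for an entity whose
usage is still below its min share.

```java
import java.util.Comparator;

/** Simplified, memory-only sketch of a fair-share ordering (hypothetical, not Hadoop code). */
public class FairShareSketch {

    /** Hypothetical stand-in for a schedulable entity; sizes in MB, weight assumed > 0. */
    static final class Sched {
        final String name;
        final long usage;
        final long minShare;
        final long demand;
        final double weight;

        Sched(String name, long usage, long minShare, long demand, double weight) {
            this.name = name;
            this.usage = usage;
            this.minShare = minShare;
            this.demand = demand;
            this.weight = weight;
        }
    }

    /** Demand only caps the min share: an entity never "needs" more than it asks for. */
    static long effectiveMinShare(Sched s) {
        return Math.min(s.minShare, s.demand);
    }

    /** An entity is needy while its usage is below its effective min share. */
    static boolean isNeedy(Sched s) {
        return s.usage < effectiveMinShare(s);
    }

    /** Needy entities first; then by usage/minShare (both needy) or usage/weight (neither). */
    static final Comparator<Sched> FAIR_SHARE = (a, b) -> {
        boolean needyA = isNeedy(a);
        boolean needyB = isNeedy(b);
        if (needyA && !needyB) {
            return -1;
        }
        if (needyB && !needyA) {
            return 1;
        }
        int c;
        if (needyA) {
            // Both below min share: the one further below (smaller ratio) goes first.
            double ra = (double) a.usage / Math.max(effectiveMinShare(a), 1);
            double rb = (double) b.usage / Math.max(effectiveMinShare(b), 1);
            c = Double.compare(ra, rb);
        } else {
            // Neither is needy: demand is not consulted at all on this path.
            c = Double.compare(a.usage / a.weight, b.usage / b.weight);
        }
        return c != 0 ? c : a.name.compareTo(b.name);
    };

    public static void main(String[] args) {
        Sched needy = new Sched("a", 100, 1024, 4096, 1.0);
        Sched satisfied = new Sched("b", 2048, 1024, 4096, 1.0);
        System.out.println("needy vs satisfied: " + FAIR_SHARE.compare(needy, satisfied));
    }
}
```

In this sketch, raising the demand of an entity that is already above its min share leaves every
comparison result unchanged, which is why stale demand values rarely affect the order.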

The community is working on global scheduling, which should mitigate this performance issue
dramatically, though in a different way.

BTW, your second point is legit. YARN-4090 has been quiet for a while; you are welcome to
contribute if you want.





> FairScheduler: FSLeafQueue.fetchAppsWithDemand CPU usage is high with big queues
> --------------------------------------------------------------------------------
>
>                 Key: YARN-6361
>                 URL: https://issues.apache.org/jira/browse/YARN-6361
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Miklos Szegedi
>            Assignee: YunFan Zhou
>         Attachments: dispatcherthread.png, threads.png, YARN-6361.001.patch
>
>
> FSLeafQueue.fetchAppsWithDemand sorts the applications by the current policy. Most of
the time is spent in FairShareComparator.compare. We could improve this by doing the calculations
once per application outside the sort loop ({{O(n)}}) and sorting by a fixed number inside,
instead of repeating them in each of the {{O(n*log(n))}} comparisons.
This could be a performance issue when there is a huge number of applications in a single
queue. The attachments show the performance impact when there are 10k applications in one
queue.
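
The idea in the description can be sketched as follows. This is a minimal illustration and not
the attached patch: the class {{PrecomputedSortSketch}}, the helper {{sortByPrecomputedKey}}, and
the assumption that the ordering can be reduced to a single numeric key are all hypothetical.
The real fair-share ordering also involves a needy flag and tie-breaks, so an actual fix would
need a richer key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Sketch: compute an expensive sort key once per item, then sort on the cached key. */
public class PrecomputedSortSketch {

    /** Pairs an item with a key that was computed exactly once. */
    static final class Keyed<T> {
        final T item;
        final double key;

        Keyed(T item, double key) {
            this.item = item;
            this.key = key;
        }
    }

    /** Returns a new list sorted by expensiveKey, calling expensiveKey once per item. */
    static <T> List<T> sortByPrecomputedKey(List<T> items, ToDoubleFunction<T> expensiveKey) {
        // O(n) key computations, done outside the sort.
        List<Keyed<T>> keyed = new ArrayList<>(items.size());
        for (T item : items) {
            keyed.add(new Keyed<>(item, expensiveKey.applyAsDouble(item)));
        }
        // O(n log n) comparisons, each one only a cheap comparison of two doubles.
        keyed.sort((a, b) -> Double.compare(a.key, b.key));

        List<T> sorted = new ArrayList<>(keyed.size());
        for (Keyed<T> k : keyed) {
            sorted.add(k.item);
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical per-application usage values standing in for real schedulables.
        List<Double> usages = Arrays.asList(512.0, 128.0, 2048.0);
        System.out.println(sortByPrecomputedKey(usages, u -> u / 2.0));
    }
}
```

The cached keys are a snapshot: if usage changes while the sort is running, the order reflects
the values at the time the keys were computed.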



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

