hadoop-yarn-issues mailing list archives

From "Tao Yang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7005) Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
Date Fri, 05 Jan 2018 12:28:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-7005:
---------------------------
    Attachment: YARN-7005.003.patch

Attaching v3 patch. [~leftnoteasy], [~sunilg], please help review it when you have time. Thanks.
Updates:
* Maintain demand queues for every parent queue to improve scheduling performance:
(1) Update the demand queues of parent queues (add only) when an app request is updated in CapacityScheduler#allocate.
(2) Refresh the scheduling queues cache and remove non-pending demand queues when the demand queues have changed (that is, when the size of the scheduling queues cache differs from the size of the demand queues) in PriorityUtilizationQueueOrderingPolicy#getAssignmentIterator.

* Use getAllPending to filter scheduling queues, because nodes in a non-exclusive partition can allocate resources for requests of the default partition.
* Fix the failing test cases.
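
To make update (1) and (2) above easier to follow, here is a minimal, self-contained sketch of the bookkeeping idea. It is not the code in the patch: ChildQueue, ParentQueueDemand, onRequestUpdated and getAssignmentCandidates are hypothetical names, the pending field only stands in for what getAllPending would report, and ordering by used capacity alone is a simplification of the real ordering policy.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DemandQueueSketch {
  /** Hypothetical stand-in for a child queue; not the real CSQueue API. */
  static final class ChildQueue {
    final String name;
    final double usedCapacity;
    long pending; // stand-in for the value getAllPending would report

    ChildQueue(String name, double usedCapacity, long pending) {
      this.name = name;
      this.usedCapacity = usedCapacity;
      this.pending = pending;
    }
  }

  /** Hypothetical per-parent state: the demand set plus a cache of queues to schedule. */
  static final class ParentQueueDemand {
    private final Set<ChildQueue> demandQueues = new LinkedHashSet<>();
    private List<ChildQueue> schedulingQueues = new ArrayList<>();

    // Step (1): called when an app request is updated; this path only ever adds.
    void onRequestUpdated(ChildQueue queue) {
      demandQueues.add(queue);
    }

    // Step (2): a size mismatch signals that the demand set changed, so drop
    // queues without pending resource and rebuild the cache. Otherwise reuse it.
    List<ChildQueue> getAssignmentCandidates() {
      if (schedulingQueues.size() != demandQueues.size()) {
        demandQueues.removeIf(q -> q.pending <= 0);
        schedulingQueues = new ArrayList<>(demandQueues);
      }
      // Only queues with demand are sorted (assumption: least used first).
      List<ChildQueue> sorted = new ArrayList<>(schedulingQueues);
      sorted.sort(Comparator.comparingDouble(q -> q.usedCapacity));
      return sorted;
    }
  }

  public static void main(String[] args) {
    ParentQueueDemand parent = new ParentQueueDemand();
    parent.onRequestUpdated(new ChildQueue("a", 0.8, 5));
    parent.onRequestUpdated(new ChildQueue("b", 0.1, 0)); // no pending resource
    parent.onRequestUpdated(new ChildQueue("c", 0.3, 3));
    for (ChildQueue q : parent.getAssignmentCandidates()) {
      System.out.println(q.name + " used=" + q.usedCapacity);
    }
  }
}
```

The point of the sketch is that the sort runs over the demand set only, so its cost depends on the number of queues with pending requests rather than on the total number of child queues.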

With this improvement, the scheduling cost no longer grows linearly with the number of queues. Performance improves by about 110% for 500 queues, 230% for 1000 queues and over 1000% for 5000 queues.
Testing result:
{noformat}
Before:
#QueueSize = 5000, testing times : 1000, total cost : 7353788602 ns, average cost : 7353788.5 ns.
#QueueSize = 5000, testing times : 1000, total cost : 7677551118 ns, average cost : 7677551.0 ns.
#QueueSize = 1000, testing times : 1000, total cost : 1873387351 ns, average cost : 1873387.4 ns.
#QueueSize = 1000, testing times : 1000, total cost : 1858447758 ns, average cost : 1858447.8 ns.
#QueueSize = 500, testing times : 1000, total cost : 1165215528 ns, average cost : 1165215.5 ns.
#QueueSize = 500, testing times : 1000, total cost : 1188830091 ns, average cost : 1188830.1 ns.
#QueueSize = 100, testing times : 1000, total cost : 591136755 ns, average cost : 591136.75 ns.
#QueueSize = 100, testing times : 1000, total cost : 582527533 ns, average cost : 582527.56 ns.
After:
#QueueSize = 5000, testing times : 1000, total cost time : 631647431 ns, average cost time : 631647.44 ns.
#QueueSize = 1000, testing times : 1000, total cost time : 548629986 ns, average cost time : 548630.0 ns.
#QueueSize = 500, testing times : 1000, total cost time : 565621632 ns, average cost time : 565621.6 ns.
#QueueSize = 100, testing times : 1000, total cost time : 497367467 ns, average cost time : 497367.47 ns.
{noformat}
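
For readers who want to check the percentages against the numbers above, the following sketch recomputes them from the averages. The formula (before / after - 1) * 100 is an assumption about how "performance enhancement" was measured, since the message does not state it, and averaging the two "Before" runs is likewise a choice made here. SpeedupCheck and improvementPercent are hypothetical names.

```java
final class SpeedupCheck {
  /** Percent improvement under the assumed definition (before / after - 1) * 100. */
  static double improvementPercent(double beforeNs, double afterNs) {
    return (beforeNs / afterNs - 1.0) * 100.0;
  }

  public static void main(String[] args) {
    int[] queueSizes = {5000, 1000, 500, 100};
    // Mean of the two "Before" average costs per queue size, in ns.
    double[] beforeNs = {
      (7353788.5 + 7677551.0) / 2,
      (1873387.4 + 1858447.8) / 2,
      (1165215.5 + 1188830.1) / 2,
      (591136.75 + 582527.56) / 2
    };
    // "After" average costs per queue size, in ns.
    double[] afterNs = {631647.44, 548630.0, 565621.6, 497367.47};

    for (int i = 0; i < queueSizes.length; i++) {
      System.out.printf("queues=%d improvement=%.1f%%%n",
          queueSizes[i], improvementPercent(beforeNs[i], afterNs[i]));
    }
  }
}
```

Under these assumptions the results land close to the figures quoted in the message. The "After" averages also stay within a narrow band across queue sizes, which is what the claim about non-linear growth refers to.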

> Skip unnecessary sorting and iterating process for child queues without pending resource to optimize schedule performance
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7005
>                 URL: https://issues.apache.org/jira/browse/YARN-7005
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.9.0, 3.0.0-alpha4
>            Reporter: Tao Yang
>         Attachments: YARN-7005.001.patch, YARN-7005.002.patch, YARN-7005.003.patch
>
>
> Currently, even if there is only one pending app in a queue, the scheduling process goes through all queues anyway and spends most of its time sorting and iterating over child queues in ParentQueue#assignContainersToChildQueues.
> IIUC, queues that have no pending resource can be skipped during sorting and iteration to reduce the time cost, which matters most for a cluster with many queues. Please feel free to correct me if I am missing something. Thanks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

