hadoop-yarn-issues mailing list archives

From "zhengchenyu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-5846) Improve the fairscheduler attemptScheduler
Date Wed, 09 Nov 2016 09:30:58 GMT

     [ https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhengchenyu updated YARN-5846:
------------------------------
    Priority: Critical  (was: Minor)

> Improve the fairscheduler attemptScheduler 
> -------------------------------------------
>
>                 Key: YARN-5846
>                 URL: https://issues.apache.org/jira/browse/YARN-5846
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS-7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>              Labels: fairscheduler
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When assigning a container, the scheduler considers two factors:
>     (1) It sorts the queues and applications, then selects the appropriate request.
>     (2) It then checks that the request's host is this node (data locality); if not, it
>         skips this iteration.
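
The two steps above can be modeled roughly as follows. This is only a minimal sketch of the behavior as the description characterizes it, not FairScheduler code: the `Request` class, its `rank` field, and the `assign` method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortFirstSketch {
    /** Hypothetical stand-in for a pending container request. */
    static final class Request {
        final String appId;
        final int rank;     // assumed ordering key: lower rank is served first
        final String host;  // host that holds the request's data

        Request(String appId, int rank, String host) {
            this.appId = appId;
            this.rank = rank;
            this.host = host;
        }
    }

    /** Returns the request placed on nodeHost, or null when this heartbeat is skipped. */
    static Request assign(List<Request> pending, String nodeHost) {
        if (pending.isEmpty()) {
            return null;
        }
        // Step (1): sort everything on every heartbeat, O(n log n) each time.
        List<Request> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingInt((Request r) -> r.rank));
        Request head = sorted.get(0);
        // Step (2): the sort order wins; a locality mismatch wastes the heartbeat.
        return head.host.equals(nodeHost) ? head : null;
    }

    public static void main(String[] args) {
        List<Request> pending = Arrays.asList(
            new Request("app-1", 0, "host-a"),
            new Request("app-2", 1, "host-b"));
        System.out.println(assign(pending, "host-a") != null); // true
        System.out.println(assign(pending, "host-b") != null); // false: head wants host-a
    }
}
```

In this model the heartbeat from host-b is wasted even though app-2 has a request for that host, because only the head of the sorted order is checked.
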
> This algorithm treats the sort order of queues and applications as the primary factor. When
> YARN must honor data locality, for example with yarn.scheduler.fair.locality.threshold.node=1
> and yarn.scheduler.fair.locality.threshold.rack=1 (or with very large values for
> yarn.scheduler.fair.locality-delay-rack-ms and yarn.scheduler.fair.locality-delay-node-ms),
> and many applications are running, container assignment becomes very slow.
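
To see why a threshold of 1 is costly, here is a simplified model of threshold-based locality delay. The thresholds are documented as fractions of the cluster size, giving the number of scheduling opportunities to pass up before locality is relaxed, with negative values meaning no delay. The class, the method, and the cumulative counting below are assumptions made for illustration; the real FairScheduler bookkeeping differs in detail.

```java
final class LocalityDelaySketch {
    enum Level { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    /**
     * Most relaxed locality level a request may accept after it has passed up
     * `missed` scheduling opportunities on a cluster of `clusterNodes` nodes.
     */
    static Level allowedLevel(int missed, int clusterNodes,
                              double nodeThreshold, double rackThreshold) {
        // A negative threshold means "do not wait for locality at all".
        if (nodeThreshold < 0 || rackThreshold < 0) {
            return Level.OFF_SWITCH;
        }
        double nodeLimit = nodeThreshold * clusterNodes;
        if (missed <= nodeLimit) {
            return Level.NODE_LOCAL;
        }
        double rackLimit = nodeLimit + rackThreshold * clusterNodes;
        if (missed <= rackLimit) {
            return Level.RACK_LOCAL;
        }
        return Level.OFF_SWITCH;
    }

    public static void main(String[] args) {
        // With both thresholds at 1.0 on a 1000-node cluster, a request insists on
        // its preferred node for 1000 opportunities, then on its rack for 1000 more.
        System.out.println(allowedLevel(500, 1000, 1.0, 1.0));   // NODE_LOCAL
        System.out.println(allowedLevel(1500, 1000, 1.0, 1.0));  // RACK_LOCAL
        System.out.println(allowedLevel(2001, 1000, 1.0, 1.0));  // OFF_SWITCH
        System.out.println(allowedLevel(0, 1000, -1.0, -1.0));   // OFF_SWITCH
    }
}
```
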
> I think data locality is more important than the order of the queues and applications.

> I would like a new algorithm like this:
> 	(1) When the ResourceManager accepts a new request, it notifies the RMNodeImpl, which
> 	    records the association between the RMNode and the request.
> 	(2) When assigning containers for a node, the scheduler assigns them directly from the
> 	    RMNodeImpl's associations between the RMNode and its requests.
> 	(3) Queue and application priority is considered next: within each RMNodeImpl object,
> 	    the associated requests are sorted.
> 	(4) Sorting in the current algorithm is expensive: when many applications are running,
> 	    many sorts are performed. I think the queues and applications should be sorted in a
> 	    daemon thread, because a small error in the order of the queues is acceptable.
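
One way the four proposed steps might fit together is sketched below. It is an illustration under stated assumptions, not a patch against RMNodeImpl: the class, its methods, and the use of per-application usage as the ordering key are all hypothetical. Requests are indexed by host on arrival, a heartbeat looks only at that host's requests, and the application order comes from a snapshot that a daemon thread rebuilds periodically, so it may be slightly stale.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class NodeRequestIndexSketch {
    /** Hypothetical stand-in for a pending container request. */
    static final class Request {
        final String appId;
        final String host;
        Request(String appId, String host) { this.appId = appId; this.host = host; }
    }

    // Step (1): host -> requests that prefer that host.
    private final Map<String, List<Request>> byHost = new ConcurrentHashMap<>();
    // Live per-application usage (assumed ordering key: lower usage is served first).
    private final Map<String, Integer> usage = new ConcurrentHashMap<>();
    // Step (4): application -> rank, rebuilt in the background; may be slightly stale.
    private volatile Map<String, Integer> rankSnapshot = Collections.emptyMap();

    void setUsage(String appId, int used) { usage.put(appId, used); }

    /** Step (1): record the association when the request arrives. */
    void onRequest(Request r) {
        byHost.computeIfAbsent(r.host,
            h -> Collections.synchronizedList(new ArrayList<Request>())).add(r);
    }

    /** Step (4): recompute application ranks from a copy of the usage map. */
    void resort() {
        final Map<String, Integer> copy = new HashMap<>(usage);
        List<String> apps = new ArrayList<>(copy.keySet());
        apps.sort(Comparator.comparingInt((String a) -> copy.get(a)));
        Map<String, Integer> ranks = new HashMap<>();
        for (int i = 0; i < apps.size(); i++) {
            ranks.put(apps.get(i), i);
        }
        rankSnapshot = ranks;
    }

    /** Steps (2)-(3): pick the best-ranked request among those that want this host. */
    Request assign(String nodeHost) {
        List<Request> candidates = byHost.get(nodeHost);
        if (candidates == null) {
            return null;
        }
        final Map<String, Integer> ranks = rankSnapshot;
        synchronized (candidates) {
            if (candidates.isEmpty()) {
                return null;
            }
            Request best = Collections.min(candidates, Comparator.comparingInt(
                (Request r) -> ranks.getOrDefault(r.appId, Integer.MAX_VALUE)));
            candidates.remove(best);
            return best;
        }
    }

    /** Runs resort() periodically on a daemon thread. */
    ScheduledExecutorService startBackgroundSort(long periodMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "queue-sorter");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleWithFixedDelay(this::resort, 0, periodMs, TimeUnit.MILLISECONDS);
        return ses;
    }

    public static void main(String[] args) {
        NodeRequestIndexSketch idx = new NodeRequestIndexSketch();
        idx.setUsage("app-1", 50);
        idx.setUsage("app-2", 10);
        idx.onRequest(new Request("app-1", "host-a"));
        idx.onRequest(new Request("app-2", "host-a"));
        idx.onRequest(new Request("app-1", "host-b"));
        idx.resort(); // called directly here to keep the demo deterministic
        System.out.println(idx.assign("host-a").appId); // app-2
        System.out.println(idx.assign("host-b").appId); // app-1
        System.out.println(idx.assign("host-c"));       // null
    }
}
```

The trade-off in this sketch is that a heartbeat scans only the requests tied to its own host and never sorts the full application list, at the cost of ordering decisions that can lag behind the live usage by up to one sort period.
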



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

