hadoop-yarn-dev mailing list archives

From "zhengchenyu (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-5846) Improve the fairscheduler attemptScheduler
Date Mon, 07 Nov 2016 13:30:58 GMT
zhengchenyu created YARN-5846:

             Summary: Improve the fairscheduler attemptScheduler 
                 Key: YARN-5846
                 URL: https://issues.apache.org/jira/browse/YARN-5846
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: fairscheduler
    Affects Versions: 2.7.1
         Environment: CentOS-7.1
            Reporter: zhengchenyu
            Priority: Minor
             Fix For: 2.7.1

When assigning a container, the scheduler considers two factors:
    (1) It sorts the queues and applications and selects the best request.
    (2) It then checks that the request's host is the current node (data locality); otherwise it skips the request.
This algorithm treats the ordering of queues and applications as the primary factor. When YARN enforces
data locality strictly, for example with yarn.scheduler.fair.locality.threshold.node=1 and yarn.scheduler.fair.locality.threshold.rack=1
(or with very large values for yarn.scheduler.fair.locality-delay-rack-ms and yarn.scheduler.fair.locality-delay-node-ms),
and many applications are running, container assignment becomes very slow.
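
The slowdown described above can be illustrated with a minimal sketch. This is not the FairScheduler code: the class SortFirstSketch, the Request type, its fields, and the "lower priority value is served first" rule are all hypothetical simplifications. It only shows that a sort-first, locality-second loop may examine many requests before finding one that matches the node.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of "sort first, check locality second".
class SortFirstSketch {
    // Stand-in for a pending container request (not a YARN class).
    static final class Request {
        final String app;
        final int priority;      // assumption: lower value is served first
        final String wantedHost; // host that holds the request's data

        Request(String app, int priority, String wantedHost) {
            this.app = app;
            this.priority = priority;
            this.wantedHost = wantedHost;
        }
    }

    // Returns the request assigned to `node`, or null if all were skipped.
    // examined[0] is incremented once per request that was looked at.
    static Request assign(List<Request> pending, String node, int[] examined) {
        // Step (1): order all pending requests, regardless of the node.
        List<Request> sorted = new ArrayList<>(pending);
        sorted.sort(Comparator.comparingInt((Request r) -> r.priority));
        // Step (2): walk the ordered list and skip non-local requests.
        for (Request r : sorted) {
            examined[0]++;
            if (r.wantedHost.equals(node)) {
                return r;
            }
        }
        return null;
    }
}
```

With strict locality, every node heartbeat pays for the sort plus a scan over requests that belong to other hosts, which is the cost this issue is about.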
I think data locality is more important than the order of the queues and applications.
I would like a new algorithm like this:
	(1) When the ResourceManager accepts a new request, it notifies the RMNodeImpl, which records the
association between the RMNode and the request.
	(2) When assigning containers for a node, we assign them directly from the RMNodeImpl's recorded
associations between the RMNode and its requests.
	(3) Only then do we consider the priority of queues and applications: within each RMNodeImpl object, we
sort the associated requests.
	(4) Sorting in the current algorithm is also expensive. In particular, when many applications
are running, the sort is invoked many times. I think we should sort the queues and applications
in a daemon thread, because a small error in the queue order is acceptable.
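
Steps (1) to (3) of the proposal could be sketched roughly as follows. This is only an illustration under assumptions: NodeIndexSketch, Request, addRequest, and assign are hypothetical names, the index is kept in a plain map rather than inside RMNodeImpl, and a lower priority value is assumed to be served first. Step (4), the background sorting thread, is not covered.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of a per-node request index (not YARN code).
class NodeIndexSketch {
    // Stand-in for a pending container request (not a YARN class).
    static final class Request {
        final String app;
        final int priority;      // assumption: lower value is served first
        final String wantedHost; // host that holds the request's data

        Request(String app, int priority, String wantedHost) {
            this.app = app;
            this.priority = priority;
            this.wantedHost = wantedHost;
        }
    }

    // host name -> requests that want that host, ordered by priority
    private final Map<String, PriorityQueue<Request>> byHost = new HashMap<>();

    // Step (1): record the node/request association when a request arrives.
    void addRequest(Request r) {
        byHost.computeIfAbsent(
                r.wantedHost,
                h -> new PriorityQueue<>(
                        Comparator.comparingInt((Request x) -> x.priority)))
              .add(r);
    }

    // Steps (2) and (3): on a node heartbeat, take the highest-priority
    // request recorded for that node, or null if there is none.
    Request assign(String node) {
        PriorityQueue<Request> queue = byHost.get(node);
        return (queue == null) ? null : queue.poll();
    }
}
```

In this sketch a heartbeat only touches requests that already target the node, so no request for another host is examined or skipped. The trade-off is that queue and application ordering is applied only among requests for the same host.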

This message was sent by Atlassian JIRA
