Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 8 Nov 2016 06:10:58 +0000 (UTC)
From: "zhengchenyu (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13018829.1478525402000.218358.1478585458558@Atlassian.JIRA>
In-Reply-To: <JIRA.13018829.1478525402000@Atlassian.JIRA>
References: <JIRA.13018829.1478525402000@Atlassian.JIRA> <JIRA.13018829.1478525402171@arcas>
Subject: [jira] [Commented] (YARN-5846) Improve the fairscheduler
 attemptScheduler
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Tue, 08 Nov 2016 06:11:00 -0000


    [ https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15646635#comment-15646635 ]

zhengchenyu commented on YARN-5846:
-----------------------------------

Yes, I think my suggestion may amount to a new scheduler. YARN-5139 is indeed a good idea; I will follow that issue. Thank you for your recommendation!
As for this problem, I think a daemon thread could update the shares and maintain the order of the queues and applications. On each node, the requests would then be ordered by this sequence. But I don't know which model is best.
For example:
(1) Each node has one red-black (RB) tree of requests. A daemon thread updates the order of the queues and applications, which in turn updates the order in the tree. (This idea derives from the Linux kernel's fair scheduler: a node corresponds to a CPU and a request to a task.) The leftmost tree node is then the next request to be assigned.
(2) A global daemon thread updates every queue and application and calculates their shares. The share of each request on a node is multiplied by its priority, and then all the requests are sorted. Containers are assigned in this order.
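
Model (1) above could be sketched roughly as follows. This is only an illustration in Python, not FairScheduler code: the names Request, NodeRequests, key_of and start_updater are hypothetical, a sorted list stands in for the red-black tree (index 0 plays the role of the leftmost tree node), and "smaller key is served first" is an assumption.

```python
import bisect
import threading
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    # Only sort_key takes part in ordering; a smaller key is served sooner
    # (an assumption made for this sketch).
    sort_key: float
    app_id: str = field(compare=False)
    queue: str = field(compare=False)


class NodeRequests:
    """Ordered pending requests of one node (hypothetical helper).

    A sorted list stands in for the red-black tree.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._sorted = []

    def add(self, request):
        with self._lock:
            bisect.insort(self._sorted, request)

    def reorder(self, key_of):
        """Recompute every key with key_of(request) and re-sort."""
        with self._lock:
            for request in self._sorted:
                request.sort_key = key_of(request)
            self._sorted.sort()

    def pop_next(self):
        """Return the 'leftmost' request, or None if nothing is pending."""
        with self._lock:
            return self._sorted.pop(0) if self._sorted else None


def start_updater(nodes, key_of, interval_s, stop):
    """Re-sort all nodes every interval_s seconds in a daemon thread."""

    def loop():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_s):
            for node in nodes:
                node.reorder(key_of)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The assignment path only calls pop_next(), so its cost does not depend on how many queues and applications exist; the price is that the order may be slightly stale between two runs of the updater.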

> Improve the fairscheduler attemptScheduler
> -------------------------------------------
>
>                 Key: YARN-5846
>                 URL: https://issues.apache.org/jira/browse/YARN-5846
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS-7.1
>            Reporter: zhengchenyu
>            Priority: Minor
>              Labels: fairscheduler
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When assigning a container, we must consider two factors:
>     (1) Sort the queues and applications, and select the proper request.
>     (2) Then make sure this request's host is this node (data locality); otherwise skip this loop!
> This algorithm regards the sorting of queues and applications as the primary factor. When YARN considers data locality, for example with yarn.scheduler.fair.locality.threshold.node=1 and yarn.scheduler.fair.locality.threshold.rack=1 (or with very large yarn.scheduler.fair.locality-delay-rack-ms and yarn.scheduler.fair.locality-delay-node-ms), and lots of applications are running, the process of assigning containers becomes very slow.
> I think data locality is more important than the order of the queues and applications.
> I would like a new algorithm like this:
>     (1) When the ResourceManager accepts a new request, it notifies the RMNodeImpl, which records the association between the RMNode and the request.
>     (2) When assigning containers for a node, we assign them directly from the RMNodeImpl's association between RMNode and request.
>     (3) Then consider the priority of the queue and application: within one RMNodeImpl object, we sort the associated requests.
>     (4) I also think the sorting in the current algorithm is expensive, especially when lots of applications are running and sorting is invoked many times. So I think we should sort the queues and applications in a daemon thread, because a small error in the order of the queues is acceptable.
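
Steps (1) to (3) of the proposal quoted above could look roughly like this. It is a minimal sketch in Python, not Hadoop code: NodeLocalIndex, on_new_request and assign_for_node are hypothetical names, requests are reduced to an id plus a list of preferred hosts, and "a lower number means a higher priority" is an assumption.

```python
from collections import defaultdict


class NodeLocalIndex:
    """Maps a host name to the pending requests that prefer that host."""

    def __init__(self):
        self._by_host = defaultdict(list)  # host -> [(priority, request_id)]
        self._done = set()                 # request ids already assigned

    def on_new_request(self, request_id, preferred_hosts, priority):
        # Step (1): record the association between each node and the request.
        for host in preferred_hosts:
            self._by_host[host].append((priority, request_id))

    def assign_for_node(self, host, free_slots):
        # Step (2): look only at the requests recorded for this node,
        # skipping any that were already placed on another preferred host.
        pending = [p for p in self._by_host.get(host, [])
                   if p[1] not in self._done]
        # Step (3): order this node's requests by priority
        # (lower number first, an assumption of this sketch).
        pending.sort()
        chosen = [request_id for _, request_id in pending[:free_slots]]
        self._done.update(chosen)
        self._by_host[host] = pending[free_slots:]
        return chosen
```

For example, after on_new_request("r1", ["host-a"], priority=1), a heartbeat from host-a handled by assign_for_node("host-a", 1) picks r1 without scanning any other queue or application.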


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org