Date: Wed, 29 Mar 2017 06:53:41 +0000 (UTC)
From: "zhengchenyu (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13059940.1490770134000.151481.1490770421709@Atlassian.JIRA>
In-Reply-To: <JIRA.13059940.1490770134000@Atlassian.JIRA>
References: <JIRA.13059940.1490770134000@Atlassian.JIRA> <JIRA.13059940.1490770134860@jira-lw-us.apache.org>
Subject: [jira] [Comment Edited] (YARN-6407) Improve and fix locks of RM scheduler
archived-at: Wed, 29 Mar 2017 06:53:46 -0000


    [ https://issues.apache.org/jira/browse/YARN-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946630#comment-15946630 ]

zhengchenyu edited comment on YARN-6407 at 3/29/17 6:53 AM:
------------------------------------------------------------

[~vinodkv]
Can you give me some advice? Thanks!


was (Author: zhengchenyu):
[~vinodkv]

> Improve and fix locks of RM scheduler
> -------------------------------------
>
>                 Key: YARN-6407
>                 URL: https://issues.apache.org/jira/browse/YARN-6407
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>         Environment: CentOS 7, 1 Gigabit Ethernet
>            Reporter: zhengchenyu
>             Fix For: 2.7.1
>
>   Original Estimate: 2m
>  Remaining Estimate: 2m
>
> First, this issue does not duplicate YARN-3091.
> Our cluster has 5k nodes, and each server is configured with 1 Gigabit Ethernet, so the network is the bottleneck in our cluster.
> We must distcp data from the warehouse. Because of the 1 Gigabit Ethernet, we must set yarn.scheduler.fair.max.assign to 5; otherwise it leads to hotspots.
> Setting max.assign to 5 reduces the scheduler's assignment capacity, so we started the ContinuousSchedulingThread.
> As more applications run in our cluster, and with the ContinuousSchedulingThread enabled, the lock contention problem becomes more serious.
> In our cluster, the RPC call queue of ApplicationMasterService is occasionally high. We are worried that more problems will occur in the future as more applications run.
> Here is our chain of reasoning:
> "1 Gigabit Ethernet" and "data hot spot" ==> "set yarn.scheduler.fair.max.assign to 5" ==> "ContinuousSchedulingThread is started" and "more applications" ==> "lock contention"
> I know YARN-3091 addresses this problem, but that patch aims to change the object lock to a read-write lock. That change is still coarse-grained. So I think we should lock the individual resources rather than lock large sections of code.
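
The contrast drawn in the last paragraph of the quoted description, between one coarse lock over the whole scheduler and locks scoped to individual resources, can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the FairScheduler code and not the YARN-3091 patch: the class names LockGranularitySketch and Node, their methods, and the memory figures are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; none of these types exist in Hadoop YARN.
class LockGranularitySketch {

    // Stand-in for a schedulable node that guards only its own state.
    static final class Node {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private int availableMb;

        Node(int availableMb) {
            this.availableMb = availableMb;
        }

        // Write lock: held only while this node's counter is updated.
        boolean tryAllocate(int mb) {
            lock.writeLock().lock();
            try {
                if (availableMb < mb) {
                    return false;
                }
                availableMb -= mb;
                return true;
            } finally {
                lock.writeLock().unlock();
            }
        }

        // Read lock: concurrent readers of the same node do not block each other.
        int available() {
            lock.readLock().lock();
            try {
                return availableMb;
            } finally {
                lock.readLock().unlock();
            }
        }
    }

    // The node table is a concurrent map, so lookups need no scheduler-wide lock.
    private final ConcurrentHashMap<String, Node> nodes = new ConcurrentHashMap<>();

    void addNode(String nodeId, int availableMb) {
        nodes.put(nodeId, new Node(availableMb));
    }

    // Coarse-grained: every caller serializes on the scheduler object,
    // even when the callers touch different nodes.
    synchronized boolean allocateCoarse(String nodeId, int mb) {
        Node node = nodes.get(nodeId);
        return node != null && node.tryAllocate(mb);
    }

    // Fine-grained: only callers that touch the same node contend.
    boolean allocateFine(String nodeId, int mb) {
        Node node = nodes.get(nodeId);
        return node != null && node.tryAllocate(mb);
    }

    // Returns -1 for an unknown node.
    int available(String nodeId) {
        Node node = nodes.get(nodeId);
        return node == null ? -1 : node.available();
    }
}
```

In this sketch, two threads calling allocateFine on different nodes never wait on each other, whereas allocateCoarse forces them through a single monitor. Whether the real scheduler's invariants allow locking at this granularity is exactly the open question the issue raises.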

