Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 5 Dec 2016 05:06:58 +0000 (UTC)
From: "zhengchenyu (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.13023311.1480056951000.435751.1480914418530@Atlassian.JIRA>
In-Reply-To: <JIRA.13023311.1480056951000@Atlassian.JIRA>
References: <JIRA.13023311.1480056951000@Atlassian.JIRA> <JIRA.13023311.1480056951584@arcas>
Subject: [jira] [Commented] (YARN-5936) when cpu strict mode is closed, yarn
 couldn't assure scheduling fairness between containers
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Mon, 05 Dec 2016 05:07:00 -0000


    [ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721247#comment-15721247 ]

zhengchenyu commented on YARN-5936:
-----------------------------------

I think both factors affect performance, but I can't tell which one is the major cause.
First, the Linux kernel's CPU bandwidth control code adds many timers and extra function calls.
Second, limiting the utilization ratio itself leads to bad performance.
Turning off the CPU bandwidth limit is unavoidable. Here I only want an idea for keeping scheduling fair when only cpu.shares is used.
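
To illustrate why equal cpu.shares alone do not equalize CPU time, here is a minimal sketch of an idealized work-conserving proportional-share model. It is not the kernel's CFS code. The class name CpuShareModel, the 32-core node, and the mapping of one vcore to a one-core quota in the strict case are all assumptions made for illustration.

```java
import java.util.Arrays;

/** Idealized work-conserving proportional sharing; a model only, not the kernel's CFS. */
class CpuShareModel {

    /**
     * Splits `cores` CPUs among groups in proportion to their weights. A group never
     * gets more than it demands; capacity it cannot use is re-split among the rest.
     */
    static double[] allocate(double cores, double[] weights, double[] demands) {
        int n = weights.length;
        double[] alloc = new double[n];
        boolean[] done = new boolean[n];
        double remaining = cores;
        while (remaining > 0) {
            double activeWeight = 0;
            for (int i = 0; i < n; i++) {
                if (!done[i]) activeWeight += weights[i];
            }
            if (activeWeight == 0) break; // every group already has all it can use
            boolean anyCapped = false;
            double used = 0;
            for (int i = 0; i < n; i++) {
                if (done[i]) continue;
                double slice = remaining * weights[i] / activeWeight;
                if (demands[i] <= slice) { // group needs less than its proportional slice
                    alloc[i] = demands[i];
                    used += demands[i];
                    done[i] = true;
                    anyCapped = true;
                }
            }
            remaining -= used;
            if (!anyCapped) { // full contention: split what is left by weight
                for (int i = 0; i < n; i++) {
                    if (!done[i]) alloc[i] = remaining * weights[i] / activeWeight;
                }
                break;
            }
        }
        return alloc;
    }

    /** Models a bandwidth limit by capping each group's demand at `quotaCores`. */
    static double[] capDemands(double[] demands, double quotaCores) {
        double[] capped = new double[demands.length];
        for (int i = 0; i < demands.length; i++) {
            capped[i] = Math.min(demands[i], quotaCores);
        }
        return capped;
    }

    public static void main(String[] args) {
        double[] shares = {1024, 1024}; // equal cpu.shares for both containers
        double[] demands = {10, 1};     // 10 busy threads versus 1 busy thread
        // Hypothetical 32-core node with idle cores: shares never come into play.
        System.out.println(Arrays.toString(allocate(32, shares, demands))); // [10.0, 1.0]
        // Hypothetical 4-core node: the multi-threaded container still gets more.
        System.out.println(Arrays.toString(allocate(4, shares, demands)));  // [3.0, 1.0]
        // Single core, full contention: equal shares give an equal split.
        System.out.println(Arrays.toString(allocate(1, shares, demands)));  // [0.5, 0.5]
        // Assumed one-core quota per container, as a stand-in for strict mode.
        System.out.println(Arrays.toString(allocate(32, shares, capDemands(demands, 1)))); // [1.0, 1.0]
    }
}
```

In this model the shares only matter once the CPUs are fully contended, so on a node with idle cores the container with more runnable threads simply uses more cores.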

> when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-5936
>                 URL: https://issues.apache.org/jira/browse/YARN-5936
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>         Environment: CentOS7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When using LinuxContainerExecutor, setting "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to true can assure scheduling fairness through the cgroup CPU bandwidth limit. But in our experience the cgroup CPU bandwidth limit leads to bad performance.
>     Without the cgroup CPU bandwidth limit, the cgroup cpu.shares value is our only way to assure scheduling fairness, but it is not completely effective. For example, take two containers with the same vcores (meaning the same cpu.shares), one single-threaded and the other multi-threaded. The multi-threaded container gets more CPU time, which is unreasonable!
>     Here is my test case. I submit two distributedshell applications with the two commands below:
> {code}
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh -shell_args 10 -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh -shell_args 1 -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> {code}
>     Here is the CPU time of the two containers:
> {code}
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> 15448 yarn      20   0 9059592  28336   9180 S 998.7  0.1  24:09.30 java
> 15026 yarn      20   0 9050340  27480   9188 S 100.0  0.1   3:33.97 java
> 13767 yarn      20   0 1799816 381208  18528 S   4.6  1.2   0:30.55 java
>    77 root      rt   0       0      0      0 S   0.3  0.0   0:00.74 migration/1
> {code}
>     We find that the multi-threaded container uses about ten times the CPU of the single-threaded one (998.7% versus 100.0% in the %CPU column), even though the two containers have the same cpu.shares.
> notes:
> run.sh
> {code}
>     java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
> {code}
> loop.java
> {code}
> package loop;
> // CPU burner used by run.sh: starts args[0] busy-loop threads (default 1).
> public class loop {
>     public static void main(String[] args) {
>         // Number of busy threads to start.
>         int loop = 1;
>         if (args.length >= 1) {
>             System.out.println(args[0]);
>             loop = Integer.parseInt(args[0]);
>         }
>         for (int i = 0; i < loop; i++) {
>             System.out.println("start thread " + i);
>             new Thread(new Runnable() {
>                 @Override
>                 public void run() {
>                     // Spin forever so this thread keeps one core fully busy.
>                     int j = 0;
>                     while (true) { j++; }
>                 }
>             }).start();
>         }
>     }
> }
> {code}
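
The loop.java above never exits, so the only measurement available is what top reports. A hypothetical bounded variant could make the comparison quantitative by running for a fixed time and printing how many iterations each thread completed. The class name BoundedLoop and the 10-second duration are assumptions; this is a sketch and not part of the original test case.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical bounded variant of the loop test: spins for a fixed time, then reports work done. */
class BoundedLoop {

    /** Runs `threads` busy loops for about `millis` ms and returns the iterations each completed. */
    static long[] run(int threads, long millis) throws InterruptedException {
        final long[] counts = new long[threads];
        final AtomicBoolean stop = new AtomicBoolean(false);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    long n = 0;
                    while (!stop.get()) { n++; }
                    counts[id] = n; // made visible to the caller by Thread.join()
                }
            });
            workers[i].start();
        }
        Thread.sleep(millis);
        stop.set(true);
        for (Thread t : workers) {
            t.join();
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = args.length >= 1 ? Integer.parseInt(args[0]) : 1;
        long[] counts = run(threads, 10000);
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            System.out.println("thread " + i + " iterations: " + counts[i]);
            total += counts[i];
        }
        System.out.println("total iterations: " + total);
    }
}
```

Comparing the totals printed by the two containers gives a rough measure of how much work each one was allowed to do over the same wall-clock interval.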

