Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 20 Dec 2016 02:43:58 +0000 (UTC)
From: "zhengchenyu (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5936) when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
archived-at: Tue, 20 Dec 2016 02:44:01 -0000

    [ https://issues.apache.org/jira/browse/YARN-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15763023#comment-15763023 ]

zhengchenyu commented on YARN-5936:
-----------------------------------

But your "time" command is only related to its own program, and every
program is single-threaded. My question is this: two processes with different numbers of threads have a different ability to get scheduled, even though they have the same cpu.shares. I know the reason, but I don't have a proper suggestion for avoiding this problem.

> when cpu strict mode is closed, yarn couldn't assure scheduling fairness between containers
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-5936
>                 URL: https://issues.apache.org/jira/browse/YARN-5936
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.1
>         Environment: CentOS7.1
>            Reporter: zhengchenyu
>            Priority: Critical
>             Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> When using LinuxContainer, setting "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" to true assures scheduling fairness through the cgroup CPU bandwidth control. But in our experience, the cgroup CPU bandwidth control leads to bad performance.
> Without the cgroup CPU bandwidth control, the cgroup cpu.shares value is our only way to assure scheduling fairness, but it is not completely effective. For example, suppose there are two containers with the same vcores (meaning the same cpu.shares); one container is single-threaded, the other is multi-threaded. The multi-threaded container will get more CPU time. That is unreasonable!
> Here is my test case: I submit two distributedshell applications.
> The two commands are below:
> {code}
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh -shell_args 10 -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> hadoop jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar org.apache.hadoop.yarn.applications.distributedshell.Client -jar share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.7.1.jar -shell_script ./run.sh -shell_args 1 -num_containers 1 -container_memory 1024 -container_vcores 1 -master_memory 1024 -master_vcores 1 -priority 10
> {code}
> Here is the CPU time of the two containers:
> {code}
>   PID USER      PR  NI    VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
> 15448 yarn      20   0 9059592  28336  9180 S 998.7  0.1  24:09.30 java
> 15026 yarn      20   0 9050340  27480  9188 S 100.0  0.1   3:33.97 java
> 13767 yarn      20   0 1799816 381208 18528 S   4.6  1.2   0:30.55 java
>    77 root      rt   0       0      0     0 S   0.3  0.0   0:00.74 migration/1
> {code}
> We find that the multi-threaded container uses about ten times as much CPU as the single-threaded one (998.7 vs. 100.0 in the %CPU column), even though the two containers have the same cpu.shares.
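
One way to read the numbers above is through an idealized, work-conserving weighted fair-share model: each cgroup's cpu.shares value acts as a relative weight that only matters when the node's cores are oversubscribed, and a group can never use more cores than it has busy threads. The sketch below is neither the kernel's CFS implementation nor YARN code. The class name ShareModel, the method allocate, the weight 1024 and the core counts (32, 4, 1) are all assumptions chosen for illustration; the issue does not state how many cores the test node has.

```java
import java.util.Arrays;

/** Idealized weighted fair-share (water-filling) model; not the kernel's CFS algorithm. */
final class ShareModel {

    /**
     * @param shares relative weights, standing in for each cgroup's cpu.shares
     * @param demand cores each group could keep busy (its number of spinning threads)
     * @param cores  cores on the node (hypothetical value)
     * @return cores granted to each group
     */
    static double[] allocate(double[] shares, double[] demand, double cores) {
        int n = shares.length;
        double[] grant = new double[n];
        boolean[] done = new boolean[n];
        double remaining = cores;
        while (true) {
            double weight = 0;
            for (int i = 0; i < n; i++) {
                if (!done[i]) weight += shares[i];
            }
            if (weight == 0) break; // every group is already satisfied

            // Cap groups whose demand fits inside their weighted slice.
            double used = 0;
            boolean capped = false;
            for (int i = 0; i < n; i++) {
                if (!done[i] && demand[i] <= remaining * shares[i] / weight) {
                    grant[i] = demand[i];
                    done[i] = true;
                    used += demand[i];
                    capped = true;
                }
            }
            if (capped) {
                remaining -= used; // the leftover is re-split among the other groups
                continue;
            }
            // Nobody fits: the remaining cores are split purely by weight.
            for (int i = 0; i < n; i++) {
                if (!done[i]) grant[i] = remaining * shares[i] / weight;
            }
            break;
        }
        return grant;
    }

    public static void main(String[] args) {
        double[] shares = {1024, 1024}; // equal weights (illustrative value)
        double[] demand = {10, 1};      // 10 busy threads vs. 1 busy thread
        for (double cores : new double[] {32, 4, 1}) {
            System.out.println(cores + " cores -> "
                    + Arrays.toString(allocate(shares, demand, cores)));
        }
    }
}
```

Under this model, on the hypothetical 32-core node both groups get everything they ask for (10 and 1 cores), which has the same shape as the top sample above. With 4 cores the split is 3 and 1, and only on a single core do equal weights give an equal 0.5 / 0.5 split. In other words, equal weights bound a container's slice only under contention; they do not cap an idle node, which is what the strict-resource-usage setting is for.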
> notes:
> run.sh
> {code}
> java -cp /home/yarn/loop.jar:$CLASSPATH loop.loop $1
> {code}
> loop.java
> {code}
> package loop;
> public class loop {
> 	public static void main(String[] args) {
> 		// Number of busy threads to start; defaults to 1.
> 		int loop = 1;
> 		if(args.length>=1) {
> 			System.out.println(args[0]);
> 			loop = Integer.parseInt(args[0]);
> 		}
> 		for(int i=0;i<loop;i++){
> 			System.out.println("start thread " + i);
> 			new Thread(new Runnable() {
> 				@Override
> 				public void run() {
> 					// Spin forever so the thread keeps one core busy.
> 					int j=0;
> 					while(true){j++;}
> 				}
> 			}).start();
> 		}
> 	}
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org