Date: Fri, 21 Jul 2017 21:28:00 +0000 (UTC)
From: "Wangda Tan (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-6788) Improve performance of resource profile branch

[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096899#comment-16096899 ]

Wangda Tan commented on YARN-6788:
----------------------------------

Thanks [~sunilg],

In general this looks good. A few comments on ResourceUtils.java:
- indexForResourceInformation => resourceNameToIndex
- readOnlyResources => knownResourceTypes
- readOnlyResourcesArray => knownResourceTypesArray

*Additional items (performance related)*

Some more interesting findings: I found two places that still affect performance a lot, a 10+% performance regression in total. The numbers below come from the testUserLimitThroughput unit test added in YARN-6775:

1) BaseResource is created 38003619 times in the test (around 2000 ms in total).
2) getResourceInformation is invoked 632006616 times in the test (around 2300 ms in total; of that, the map.get() operation takes 1500 ms and unboxing takes 200 ms).
(Please note that these numbers will not show up in sampling profilers such as VisualVM; I used unit tests to measure them.)
3) Resource calculation methods such as DRC#compare (called 30000382 times, 1900 ms) and DRC#multiplyAndNormalizeDown (called 10000413 times, 1200 ms). I didn't record all Resources calculation methods, but they should share a similar code path.

I also benchmarked the original SimpleResource: both creation and getResourceInformation take only 0.1~0.2 sec. Maybe we should move these operations to a separate patch for easier review; what do you think?

*Additional items (not performance related)*

- UnitsConversionUtil: we should not do unit conversion for every calculation. I'm not sure whether we should move the unit out of ResourceInformation. (I benchmarked the converter and it is actually very cheap when the two units are the same, so I don't think we need to do this now for perf purposes. But to get a cleaner API before merging the branch, we probably should.)
- In addition, I found that putting node-resource related initialization code inside ResourceUtils is a little bit messy. Could we move this part of the code somewhere inside the NM sub-project? (Can be done separately.)
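To illustrate the map.get() and unboxing cost mentioned in item 2, here is a minimal sketch and not the actual branch code. The class ResourceLookupSketch, its methods, the index constants, and the "memory-mb" / "vcores" keys are all hypothetical names chosen for illustration. It contrasts a by-name lookup, which pays for a hash lookup and an Integer unboxing on every call, with a by-index lookup that reads the array directly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a resource object backed by an array of values.
class ResourceLookupSketch {
  // Hypothetical name -> index map, in the spirit of resourceNameToIndex.
  static final Map<String, Integer> RESOURCE_NAME_TO_INDEX = new HashMap<>();
  static final int MEMORY_INDEX = 0;
  static final int VCORES_INDEX = 1;

  static {
    RESOURCE_NAME_TO_INDEX.put("memory-mb", MEMORY_INDEX);
    RESOURCE_NAME_TO_INDEX.put("vcores", VCORES_INDEX);
  }

  private final long[] values;

  ResourceLookupSketch(long memory, long vcores) {
    this.values = new long[] {memory, vcores};
  }

  // Slower path: one hash lookup plus one Integer unboxing per call.
  long getByName(String name) {
    Integer index = RESOURCE_NAME_TO_INDEX.get(name);
    if (index == null) {
      throw new IllegalArgumentException("Unknown resource: " + name);
    }
    return values[index];
  }

  // Faster path: callers that already know the index skip the map entirely.
  long getByIndex(int index) {
    return values[index];
  }

  public static void main(String[] args) {
    ResourceLookupSketch r = new ResourceLookupSketch(100, 10);
    System.out.println(r.getByName("memory-mb"));
    System.out.println(r.getByIndex(VCORES_INDEX));
  }
}
```

With hundreds of millions of calls, hot paths such as the resource calculator could plausibly use the index-based accessor for the well-known memory and vcores slots, and keep the name-based accessor for less frequent callers.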
Tests for microbenchmark:
{code}
// Stand-in for the trunk Resource: two plain long fields, no map lookup.
private static class SimpleResource {
  private long memory;
  private long vcores;

  SimpleResource(long memory, long vcores) {
    this.memory = memory;
    this.vcores = vcores;
  }

  public int getMemory() {
    return (int) memory;
  }

  public void setMemory(int memory) {
    this.memory = memory;
  }

  public long getMemorySize() {
    return memory;
  }

  public void setMemorySize(long memory) {
    this.memory = memory;
  }

  public int getVirtualCores() {
    return (int) vcores;
  }

  public void setVirtualCores(int vcores) {
    this.vcores = vcores;
  }
}

@Test
public void testUnitConversionCost() {
  long start = System.nanoTime();
  for (int i = 0; i < 84002142; i++) {
    String u1 = "m";
    String u2 = "m";
    UnitsConversionUtil.convert(u1, u2, 1000);
  }
  long finish = System.nanoTime();
  System.out.println("Time=" + (finish - start) / 1000);
}

@Test
public void testResourceObjectAllocation() {
  long start = System.nanoTime();
  for (int i = 0; i < 38003619; i++) {
    BaseResource b = new BaseResource(100, 1);
  }
  long finish = System.nanoTime();
  System.out.println("Time for 3926=" + (finish - start) / 1000);

  start = System.nanoTime();
  for (int i = 0; i < 38003619; i++) {
    new SimpleResource(100, 1);
  }
  finish = System.nanoTime();
  System.out.println("Time for trunk=" + (finish - start) / 1000);
}

@Test
public void testRICost() throws YarnException {
  long start = System.nanoTime();
  Resource r = Resource.newInstance(100, 10);
  for (long i = 0; i < 632006616; i++) {
    r.getResourceInformation(
        ResourceInformation.MEMORY_MB.getName());
  }
  long finish = System.nanoTime();
  System.out.println("Time for 3926=" + (finish - start) / 1000);

  // Only test map operation
  start = System.nanoTime();
  for (long i = 0; i < 632006616; i++) {
    ResourceUtils.getResourceTypeIndex().get(
        ResourceInformation.MEMORY_MB.getName());
  }
  finish = System.nanoTime();
  System.out.println("Time for 3926, get from Map=" + (finish - start) / 1000);

  // Only test unboxing (Integer used as an array index)
  start = System.nanoTime();
  Integer x = 1000;
  int[] y = new int[1024];
  for (long i = 0; i < 632006616; i++) {
    y[x] = 1;
  }
  finish = System.nanoTime();
  System.out.println("Time for 3926, unboxing=" + (finish - start) / 1000);

  // Baseline: plain field access on SimpleResource
  start = System.nanoTime();
  SimpleResource sr = new SimpleResource(100, 1);
  for (long i = 0; i < 632006616; i++) {
    sr.getMemory();
  }
  finish = System.nanoTime();
  System.out.println("Time for trunk=" + (finish - start) / 1000);
}

@Test
public void testResourceCalculationCosts() {
  Resource a = Resource.newInstance(100, 10);
  Resource b = Resource.newInstance(101, 100);
  Resource cluster = Resource.newInstance(1000, 1000);
  DominantResourceCalculator drc = new DominantResourceCalculator();

  long start = System.nanoTime();
  for (int i = 0; i < 30000382; i++) {
    drc.compare(cluster, a, b);
  }
  long finish = System.nanoTime();
  System.out.println("Time for compare=" + (finish - start) / 1000);

  start = System.nanoTime();
  for (int i = 0; i < 10000413; i++) {
    drc.multiplyAndNormalizeDown(a, 1.01, b);
  }
  finish = System.nanoTime();
  System.out.println("Time for multiplyAndNormalizeDown=" + (finish - start) / 1000);
}
{code}

> Improve performance of resource profile branch
> ----------------------------------------------
>
>                 Key: YARN-6788
>                 URL: https://issues.apache.org/jira/browse/YARN-6788
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Blocker
>         Attachments: YARN-6788-YARN-3926.001.patch, YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, YARN-6788-YARN-3926.006.patch, YARN-6788-YARN-3926.007.patch, YARN-6788-YARN-3926.008.patch, YARN-6788-YARN-3926.009.patch, YARN-6788-YARN-3926.010.patch
>
>
> Currently we could see a 15% performance delta with this branch.
> Few performance improvements to improve the same.
> Also this patch will handle [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] from [~leftnoteasy].