hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6788) Improve performance of resource profile branch
Date Fri, 21 Jul 2017 21:28:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096899#comment-16096899 ]

Wangda Tan commented on YARN-6788:
----------------------------------

Thanks [~sunilg], 

In general it looks good. A few naming comments on ResourceUtils.java:
- indexForResourceInformation => resourceNameToIndex
- readOnlyResources => knownResourceTypes
- readOnlyResourcesArray => knownResourceTypesArray

*Additional items (performance related)*

Some more interesting findings: I found two places that still affect performance a lot,
a 10+% performance regression in total.
According to the testUserLimitThroughput unit test added in YARN-6775, here are some numbers:
1) BaseResource is created 38003619 times in the test (around 2000 ms in total).
2) getResourceInformation is invoked 632006616 times in the test (around 2300 ms in total; of
that, the map.get() operation takes 1500 ms and unboxing takes 200 ms).
(Please note that these numbers are not shown by sampling profilers such as VisualVM;
I used unit tests to measure them.)
3) Resource calculation methods such as DRC#compare (called 30000382 times, 1900 ms) and
DRC#multiplyAndNormalizeDown (called 10000413 times, 1200 ms). I didn't record all
Resources calculation methods, but they should share a similar code path.

I also benchmarked the original SimpleResource: both creation and getResourceInformation
take only 0.1~0.2 sec.
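
To make the map.get() and unboxing cost in item 2 concrete, here is a minimal sketch of the two access paths. The class, its fields and the resource-name strings are hypothetical and only for illustration; this is not the actual Resource or ResourceUtils code. It assumes the name-to-index map is built once, so hot paths can hold a primitive index instead of doing a hash lookup per call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual Resource/ResourceUtils classes.
class IndexedResourceSketch {
  // Built once: resource name -> position in the values array.
  private static final Map<String, Integer> NAME_TO_INDEX = new HashMap<>();
  static final int MEMORY_INDEX = 0;
  static final int VCORES_INDEX = 1;

  static {
    // Illustrative names only.
    NAME_TO_INDEX.put("memory-mb", MEMORY_INDEX);
    NAME_TO_INDEX.put("vcores", VCORES_INDEX);
  }

  private final long[] values = new long[2];

  IndexedResourceSketch(long memory, long vcores) {
    values[MEMORY_INDEX] = memory;
    values[VCORES_INDEX] = vcores;
  }

  // Slow path: a hash lookup plus Integer unboxing on every call.
  long getByName(String name) {
    Integer index = NAME_TO_INDEX.get(name);
    if (index == null) {
      throw new IllegalArgumentException("Unknown resource: " + name);
    }
    return values[index];
  }

  // Fast path: the caller already holds the primitive index.
  long getByIndex(int index) {
    return values[index];
  }
}
```

With this shape, code such as the resource calculators could resolve the index once and call getByIndex in the inner loop, which is closer to what SimpleResource does with plain fields.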

Maybe we should move these operations to a separate patch for easier review. What do you think?

*Additional items (not performance related)*

- UnitsConversionUtil: we should not do unit conversion for every calculation. I'm not sure
whether we should move the unit out of ResourceInformation. (I benchmarked the converter and
it is actually very cheap when the two units are the same, so I don't think we need to do this
now for perf purposes, but we should probably do it to get a cleaner API before merging the
branch.)
- In addition, I found that putting node-resource related initialization code inside
ResourceUtils is a little bit messy. Could we move this part of the code somewhere inside the
NM sub project? (Can be done separately.)
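
For the unit conversion point, this is a minimal sketch of why the same-unit case is cheap: an early return skips all lookups and arithmetic. The class, the unit table and the method signature are assumptions for illustration; they are not the actual UnitsConversionUtil implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual UnitsConversionUtil.
class UnitConversionSketch {
  // Illustrative unit table: unit -> power of ten relative to the base unit.
  private static final Map<String, Integer> EXPONENT = new HashMap<>();

  static {
    EXPONENT.put("", 0);
    EXPONENT.put("k", 3);
    EXPONENT.put("M", 6);
    EXPONENT.put("G", 9);
  }

  static long convert(String fromUnit, String toUnit, long value) {
    // Fast path: identical units need no lookup or arithmetic.
    if (fromUnit.equals(toUnit)) {
      return value;
    }
    Integer from = EXPONENT.get(fromUnit);
    Integer to = EXPONENT.get(toUnit);
    if (from == null || to == null) {
      throw new IllegalArgumentException("Unknown unit");
    }
    int diff = from - to;
    long factor = 1;
    for (int i = 0; i < Math.abs(diff); i++) {
      factor *= 10;
    }
    // Converting to a smaller unit multiplies; to a larger unit divides
    // (integer division, so fractions are truncated).
    return diff > 0 ? Math.multiplyExact(value, factor) : value / factor;
  }
}
```

Even with a fast path like this, every calculation still pays for the comparison, which is why carrying the unit inside each ResourceInformation is more of an API cleanliness question than a perf one.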

Tests for microbenchmark:
{code}
  private static class SimpleResource {
    private long memory;
    private long vcores;

    SimpleResource(long memory, long vcores) {
      this.memory = memory;
      this.vcores = vcores;
    }

    public int getMemory() {
      return (int)memory;
    }

    public void setMemory(int memory) {
      this.memory = memory;
    }

    public long getMemorySize() {
      return memory;
    }

    public void setMemorySize(long memory) {
      this.memory = memory;
    }

    public int getVirtualCores() {
      return (int)vcores;
    }

    public void setVirtualCores(int vcores) {
      this.vcores = vcores;
    }
  }


  @Test
  public void testUnitConversionCost() {
    long start = System.nanoTime();
    for (int i = 0; i < 84002142; i++) {
      String u1 = "m";
      String u2 = "m";
      UnitsConversionUtil.convert(u1, u2, 1000);
    }
    long finish = System.nanoTime();
    System.out.println("Time=" + (finish - start) / 1000);
  }

  @Test
  public void testResourceObjectAllocation() {
    long start = System.nanoTime();
    for (int i = 0; i < 38003619; i++) {
      BaseResource b = new BaseResource(100, 1);
    }
    long finish = System.nanoTime();
    System.out.println("Time for 3926=" + (finish - start) / 1000);

    start = System.nanoTime();
    for (int i = 0; i < 38003619; i++) {
      new SimpleResource(100, 1);
    }
    finish = System.nanoTime();
    System.out.println("Time for trunk=" + (finish - start) / 1000);
  }

  @Test
  public void testRICost() throws YarnException {
    long start = System.nanoTime();
    Resource r = Resource.newInstance(100, 10);
    for (long i = 0; i < 632006616; i++) {
      r.getResourceInformation( ResourceInformation.MEMORY_MB.getName());
    }
    long finish = System.nanoTime();
    System.out.println("Time for 3926=" + (finish - start) / 1000);


    // Only test map operation
    start = System.nanoTime();
    for (long i = 0; i < 632006616; i++) {
      ResourceUtils.getResourceTypeIndex().get(
          ResourceInformation.MEMORY_MB.getName());
    }
    finish = System.nanoTime();
    System.out.println("Time for 3926, get from Map=" + (finish - start) / 1000);

    // Only test unboxing (Integer index into an int array)
    start = System.nanoTime();
    Integer x = 1000;
    int[] y = new int[1024];
    for (long i = 0; i < 632006616; i++) {
      y[x] = 1;
    }
    finish = System.nanoTime();
    System.out.println("Time for 3926, unboxing=" + (finish - start) / 1000);

    start = System.nanoTime();
    SimpleResource sr = new SimpleResource(100, 1);
    for (long i = 0; i < 632006616; i++) {
      sr.getMemory();
    }
    finish = System.nanoTime();
    System.out.println("Time for trunk=" + (finish - start) / 1000);
  }

  @Test
  public void testResourceCalculationCosts() {
    Resource a = Resource.newInstance(100, 10);
    Resource b = Resource.newInstance(101, 100);
    Resource cluster = Resource.newInstance(1000, 1000);
    DominantResourceCalculator drc = new DominantResourceCalculator();

    long start = System.nanoTime();
    for (int i = 0; i < 30000382; i++) {
      drc.compare(cluster, a, b);
    }
    long finish = System.nanoTime();
    System.out.println("Time for compare=" + (finish - start) / 1000);

    ///

    start = System.nanoTime();
    for (int i = 0; i < 10000413; i++) {
      drc.multiplyAndNormalizeDown(a, 1.01, b);
    }

    finish = System.nanoTime();
    System.out.println("Time for multiplyAndNormalizeDown=" + (finish - start) / 1000);
  }

{code} 
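
One caveat on the tests above: the loops discard their results, so the JIT may remove part of the measured work and the absolute numbers should be read as rough. Below is a minimal sketch of a timing helper that keeps the results alive and runs a warm-up pass first. The class and method names are hypothetical, and this is not a replacement for a proper benchmark harness.

```java
import java.util.function.LongSupplier;

// Hypothetical timing helper for rough comparisons only.
class TimingSketch {
  // Accumulates results so the measured calls have an observable effect.
  static long sink;

  static long timeMicros(long iterations, LongSupplier op) {
    // Warm-up pass, so the measured loop is more likely to run compiled code.
    for (long i = 0; i < 10_000; i++) {
      sink += op.getAsLong();
    }
    long start = System.nanoTime();
    for (long i = 0; i < iterations; i++) {
      sink += op.getAsLong();
    }
    // Same unit as the tests above: microseconds.
    return (System.nanoTime() - start) / 1000;
  }
}
```

For example, the SimpleResource getter could be timed with TimingSketch.timeMicros(632006616L, () -> sr.getMemorySize()).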

> Improve performance of resource profile branch
> ----------------------------------------------
>
>                 Key: YARN-6788
>                 URL: https://issues.apache.org/jira/browse/YARN-6788
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Blocker
>         Attachments: YARN-6788-YARN-3926.001.patch, YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch,
> YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, YARN-6788-YARN-3926.006.patch,
> YARN-6788-YARN-3926.007.patch, YARN-6788-YARN-3926.008.patch, YARN-6788-YARN-3926.009.patch,
> YARN-6788-YARN-3926.010.patch
>
>
> Currently we could see a 15% performance delta with this branch. 
> Few performance improvements to improve the same.
> Also this patch will handle [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] from [~leftnoteasy].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

