hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7136) Additional Performance Improvement for Resource Profile Feature
Date Tue, 05 Sep 2017 18:39:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154112#comment-16154112 ]

Wangda Tan commented on YARN-7136:

[~jlowe] / [~templedf] / [~sunilg],

Regarding performance with 3 or more resource types, I modified the unit test and got some
results using testUserLimitThroughput:
2 resource types: 57k,
3 resource types: 46k,
4 resource types: 44k,
5 resource types: 42k. 

What I found is that each additional resource type beyond the third causes about a 5%
performance regression, which is pretty predictable. However, once a 3rd resource type is
added to the cluster, performance drops by about 20%. This definitely needs improvement, but
I think it is still good enough for people to try this feature. (I haven't done any SLS tests
with 3 or more resource types so far, but judging by the unit test results it should be no
problem to scale to at least 5k nodes.)
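
For reference, the percentages above can be rechecked from the quoted figures. This is only a small arithmetic sketch; the class and method names are made up for illustration and are not part of the patch or the unit test.

```java
class ThroughputDrop {
    // Relative drop, in percent, when throughput goes from 'before' to 'after'.
    static double dropPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // testUserLimitThroughput figures quoted above (in thousands),
        // for 2, 3, 4 and 5 resource types respectively.
        double[] throughput = {57, 46, 44, 42};
        for (int i = 1; i < throughput.length; i++) {
            System.out.printf("%d -> %d resource types: %.1f%% drop%n",
                i + 1, i + 2, dropPercent(throughput[i - 1], throughput[i]));
        }
    }
}
```

The step from 2 to 3 resource types comes out at roughly 19%, and each later step at roughly 4 to 5%, which matches the "about 20%" and "about 5%" figures.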

Regarding comments from [~templedf],
bq. In toString(), would it be more efficient to always append the comma and delete the last
one at the end?
I would prefer the suggestion from Jason: it's backward compatible, and we don't need to add
a separate toString method to BaseResource either.
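
For readers unfamiliar with the idea in the quoted question, "always append the comma and delete the last one" might look roughly like the sketch below. It is not the code from the patch and not Jason's suggestion (which is not quoted in this message); the class name, the map-based input, and the output format are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ToStringSketch {
    // Append ", " after every entry unconditionally, then trim the trailing
    // separator once, instead of testing "is this the last entry?" per element.
    static String joinTrimLast(Map<String, Long> resources) {
        StringBuilder sb = new StringBuilder("<");
        for (Map.Entry<String, Long> e : resources.entrySet()) {
            sb.append(e.getKey()).append(':').append(e.getValue()).append(", ");
        }
        if (!resources.isEmpty()) {
            sb.setLength(sb.length() - 2); // drop the final ", "
        }
        return sb.append('>').toString();
    }

    public static void main(String[] args) {
        Map<String, Long> r = new LinkedHashMap<>();
        r.put("memory", 4096L);
        r.put("vCores", 2L);
        System.out.println(joinTrimLast(r));
    }
}
```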

bq. In hashCode(), per the discussion on YARN-6788
I'd like to update this part if it's not good practice. I remember you mentioned this before,
but I couldn't find it in the JIRA. Could you remind me of the link to the reference? I'm not
sure integer overflow is a big issue in Java, since int arithmetic wraps around automatically.
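
To illustrate the wrapping behaviour mentioned above: Java int arithmetic overflows silently using two's-complement wraparound, so a multiply-and-add hash never throws. The sketch below uses a hypothetical helper over an array of values; it is not the Resource#hashCode implementation from the patch.

```java
import java.util.Arrays;

class HashOverflowSketch {
    // Conventional 31-multiplier hash; intermediate results may overflow
    // and wrap around, which is well-defined for Java ints.
    static int hash(long[] values) {
        int result = 1;
        for (long v : values) {
            result = 31 * result + Long.hashCode(v);
        }
        return result;
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        // Overflow wraps to the most negative int instead of failing.
        System.out.println(max + 1 == Integer.MIN_VALUE);

        long[] big = {Long.MAX_VALUE, 8192L, 16L};
        // Same result as the JDK's own array hash, which also relies on wrapping.
        System.out.println(hash(big) == Arrays.hashCode(big));
    }
}
```

If overflow ever needed to be detected rather than tolerated, Math.addExact and Math.multiplyExact throw ArithmeticException instead of wrapping.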

Regarding comments from [~jlowe],
bq. This is part of my concern. I understand it's important to optimize for the default ..
Do you think the perf report for 3+ resource types is acceptable? I completely agree we need
to continue optimizing for 3+ resource types, but my current goal is to avoid impacting perf
for existing use cases as much as we can; we can always optimize new use cases later.

Uploading ver.008 patch.

> Additional Performance Improvement for Resource Profile Feature
> ---------------------------------------------------------------
>                 Key: YARN-7136
>                 URL: https://issues.apache.org/jira/browse/YARN-7136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Critical
>         Attachments: YARN-7136.001.patch, YARN-7136.YARN-3926.001.patch, YARN-7136.YARN-3926.002.patch,
YARN-7136.YARN-3926.003.patch, YARN-7136.YARN-3926.004.patch, YARN-7136.YARN-3926.005.patch,
YARN-7136.YARN-3926.006.patch, YARN-7136.YARN-3926.007.patch, YARN-7136.YARN-3926.008.patch
> This JIRA plans to add the following misc perf improvements:
> 1) Use final int in Resources/ResourceCalculator to cache #known-resource-types. (Significant improvement).
> 2) Catch Java's ArrayIndexOutOfBoundsException instead of checking array.length every time. (Significant improvement).
> 3) Avoid setUnit validation (which is a HashSet lookup) when initializing default Memory/VCores ResourceInformation (Significant improvement).
> 4) Avoid unnecessary array loops in Resource#toString/hashCode. (Some improvement).
> 5) Removed readOnlyResources in BaseResource. (Minor improvement).
> 6) Removed enum: MandatoryResources, use final integer instead. (Minor improvement).
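
As a rough illustration of item 2 in the description above, the two lookup styles could be sketched as follows. The class, field, and method names are hypothetical and the exception translation is an assumption; this is not the code from the YARN-7136 patches.

```java
class IndexLookupSketch {
    private final long[] values;

    IndexLookupSketch(long... values) {
        this.values = values.clone();
    }

    // Style A: explicit bounds check on every call.
    long getChecked(int index) {
        if (index < 0 || index >= values.length) {
            throw new IllegalArgumentException("Unknown resource index: " + index);
        }
        return values[index];
    }

    // Style B: rely on the JVM's built-in array bounds check and translate
    // the exception only on the rare failing path.
    long getCatching(int index) {
        try {
            return values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            throw new IllegalArgumentException("Unknown resource index: " + index, e);
        }
    }

    public static void main(String[] args) {
        IndexLookupSketch r = new IndexLookupSketch(4096L, 2L);
        System.out.println(r.getChecked(0) + " " + r.getCatching(1));
    }
}
```

Both styles return the same values for valid indexes; the second simply keeps the hot path free of an extra comparison, on the assumption that out-of-range lookups are exceptional.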

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org