hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7136) Additional Performance Improvement for Resource Profile Feature
Date Tue, 05 Sep 2017 14:45:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16153762#comment-16153762
] 

Jason Lowe commented on YARN-7136:
----------------------------------

Apologies for the delay, as I was offline during the long weekend.

bq. BaseResource overwrite equals, there're lots of performance issues once we add 3rd resource
to the system. 

This is part of my concern.  I understand it's important to optimize for the default, only
mem and vcores case.  However the whole point of this feature is to allow extra resources
to be configured.  If the performance hit of doing so is severe then it becomes untenable
to many who would like to use it.  Therefore I don't think it's appropriate to dismiss performance
changes that affect the non-default codepath and only focus on the default code path.  Aren't
both important here?  I'm thinking with enough effort on optimizing the >2 resource case
we could get to a point where we may not need BaseResource.  Ideally we should make the processing
for new resources as close to the processing we're doing for memory and vcores, i.e.: no name
checking, no unit conversion, just integer computations.  If the RM was doing that, I wouldn't
think it would be necessary to optimize for the =2 case.

bq. Probably a better solution is directly normalize all resource types to default unit specified
in resource-types.xml when all Resource get initialized. We can have a flag to only do this
in RM side. It is definitely worth to do, do you think is it better to do this after merge?

Yes, I'm totally OK with addressing the unit conversion issue in a followup JIRA.  I'm not
yet convinced a flag is the best approach to handling this, but we can work through the details
in that other JIRA.

bq. Currently resource types are loaded when RM start, so this couldn't happen now. Adding
resource type to a running RM is very tricky, we can spend more time to solve the problem
when it is really needed.

OK, so if this is only setup and fixed in place on RM start then we _can_ cache the number
of resources in a static field, at least for now.  I'm OK if we want to leave this as a volatile
for the future when refresh is supported, but it's confusing now when we're taking the performance
hit for something we don't support.

Regarding [~templedf]'s review:
bq. In {Resource.compareTo()}}, should the check if (arrLenThis < arrLenOther) come at
the beginning?
The problem with putting the check at the beginning is it changes the semantic behavior of
the compare when one resource comes in with less resource information slots than the other.
 According to the logic in the this compare wants to simulate the lexicographic sort done
for strings, and "z" would appear after "abcde" even though "z" is shorter.  If we do the
length check before checking the order of the slots that both have then the order will change
relative to how the logic works in the patch now.  For example, if resource A is mem=12,vcores=10
and resource B is mem=10,vcores=5,gpu=1 do we want A < B because A has no GPU or A >
B because A is asking for more memory than B, therefore it doesn't matter than it has no GPU
slot?  The comments imply we want the latter which means the code must check for a common
prefix up to the shorter of the two lengths before checking the lengths.

bq. In DominantResourceCalculator.calculateSharesForTwoMandatoryResources(), is it really
worth adding the length to the signature? You're retrieving the value in all the other methods,
so why not there as well?

Unless I'm missing something, calculateSharesForTwoMandatoryResources() already knows how
many resources it needs to calculate.  The answer is in its method name.  It's the whole point
of the function -- to calculate for a fixed, known number of resources.  I agree with Daniel
that I don't see why it needs to be told how many resources there are in the signature, but
I don't see why it even needs to look it up itself.  It's custom-designed to calculate for
two resource types.  The one place that calls it explicitly checks for ==2 resources and only
calls it for that case.  The method itself shouldn't even be looking at clusterRes.length
or a passed-in parameter.  The for loop should be {{for (int i = 0; i < 2; ++i)}}.  Even
better, the loop can be manually unrolled since it's just for the two iterations.  There's
no need to do first and second comparisons on the first iteration, only the second, so unrolling
it could make it a bit simpler to read as well as faster in practice.

Regarding the latest patch:

When _do_ we want to show "Mi" for the units for memory if not in toString?  Assuming the
better fix isn't to simply remove the units from memory, we can improve on this backwards-compatibility
case in the toString method.  It's expensive to check for something in a loop that we can
precompute outside of the loop.  We already know we have memory and cpu, so I don't see why
we need to do a name lookup and string compare on every entry.  It would be more efficient
to do something more like what it was doing originally so we eliminate all conditionals within
the loop.  For example:
{code}
   public String toString() {
     StringBuilder sb = new StringBuilder();
     sb.append("<memory:").append(getMemorySize()).append(", vCores:")
         .append(getVirtualCores());
     for (int i = 2; i < resources.length; i++) {
       sb.append(", ");
       ResourceInformation ri = resources[i];
       sb.append(ri.getName()).append(": ")
           .append(ri.getValue())
           .append(ri.getUnits());
     }

     sb.append(">");
     return sb.toString();
   }
{code}

> Additional Performance Improvement for Resource Profile Feature
> ---------------------------------------------------------------
>
>                 Key: YARN-7136
>                 URL: https://issues.apache.org/jira/browse/YARN-7136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Critical
>         Attachments: YARN-7136.001.patch, YARN-7136.YARN-3926.001.patch, YARN-7136.YARN-3926.002.patch,
YARN-7136.YARN-3926.003.patch, YARN-7136.YARN-3926.004.patch, YARN-7136.YARN-3926.005.patch,
YARN-7136.YARN-3926.006.patch, YARN-7136.YARN-3926.007.patch
>
>
> This JIRA is plan to add following misc perf improvements:
> 1) Use final int in Resources/ResourceCalculator to cache #known-resource-types. (Significant
improvement).
> 2) Catch Java's ArrayOutOfBound Exception instead of checking array.length every time.
(Significant improvement).
> 3) Avoid setUnit validation (which is a HashSet lookup) when initialize default Memory/VCores
ResourceInformation (Significant improvement).
> 4) Avoid unnecessary loop array in Resource#toString/hashCode. (Some improvement).
> 5) Removed readOnlyResources in BaseResource. (Minor improvement).
> 6) Removed enum: MandatoryResources, use final integer instead. (Minor improvement).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message