hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5889) Improve user-limit calculation in capacity scheduler
Date Fri, 27 Jan 2017 19:11:24 GMT

    [ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15843343#comment-15843343

Wangda Tan commented on YARN-5889:

Thanks [~sunilg] for updating the patch.

Some more comments for the latest refactoring:

1) UsersManager#getComputedResourceLimitForAllUsers and FifoIntraQueuePreemptionPlugin#calculateIdealAssignedResourcePerApp:
{{User}} can be removed from parameter list. And calculateIdealAssignedResourcePerApp: partitionBasedResource
is not used, is it a mistake or just a unnecessary param?

2) Are changes of {{testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity}} related
to this patch?

3) More comments for the latest UsersManager:

3.1 Better to move following two methods from User to UsersManager:

    public void assignContainer(Resource resource, String nodePartition) {
      userResourceUsage.incUsed(nodePartition, resource);

      usersManager.incResourceUsagePerUser(resource, nodePartition, userName);

    public void releaseContainer(Resource resource, String nodePartition) {
      userResourceUsage.decUsed(nodePartition, resource);

      usersManager.decResourceUsagePerUser(resource, nodePartition, userName);

Instead of invoking usersManager from User, it's better to do that in the reverse way. When
resource allocated/released, it calls UsersManager#inc/decResourceUsagePerUser, and calls
following methods:

1. User#incUsed
2. UM#userLimitNeedsRecompute
3. UM#updateQueueUsageRatio

3.2 For UM#inc/decResourceUsagePerUser, now the logic is a little confusing:

We already stored user's usage in User#userResourceUsage, I think we don't need a Map<UserName,
ResourceUsage> for activeUsersResourceUsage/nonActiveUsersResourceUsage. What we only need
is Set<UserName> for active user and non-active user, correct?

So existing logic of {{UM#inc/decResourceUsagePerUser}} should be changed to:

UM#incResourceUsagePerUser() {
  writelock {
      /* As mentioend above (3.1), adding following logic
       * 1. User#incUsed
	   * 2. UM#userLimitNeedsRecompute
	   * 3. UM#updateQueueUsageRatio

	  // ... keep existing logics to updawte total usage

	  if (active-set.contains(userName)) {
	     // increase total-active-usage
	  } else if (non-active-set.contains(userName)) {
	     // increase total-non-active-usage
	  } else {
	     // User's neither in active set nor non-active set, is it possible?
	     // I think it is not possible if other logics are correct. (@Sunil)

And we should keep your logic when user moved between active and non-active state, we will
update both of active/non-active usages.

3.3 For activateApplication/deactivateApplication, 
- Use writeLock? And annotation for lock seems outdated, please check all {{@Lock}} inside
the class
- Since {{updatePerUserResourceUsage}} has two duplicated logics -- toActive=false/true, it
might be better to have two method: updateUsageForNew(Non)ActiveUser.
- setUserAddedOrRemoved looks like a duplicated logic of {{userLimitNeedsRecompute}}, I think
we can remove the logic and can simplify a little bit for other method like {{getComputedResourceLimitForActiveUsers}}

3.4 Some miscs:
- Could you make sure all Log.DEBUG is wrapped by isDebugEnabled?
- Add Log.DEBUG for resource usage update of total/active/non-active? Which can help troubleshooting
potential issues a lot.
- In {{computeUserLimit}}: resourceUsed/usersCount initial value is redundant. And {{//  
   resourceUsed = currentCapacity;}} is commented intentially or by mistake? 
- getActiveUsersResourceUsage/getActiveUsersResourceUsage are never used by anyone.
- {{UserToPartitionRecord}} is only consumed by UsersManager, if User is a internal class
of UsersManager, I would suggest to convert UserToPartitionRecord to an internal class as

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.0001.patch, YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch,
YARN-5889.0003.patch, YARN-5889.0004.patch, YARN-5889.0005.patch, YARN-5889.0006.patch, YARN-5889.v0.patch,
YARN-5889.v1.patch, YARN-5889.v2.patch
> Currently user-limit is computed during every heartbeat allocation cycle with a write
lock. To improve performance, this tickets is focussing on moving user-limit calculation out
of heartbeat allocation flow.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

View raw message