hadoop-yarn-issues mailing list archives

From "Jonathan Hung (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-6818) User limit per partition is not honored in branch-2.7 >=
Date Thu, 13 Jul 2017 04:33:00 GMT
Jonathan Hung created YARN-6818:

             Summary: User limit per partition is not honored in branch-2.7 >=
                 Key: YARN-6818
                 URL: https://issues.apache.org/jira/browse/YARN-6818
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Jonathan Hung
            Assignee: Jonathan Hung

We are seeing an issue where the user limit factor does not cap the amount of resources a user
can consume in a queue within a partition. Suppose a queue has access to partition X, the
user's used resources in the default partition are 0, and the user's used resources in
partition X are at the partition's user limit. As far as I can tell, this is the problematic
code (in LeafQueue.java):
{noformat}
    if (Resources
        .greaterThan(resourceCalculator, clusterResource,
            user.getUsed(label),
            limit)) {
      // if enabled, check to see if could we potentially use this node instead
      // of a reserved node if the application has reserved containers
      if (this.reservationsContinueLooking) {
        if (Resources.lessThanOrEqual(
            resourceCalculator,
            clusterResource,
            Resources.subtract(user.getUsed(), application.getCurrentReservation()),
            limit)) {

          if (LOG.isDebugEnabled()) {
            LOG.debug("User " + userName + " in queue " + getQueueName()
                + " will exceed limit based on reservations - " + " consumed: "
                + user.getUsed() + " reserved: "
                + application.getCurrentReservation() + " limit: " + limit);
          }
          Resource amountNeededToUnreserve = Resources.subtract(user.getUsed(label), limit);
          // we can only acquire a new container if we unreserve first since we ignored the
          // user limit. Choose the max of user limit or what was previously set by max
          // capacity.
          currentResoureLimits.setAmountNeededUnreserve(Resources.max(resourceCalculator,
              clusterResource, currentResoureLimits.getAmountNeededUnreserve(),
              amountNeededToUnreserve));
          return true;
        }
      }
      if (LOG.isDebugEnabled()) {
        LOG.debug("User " + userName + " in queue " + getQueueName()
            + " will exceed limit - " + " consumed: "
            + user.getUsed() + " limit: " + limit);
      }
      return false;
    }
{noformat}
First, it sees that the used resources in partition X are greater than the partition's user
limit. The reservation check then also succeeds, because it evaluates
{{user.getUsed() - application.getCurrentReservation() <= limit}}, where {{user.getUsed()}}
returns the usage in the default partition (0 in this scenario), so the method returns true.

One fix is to change {{Resources.subtract(user.getUsed(), application.getCurrentReservation())}}
to {{Resources.subtract(user.getUsed(label), application.getCurrentReservation())}}.
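To illustrate the scenario and the proposed fix, here is a minimal, self-contained sketch. It is not Hadoop code: the class {{UserLimitSketch}}, its nested {{User}}, and both method names are hypothetical, and resources are reduced to a single number of megabytes instead of a {{Resource}} object. It only mirrors the shape of the check, assuming the no-argument {{getUsed()}} reads the default partition.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the user-limit check; not Hadoop code. */
final class UserLimitSketch {
    /** Stand-in for the default (unlabeled) partition. */
    static final String NO_LABEL = "";

    /** Tracks one user's usage per partition, in MB. */
    static final class User {
        private final Map<String, Long> usedByPartition = new HashMap<>();

        void setUsed(String partition, long mb) {
            usedByPartition.put(partition, mb);
        }

        /** Assumption: like user.getUsed(), this reads the default partition only. */
        long getUsed() {
            return getUsed(NO_LABEL);
        }

        long getUsed(String partition) {
            return usedByPartition.getOrDefault(partition, 0L);
        }
    }

    /** Shape of the check as reported: the reservation branch ignores the label. */
    static boolean canAssignAsReported(User user, String label, long limit,
            long reserved, boolean reservationsContinueLooking) {
        if (user.getUsed(label) > limit) {
            // Compares default-partition usage against the partition's limit.
            if (reservationsContinueLooking && user.getUsed() - reserved <= limit) {
                return true;
            }
            return false;
        }
        return true;
    }

    /** Shape of the proposed fix: the reservation branch uses the same partition. */
    static boolean canAssignWithLabel(User user, String label, long limit,
            long reserved, boolean reservationsContinueLooking) {
        if (user.getUsed(label) > limit) {
            if (reservationsContinueLooking && user.getUsed(label) - reserved <= limit) {
                return true;
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        User user = new User();
        user.setUsed(NO_LABEL, 0L);  // nothing used in the default partition
        user.setUsed("X", 5120L);    // usage in partition X is over the limit
        long limit = 4096L;

        System.out.println(canAssignAsReported(user, "X", limit, 0L, true)); // prints true
        System.out.println(canAssignWithLabel(user, "X", limit, 0L, true));  // prints false
    }
}
```

With no reservations held, the reported variant still lets the user past the limit because 0 - 0 <= 4096, while the label-aware variant rejects the request.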

This doesn't seem to be a problem in branch-2.8 and higher since YARN-3356 introduces this
check:
{noformat}
      if (this.reservationsContinueLooking && checkReservations
          && label.equals(CommonNodeLabelsManager.NO_LABEL)) {
{noformat}
so in this case getting the used resources in default partition seems to be correct.
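A rough sketch of why that guard avoids the problem, under the same simplifications as before (hypothetical class and method names, resources as plain megabyte counts, not the actual branch-2.8 code): the reservation branch is only entered for the default partition, so reading default-partition usage there is consistent with the limit being checked.

```java
import java.util.Map;

/** Hypothetical, simplified model of a guarded reservation check; not Hadoop code. */
final class ReservationGuardSketch {
    /** Stand-in for CommonNodeLabelsManager.NO_LABEL. */
    static final String NO_LABEL = "";

    static boolean canAssign(Map<String, Long> usedByPartition, String label,
            long limit, long reserved,
            boolean reservationsContinueLooking, boolean checkReservations) {
        long usedInLabel = usedByPartition.getOrDefault(label, 0L);
        if (usedInLabel <= limit) {
            return true;
        }
        // Guard in the style of the quoted check: labeled requests skip this branch.
        if (reservationsContinueLooking && checkReservations && label.equals(NO_LABEL)) {
            long usedInDefault = usedByPartition.getOrDefault(NO_LABEL, 0L);
            if (usedInDefault - reserved <= limit) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Over the limit in partition X, idle in the default partition.
        Map<String, Long> labeled = Map.of(NO_LABEL, 0L, "X", 5120L);
        System.out.println(canAssign(labeled, "X", 4096L, 0L, true, true)); // prints false

        // Over the limit in the default partition, but 2048 MB of that is reserved.
        Map<String, Long> unlabeled = Map.of(NO_LABEL, 5120L);
        System.out.println(canAssign(unlabeled, NO_LABEL, 4096L, 2048L, true, true)); // prints true
    }
}
```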
