hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1712) Admission Control: plan follower
Date Tue, 09 Sep 2014 00:40:29 GMT

    [ https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126397#comment-14126397
] 

Jian He commented on YARN-1712:
-------------------------------

Thanks Subra and Carlo working on the patch. Some comments and questions on the patch:

- I think the default queue can be initialized upfront when PlanQueue is initialized in CapacityScheduler
{code}
      // Add default queue if it doesnt exist
      if (scheduler.getQueue(defPlanQName) == null) {
{code}
- Consolidate the comments into 2 lines
{code}
      // identify the reservations that have expired and new reservations that
      // have to
      // be activated
{code}
- Exceptions like the following are ignored. Is this intentional ?
{code}
     } catch (YarnException e) {
          LOG.warn(
              "Exception while trying to release default queue capacity for plan: {}",
              planQueueName, e);
        }
{code}
- may be create a common method to calculate lhsRes and rhsRes
{code}
      CSQueue lhsQueue = scheduler.getQueue(lhs.getReservationId().toString());
      if (lhsQueue != null) {
        lhsRes =
            Resources.subtract(
                lhs.getResourcesAtTime(now),
                Resources.multiply(clusterResource,
                    lhsQueue.getAbsoluteCapacity()));
      } else {
        lhsRes = lhs.getResourcesAtTime(now);
      }
{code}
- allocatedCapacity, may rename to reservedResources
{code}
      Resource allocatedCapacity = Resource.newInstance(0, 0);
{code}
- Instead of doing the following:  
{code}
      for (CSQueue resQueue : resQueues) {
        previousReservations.add(resQueue.getQueueName());
      }
      Set<String> expired =
          Sets.difference(previousReservations, curReservationNames);
      Set<String> toAdd =
          Sets.difference(curReservationNames, previousReservations);
{code}
we can do something like this to save some time cost. 
{code}
for queue in previousReservations:
    if (queue not in curReservationNames)
        expired.add(queue)
    else:
       curReservationNames.remove(queue) // curReservationNames contains the ToAdd queues
in the end
{code}
- Not sure if this method is only used by PlanFollower. If it is, we can change the return
value to be a set of reservation names so that we don’t need to loop later to get all the
reservation names..
{code}
      Set<ReservationAllocation> currentReservations =
          plan.getReservationsAtTime(now);
{code}
- rename defPlanQName to defReservationQueue
{code}
      String defPlanQName = planQueueName + PlanQueue.DEFAULT_QUEUE_SUFFIX;
{code}
- The apps are already in current planQueue, IIUC, this is the defaultReservationQueue ? If
that, I think we may change the queueName parameter to the proper defaultReservationQueue
name. Also, AbstractYarnScheduler#moveAllApps is actually expecting the queue to be leafQueue(ReservationQueue),
not planQueue(parentQueue).
{code}
// Move all the apps in these queues to the PlanQueue
moveAppsInQueues(toMove, planQueueName);
{code}
- I’m thinking if we can make PlanFollower synchronously move apps to the defaultQueue,
for the following reasons:
{code}
1. IIUC, the logic for moveAll and killAll is that: the first Time synchronizePlan is called,
it will try to move all expired apps; next Time synchronizePlan is called, it will kill all
the previously not-yet-moved apps. If the synchronizePlan interval is very small,  it’s
likely to kill most apps that are being move.
2. Exceptions from CapacityScheduler#moveApplication are currently just ignored, if doing
asynchronously 
3. PlanFollower is now anyways locking the whole scheduler in synchronizePlan method (though
I’m still thinking if we need to lock the whole scheduler as this is kind of costly.)
4. In AbstractYarnScheduler#moveAllApps, We can do the moveApp synchronously, and still send
events to RMApp to update its bookkeeping if needed. (But I don’t think we need to send
the event now.)
5. PlanFollower move logic should be much simpler if doing synchronously 
{code}

> Admission Control: plan follower
> --------------------------------
>
>                 Key: YARN-1712
>                 URL: https://issues.apache.org/jira/browse/YARN-1712
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>              Labels: reservations, scheduler
>         Attachments: YARN-1712.1.patch, YARN-1712.2.patch, YARN-1712.patch
>
>
> This JIRA tracks a thread that continuously propagates the current state of an inventory
subsystem to the scheduler. As the inventory subsystem store the "plan" of how the resources
should be subdivided, the work we propose in this JIRA realizes such plan by dynamically instructing
the CapacityScheduler to add/remove/resize queues to follow the plan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message