hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1651) CapacityScheduler side changes to support increase/decrease container resource.
Date Thu, 10 Sep 2015 10:45:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738568#comment-14738568
] 

Jian He commented on YARN-1651:
-------------------------------

bq. I think we may need add such information to AMRMProtocol to make sure AM will be notified.
For now, we can keep them as-is. Users can still get such information from RM logs.
I think for now we can fail the allocate call explicitly on those very clear situations in
checkAndNormalizeContainerChangeRequest ?, e.g. the situation that rmContainer doesn't exist
That's more explicit to users. Digging through logs is not an easy thing for application writer.

thanks for updating, Wangda ! some more comments focusing on decreasing code path.

- this may be not correct, because reserve event can happen on RESERVE state too, i.e. reReservation
{code}
      if (container.getState() != RMContainerState.NEW) {
        container.hasIncreaseReservation = true;
      }
{code}
 - RMNodeImpl#toBeDecreasedContainers - no need to be a map, it can be a list ? and therefore
NodeHeartBeatResponse and Impl change is not needed; similarly nmReportedIncreasedContainers
can be a list.
 - When decreasing a container, should it send RMNodeDecreaseContainerEvent too ?
 - revert ContainerManagerImpl change
 - Remove SchedulerApplicationAttempt#getIncreaseRequests
 - In AbstractYarnScheduler#deceraseContainers() move checkAndNormalizeContainerChangeRequests(decreaseRequests,
false) to the same place as checkAndNormalizeContainerChangeRequests(increaseRequests, false)
for consistency.
- this if condition is not needed.
{code}
  public boolean unreserve(Priority priority,
      FiCaSchedulerNode node, RMContainer rmContainer) {
    if (rmContainer.hasIncreaseReservation()) {
      rmContainer.cancelIncreaseReservation();
    }
    {code}
 - looks like when decreasing reservedIncreasedContainer, it will unreserve the *whole* extra
reserved resource, should it only unreserve the extra resources being decresed ?
 - In general, I think we should be able to decrease/increase a regular reserved container
or a increasedReservedContainer ? 
- In ParentQueue, this null check is not needed.
{code}
  @Override
  public void decreaseContainer(Resource clusterResource,
      SchedContainerChangeRequest decreaseRequest,
      FiCaSchedulerApp app) {
    if (app != null) {
    {code}

- allocate call is specifically marked as noLock, but now every allocate call holds the global
scheduler lock which is too expensive. we can move decreaseContainer to application itself.
 
{code}   protected synchronized void decreaseContainer( {code}
It is also now holding queue Lock on allocate, which is also expensive, because that means
a bunch of malicious AMs can effectively block queue's execuation.  

> CapacityScheduler side changes to support increase/decrease container resource.
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, YARN-1651-2.YARN-1197.patch, YARN-1651-3.YARN-1197.patch,
YARN-1651-4.YARN-1197.patch, YARN-1651-5.YARN-1197.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message