hadoop-yarn-issues mailing list archives

From "Brook Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3223) Resource update during NM graceful decommission
Date Wed, 28 Oct 2015 18:48:27 GMT

    [ https://issues.apache.org/jira/browse/YARN-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978999#comment-14978999 ]

Brook Zhou commented on YARN-3223:

Thanks [~leftnoteasy] and [~djp] for the review.

bq. Suggest using CapacityScheduler#updateNodeAndQueueResource to update resources; we need
to update the queue's resource and cluster metrics as well.
That makes sense. I'm currently setting SchedulerNode's usedResource equal to totalResource
and keeping totalResource the same. If we use that function, it means totalResource should
be set equal to usedResource, and on recommission we should just revert to the original
totalResource? I like your way better.
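
To make the proposed bookkeeping concrete, here is a minimal sketch, assuming memory is the only resource and using hypothetical names (NodeSketch, startDecommission, recommission) that are not the real YARN SchedulerNode or CapacityScheduler API. It only illustrates the idea of setting total equal to used while decommissioning and reverting to the original total on recommission:

```java
// Hypothetical stand-in for a scheduler node; NOT the real YARN SchedulerNode.
final class NodeSketch {
    private final int originalTotalMb; // kept so recommission can roll back
    private int totalMb;
    private int usedMb;
    private boolean decommissioning;

    NodeSketch(int totalMb) {
        this.originalTotalMb = totalMb;
        this.totalMb = totalMb;
    }

    int totalMb() { return totalMb; }
    int usedMb() { return usedMb; }

    // Available resource is whatever total leaves over after used.
    int availableMb() { return Math.max(0, totalMb - usedMb); }

    void allocate(int mb) {
        if (mb > availableMb()) {
            throw new IllegalStateException("not enough available resource");
        }
        usedMb += mb;
    }

    // A container finished: used shrinks, and while decommissioning
    // total shrinks with it so available stays at 0.
    void release(int mb) {
        if (mb > usedMb) {
            throw new IllegalArgumentException("releasing more than used");
        }
        usedMb -= mb;
        if (decommissioning) {
            totalMb = usedMb;
        }
    }

    // Graceful decommission starts: clamp total to current usage.
    void startDecommission() {
        decommissioning = true;
        totalMb = usedMb;
    }

    // Decommission cancelled: revert to the original total.
    void recommission() {
        decommissioning = false;
        totalMb = originalTotalMb;
    }
}
```

With this shape, any allocator that checks available resource, including an asynchronous one, finds nothing to hand out on a decommissioning node.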

bq. When async scheduling is enabled, we need to make sure the decommissioning node's total resource
is updated so no new container will be allocated on these nodes.
Even if async scheduling is enabled, we will update the total resource on the NODE_UPDATE event
to equal the current usedResource, so the async scheduling thread will not allocate containers to
the node.

bq. RMNode itself (RMNode.getState()) already includes the necessary info, so the boolean
parameter sounds redundant
Agreed. I will let the scheduler determine the current state directly using that function.
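
As a rough illustration of dropping the boolean parameter, the sketch below lets the scheduler derive the effective total from the node's own state. All types here (NodeStateSketch, RmNodeSketch, SchedulerSketch) are hypothetical stand-ins, not the real RMNode or scheduler classes; only the idea of calling getState() comes from the review comment:

```java
// Hypothetical node states; stand-in for whatever RMNode.getState() returns.
enum NodeStateSketch { RUNNING, DECOMMISSIONING }

// Hypothetical stand-in for RMNode, exposing only its state.
final class RmNodeSketch {
    private NodeStateSketch state = NodeStateSketch.RUNNING;

    NodeStateSketch getState() { return state; }
    void setState(NodeStateSketch state) { this.state = state; }
}

final class SchedulerSketch {
    // No extra boolean flag: the node's state alone decides whether
    // the total is clamped to current usage.
    static int effectiveTotalMb(RmNodeSketch node, int configuredTotalMb, int usedMb) {
        if (node.getState() == NodeStateSketch.DECOMMISSIONING) {
            return usedMb;
        }
        return configuredTotalMb;
    }
}
```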

bq. I think we need a separate test case to cover resource update during NM decommissioning

Yes, that is definitely going to be added. I just wanted to see if my general ideas were okay
with the community. Thanks!

> Resource update during NM graceful decommission
> -----------------------------------------------
>                 Key: YARN-3223
>                 URL: https://issues.apache.org/jira/browse/YARN-3223
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager, resourcemanager
>    Affects Versions: 2.7.1
>            Reporter: Junping Du
>            Assignee: Brook Zhou
>         Attachments: YARN-3223-v0.patch, YARN-3223-v1.patch
> During NM graceful decommission, we should handle resource update properly, including:
> make RMNode keep track of the old resource for possible rollback, keep available resource at 0,
> and update used resource when a container finishes.

This message was sent by Atlassian JIRA
