hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5342) Improve non-exclusive node partition resource allocation in Capacity Scheduler
Date Fri, 15 Jul 2016 16:56:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379671#comment-15379671 ]

Sunil G commented on YARN-5342:

Thanks [~Naganarasimha Garla] for the insightful thoughts.

Looking at just one aspect, *“improve the allocation for non-exclusive label when requests
are from an application of no_label”*, we can help each such app proceed with
its allocation on a non-exclusive label without waiting for all node heartbeats.
For that, I think we only need to look at that one partition (the partition of the node whose
heartbeat is being processed for the app) and see whether we can use some of its resources for
this no_label app. Yes, I agree with your top-level view, and it's good to have an idea about
the other non-exclusive partitions as well. But since we already have a node with some free space
in the current heartbeat, if we can place a no_label container there within limits, I think we are
solving the problem step by step.
And I very much agree with the comment about the chance of preemption kicking in. I think a
fair balance has to be struck between the speed of allocations for no_label apps on a label and
larger imbalances over the queue’s capacity, which may cause preemption to kick in.

So the checks I mentioned can be made w.r.t. an app or its queue, so that we try
to solve the problem app by app. A much better, higher-level solution may
require a lot of refactoring, I guess, so I suggested a simpler approach here. Thoughts?

> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> ------------------------------------------------------------------------------
>                 Key: YARN-5342
>                 URL: https://issues.apache.org/jira/browse/YARN-5342
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Sunil G
>         Attachments: YARN-5342.1.patch
> In the previous implementation, one non-exclusive container allocation is possible when
> the missed-opportunity >= #cluster-nodes. And missed-opportunity will be reset when a container
> is allocated to any node.
> This will slow down the frequency of container allocation on a non-exclusive node partition:
> *When a non-exclusive partition=x has idle resource, we can only allocate one container for
> this app in every X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 pending
> resource for the non-exclusive partition OR we get allocation from the default partition.
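The reset rule in the description above can be sketched roughly as follows. This is only an illustrative model, not the Capacity Scheduler code or the attached patch: the class `MissedOpportunityTracker`, its methods, and the `proposedReset` flag are all hypothetical names, and the two reset conditions are taken literally from the wording of the description.

```java
/**
 * Hypothetical per-application counter modelling the missed-opportunity
 * rule from the issue description. None of these names come from YARN.
 */
final class MissedOpportunityTracker {
    private final int clusterNodes;       // #cluster-nodes in the description
    private final boolean proposedReset;  // false: previous rule, true: proposed rule
    private int missedOpportunities;

    MissedOpportunityTracker(int clusterNodes, boolean proposedReset) {
        this.clusterNodes = clusterNodes;
        this.proposedReset = proposedReset;
    }

    /** Called when a node heartbeat passes without an allocation for the app. */
    void recordMiss() {
        missedOpportunities++;
    }

    /** A non-exclusive allocation is possible once misses >= #cluster-nodes. */
    boolean canAllocateNonExclusive() {
        return missedOpportunities >= clusterNodes;
    }

    /**
     * Called after a container has been allocated to the app.
     * Previous rule: always reset the counter.
     * Proposed rule (as read from the description, an assumption): reset only
     * if the allocation came from the default partition, or there is still
     * pending resource (> 0) for the non-exclusive partition.
     */
    void onContainerAllocated(boolean fromDefaultPartition,
                              long pendingOnNonExclusivePartition) {
        if (!proposedReset
                || fromDefaultPartition
                || pendingOnNonExclusivePartition > 0) {
            missedOpportunities = 0;
        }
    }
}
```

Under the previous rule every allocation sends the app back to waiting for a full round of node heartbeats. Under the proposed rule, as modelled here, an allocation that satisfies neither condition leaves the counter untouched, so the app stays eligible on the next heartbeat from a node in the non-exclusive partition.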

