hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality
Date Tue, 23 Apr 2013 19:55:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639491#comment-13639491
] 

Alejandro Abdelnur commented on YARN-392:
-----------------------------------------

The idea of a locality delay gives more flexibility, I like it. 

Though I think it may complicate things quite a bit on the scheduler to be able to do the
right book-keeping. Today, because the delay is at app level there is not delay counting at
allocation requests level.

If we move the delay at allocation request level, it means we'd have to keep a counter at
'rack' level that gets decremented on every allocation attempt and when hits zero we go rack
if node was not fulfilled. 

This means that the allocation request in the scheduler will have to have a new delay-counter
property.

The complexity comes that when an AM places a new allocation request, the AM must provide
the non-empty allocation requests in full.

For example:

Requesting 5 containers for node1/rack1:

{code}
  location=*     - containers=5
  location=rack1 - containers=5
  location=node1 - containers=5
{code}

Requesting 5 additional containers for node2/rack1 (with original allocation still pending):

{code}
  location=*     - containers=10
  location=rack1 - containers=10
  location=node2 - containers=5
{code}

The current contract allows the scheduler just to put the */rack container requests without
having to do a lookup and aggregation.

If we are now keeping a delay counter associated at */rack level and we do a put, we'll reset
the delay-counter for the node1 request family. If we keep the delay-counter of node1 request
family and use it for the node2 request family we'll be shorting the node2 request expected
locality delay.

The delay-locality per container request has value but I think it may require much more work.

Given that, don't you think getting the current approach suggested/implemented by Arun/Sandy
makes sense in the short term?
                
> Make it possible to schedule to specific nodes without dropping locality
> ------------------------------------------------------------------------
>
>                 Key: YARN-392
>                 URL: https://issues.apache.org/jira/browse/YARN-392
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Sandy Ryza
>         Attachments: YARN-392-1.patch, YARN-392-2.patch, YARN-392-2.patch, YARN-392.patch
>
>
> Currently its not possible to specify scheduling requests for specific nodes and nowhere
else. The RM automatically relaxes locality to rack and * and assigns non-specified machines
to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message