hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6529) AppMaster will not retry to request resource if AppMaster happens to decide to not use the resource
Date Fri, 30 Oct 2015 20:17:27 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983230#comment-14983230 ]

Jason Lowe commented on MAPREDUCE-6529:

bq. For example, in a heterogeneous cluster, a reduce task may prefer a container on a powerful
machine with higher I/O speed. An MPI job may prefer containers on machines with higher CPU frequency.
But the RM can't know all the resource requirements of different applications. So I am just wondering
if the RPC protocol between the RM and AM could provide a new interface that lets the AM make a second
choice about whether it will use the container it just got.

If the AM has a preference for something then it needs to specify that in the locality request.
 Failing to do so just leads to a Monte Carlo situation where the AM tosses away containers
hoping that the next random container is better than the last while the task starves for resources
in the interim.  The AM doesn't have a cluster-wide view.  It's not going to know that the
nodes it desires so much are all occupied, nor does it know what other things are running
on various nodes.  So I still think it's undesirable to have the AM toss away a container
because it hopes a better one will come along later.  Instead we should have the AM improve
the container request so the RM can better know the AM's intentions.

For example, if the cluster is heterogeneous with some nodes being much better for reducers
than others then we should label those nodes.  Then the AM request can ask reducers to use
those labeled nodes with the ability to relax that locality constraint if the nodes are unavailable.
 The RM will make a much better decision more efficiently than the AM could ever hope to do
on its own.  Otherwise the AM is going to get a container, see it's not on one of the "good"
nodes and toss it, request another, get another bad allocation, rinse, repeat.  If we can
teach the MapReduce AM how to recognize a good node vs. a bad node then we can also teach
it how to request those nodes when it makes the initial allocation request.
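
To make the contrast concrete, here is a minimal sketch in plain Java. It is a toy model, not YARN code: the Node class, the "fast" label, and the allocate and tossAndRetry methods are all hypothetical names made up for illustration. It assumes an RM that can match a label and optionally relax it, versus an AM that rejects containers on unlabeled nodes and asks again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class LabeledRequestSketch {

    /** Hypothetical node model: name, label ("" if unlabeled), and whether a slot is free. */
    static final class Node {
        final String name;
        final String label;
        final boolean free;

        Node(String name, String label, boolean free) {
            this.name = name;
            this.label = label;
            this.free = free;
        }
    }

    /**
     * Toy RM-side decision: prefer a free node carrying the requested label.
     * If none is free and the constraint may be relaxed, fall back to any free node.
     */
    static Optional<Node> allocate(List<Node> cluster, String label, boolean relaxLocality) {
        for (Node n : cluster) {
            if (n.free && n.label.equals(label)) {
                return Optional.of(n);
            }
        }
        if (relaxLocality) {
            for (Node n : cluster) {
                if (n.free) {
                    return Optional.of(n);
                }
            }
        }
        return Optional.empty();
    }

    /**
     * Toy AM-side "toss and retry": the RM knows nothing about the preference and
     * offers free nodes in turn; the AM rejects every offer without the label.
     * Returns the number of allocation attempts used, or -1 if it never succeeded.
     */
    static int tossAndRetry(List<Node> cluster, String label, int maxAttempts) {
        List<Node> freeNodes = new ArrayList<>();
        for (Node n : cluster) {
            if (n.free) {
                freeNodes.add(n);
            }
        }
        if (freeNodes.isEmpty()) {
            return -1;
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Node offered = freeNodes.get((attempt - 1) % freeNodes.size());
            if (offered.label.equals(label)) {
                return attempt;
            }
        }
        return -1; // the task starved: every offer was tossed
    }

    public static void main(String[] args) {
        List<Node> cluster = Arrays.asList(
            new Node("n1", "", true),
            new Node("n2", "", true),
            new Node("n3", "fast", true),
            new Node("n4", "fast", false));
        // Label-aware request: one decision made by the RM.
        System.out.println(allocate(cluster, "fast", true).get().name);
        // Toss-and-retry: several wasted round trips before a good node shows up.
        System.out.println(tossAndRetry(cluster, "fast", 10));
    }
}
```

In this toy model, when every "fast" node is busy the label-aware request with relaxation still gets a container immediately, while the toss-and-retry AM never accepts anything.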

> AppMaster will not retry to request resource if AppMaster happens to decide to not use
the resource
> ---------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-6529
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6529
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 2.6.0
>            Reporter: Wei Chen
> I am reading the code in RMContainerAllocator.java. I want to make an improvement so that
the AppMaster can give up containers that may not be optimal when it receives newly
assigned containers. But I found that if the AppMaster gives up a container, it will not retry
the request for that resource.
> In RMContainerRequestor.java, Set<ResourceRequest> ask is used to request resources
from the ResourceManager. I found that each container can only be requested once. That is, ask can
be filled by addResourceRequestToAsk(ResourceRequest remoteRequest[]), but a request can only be added
once for each container. If we give up an assigned container, it will never be requested again.
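
The behavior described in the issue can be pictured with a small toy model in plain Java. This is only a sketch of the description above, not the actual RMContainerRequestor code: the AskSetSketch class, its string request ids, and the heartbeat and giveUpContainer methods are hypothetical. It assumes a pending-ask set that is sent and emptied on each heartbeat, so a request that was already sent is not sent again unless something explicitly adds it back.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AskSetSketch {

    // Pending requests to send to the RM on the next heartbeat (hypothetical model).
    private final Set<String> ask = new LinkedHashSet<>();

    /** Queue a request; a Set keeps each request id at most once. */
    void addRequest(String requestId) {
        ask.add(requestId);
    }

    /** Simulate one heartbeat: everything pending is sent, then the set is emptied. */
    Set<String> heartbeat() {
        Set<String> sent = new LinkedHashSet<>(ask);
        ask.clear();
        return sent;
    }

    /**
     * The AM gives up a container it was assigned for requestId.
     * Unless reRequest is true, nothing puts the request back into the ask set.
     */
    void giveUpContainer(String requestId, boolean reRequest) {
        if (reRequest) {
            addRequest(requestId);
        }
    }

    public static void main(String[] args) {
        AskSetSketch am = new AskSetSketch();
        am.addRequest("reduce_0");
        System.out.println(am.heartbeat());   // the request goes out once

        am.giveUpContainer("reduce_0", false);
        System.out.println(am.heartbeat());   // nothing is asked again

        am.giveUpContainer("reduce_0", true);
        System.out.println(am.heartbeat());   // explicitly re-added, so it is asked again
    }
}
```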

This message was sent by Atlassian JIRA
