hadoop-mapreduce-user mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: Yarn related questions:
Date Thu, 05 Jan 2012 15:53:25 GMT
Try looking at how the mapreduce code does it.  I think you have to put in a fully qualified
host name, but I don't remember for sure.
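For reference, the fully qualified form of a host name can be obtained from the JDK resolver. A minimal sketch (the class and method names here are illustrative, not from the MapReduce code itself):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class Fqdn {
    // Resolve a short host name to its canonical (fully qualified) form.
    // Falls back to the input unchanged if resolution fails.
    static String canonicalize(String host) {
        try {
            return InetAddress.getByName(host).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return host;
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("localhost"));
    }
}
```

Passing the canonicalized name in the resource request should match what the NodeManagers registered with the ResourceManager.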

--Bobby Evans

On 1/4/12 10:04 PM, "raghavendhra rahul" <raghavendhrarahul@gmail.com> wrote:


I tried to set the client node for launching the container within the ApplicationMaster.
I have set the parameter as
but the containers are not launched on the destined host. Instead the loop goes on continuously.
2012-01-04 15:11:48,535 INFO  appmaster.ApplicationMaster (ApplicationMaster.java:run(204))
- Current application state: loop=95, appDone=false, total=2, requested=2, completed=0, failed=0,

On Wed, Jan 4, 2012 at 11:24 PM, Robert Evans <evans@yahoo-inc.com> wrote:

A container more or less corresponds to a task in MRv1.  There is one exception to this, as
the ApplicationMaster also runs in a container.  The ApplicationMaster will request new containers
for each mapper or reducer task that it wants to launch.  There is separate code, outside the
container, that serves up the intermediate mapper output; it runs as part of the NodeManager
(similar to the TaskTracker from before).  When the ApplicationMaster requests a container
it also includes a hint as to where it would like the container placed.  In fact it
actually makes three requests: one for the exact node, one for the rack the node is on, and
one that is generic and could be anywhere.  The scheduler will try to honor those requests
in that order, so data locality is still considered and generally honored.  Yes, there is
the possibility of back and forth to get a container, but the ApplicationMaster will generally
try to use all of the containers it is given, even if they are not optimal.
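The node/rack/any expansion described above can be modeled in plain Java. This is only a sketch of the idea, not the actual YARN API; the host and rack names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

public class LocalityExpansion {
    // Wildcard resource name the scheduler treats as "any node".
    static final String ANY = "*";

    // Expand a single container request into the three resource-name levels
    // the scheduler considers, from most specific to least specific.
    static List<String> expand(String host, String rack) {
        return Arrays.asList(host, rack, ANY);
    }

    public static void main(String[] args) {
        // Illustrative names only.
        System.out.println(expand("node1.example.com", "/default-rack"));
    }
}
```

The scheduler walks this list in order, which is why a request that cannot be satisfied on the exact node still lands on the same rack when possible.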

--Bobby Evans

On 1/4/12 10:23 AM, "Ann Pal" <ann_r_pal@yahoo.com> wrote:

I am trying to understand more about Hadoop Next Gen Map Reduce and had the following questions
based on this post:


[1] How does the application decide how many containers it needs? Are the containers used to store
the intermediate result at the map nodes?

[2] During resource allocation, if the ResourceManager has no mapping between map tasks and
the resources allocated, how can it properly allocate the right resources? It might end up allocating
resources on a node which does not have the data for the map task, and hence is not optimal.
In this case the ApplicationMaster will have to reject it and request again. There could
be considerable back-and-forth between the ApplicationMaster and ResourceManager before it
could converge. Is this right?

