hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: communication path for task assignment in hadoop 2.X
Date Fri, 12 Apr 2013 04:48:51 GMT
Hi again,

On Fri, Apr 12, 2013 at 9:36 AM, hari <haribaha@gmail.com> wrote:
>> Yes, the heartbeats between the NodeManager and the ResourceManager
>> no longer carry container assignments. Container launches are
>> handled by a separate NodeManager-embedded service called the
>> ContainerManager [1]
> Thanks. While going through the ContainerManager, I happened to
> notice that there is no concept of container slots similar to the task
> slots in the previous versions. Maybe now that there is a
> ResourceManager, it controls the number of containers that should be
> launched. Is that the case?

Yes, each NM publishes the resources it has, and the RM keeps track of
how much is in use, etc. The fixed-slot concept is gone, replaced by
looser, resource-demand-based constraints.
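
To make that concrete, here's a minimal sketch of how an AM expresses a
resource demand instead of claiming a slot. It assumes the later
stabilized 2.x client API (AMRMClient and friends); treat it as an
illustration, not a drop-in implementation:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class ResourceRequestSketch {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new Configuration());
        rmClient.start();

        // Register so the RM starts tracking this AM (host/port/URL elided).
        rmClient.registerApplicationMaster("", 0, "");

        // Ask for a container of 1024 MB and 1 vcore; no fixed slot
        // involved. The RM's scheduler matches this demand against the
        // capacity each NM has published.
        Resource capability = Resource.newInstance(1024, 1);
        rmClient.addContainerRequest(new ContainerRequest(
            capability, null, null, Priority.newInstance(0)));

        // Subsequent rmClient.allocate(...) heartbeats return containers
        // once a NodeManager with enough unused capacity is found.
      }
    }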

>> > 2. Is there a different communication path for task assignment?
>> > Is the scheduler making the remote calls, or are there other classes
>> > outside of YARN responsible for making the remote calls?
>> The latter. The Scheduler is no longer responsible for asking
>> NodeManagers to launch the containers. An ApplicationMaster just asks
>> the Scheduler to reserve it some containers with specified resource
>> (memory/CPU/etc.) allocations on available or preferred NodeManagers,
>> and then, once the ApplicationMaster gets a response back that the
>> allocation succeeded, it communicates directly with the
>> ContainerManager on the allocated NodeManager to launch the command
>> needed to run the 'task'.
>> A good example to read would be the DistributedShell application.
>> I've linked [3] to show where the above AppMaster -> ContainerManager
>> request happens in its ApplicationMaster implementation, which
>> should help clear this up for you.
> Thanks for the pointers. So, for MapReduce applications, is the
> MRAppMaster class the ApplicationMaster? MRAppMaster should then be
> responsible for asking the ResourceManager for containers and then
> launching those on the NodeManagers. Is that the case? Also, do all
> MapReduce job submissions (using the "hadoop jar" command) use
> MRAppMaster as their ApplicationMaster?

Yes to all of the above. Every MR job currently launches its own app-master.
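
A rough sketch of that allocate-then-launch flow, again against the
stabilized 2.x client API (AMRMClient/NMClient): the DistributedShell
AppMaster does essentially this, with the NMClient wrapper hiding the
raw ContainerManager protocol. The launch command here is made up:

    import java.util.Collections;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.client.api.NMClient;

    public class LaunchSketch {
      // rmClient is an already registered AMRMClient with outstanding
      // container requests, as in the earlier sketch.
      static void allocateAndLaunch(AMRMClient<ContainerRequest> rmClient,
                                    Configuration conf) throws Exception {
        NMClient nmClient = NMClient.createNMClient();
        nmClient.init(conf);
        nmClient.start();

        // Heartbeat the RM; the scheduler only *reserves* containers
        // here, it never contacts the NodeManagers itself.
        AllocateResponse response = rmClient.allocate(0.0f);

        for (Container container : response.getAllocatedContainers()) {
          // The AM itself talks to the allocated NodeManager and hands
          // it the command that runs the 'task'.
          ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
              Collections.emptyMap(),                        // local resources
              Collections.emptyMap(),                        // environment
              Collections.singletonList("echo hello-yarn"),  // launch command
              null, null, null);
          nmClient.startContainer(container, ctx);
        }
      }
    }

For MR specifically: any job submitted with mapreduce.framework.name set
to "yarn" (the usual "hadoop jar" path on a 2.x cluster) gets an
MRAppMaster instance as its ApplicationMaster, which performs the same
request/launch dance internally.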

Harsh J
