hadoop-mapreduce-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Queries on next gen MR architecture
Date Sun, 08 Jan 2012 07:25:38 GMT

On Jan 7, 2012, at 6:47 PM, Praveen Sripati wrote:

> Thanks for the response.
> 
> I was just thinking why some of the design decisions were made with MRv2.
> 
> > No, the OR condition is implied by the hierarchy of requests (node, rack, *).
> 
> If InputSplit1 is on Node11 and Node12, and InputSplit2 is on Node21 and Node22, then the
AM can ask for 1 container on each of the nodes and set * to 2 for map tasks. The RM can then
return 2 containers on Node11 and set * to 0. Data locality is lost for InputSplit2, or else
the AM has to make another call to the RM, releasing one of the containers and asking for
another.

Remember, you also have racks information to guide the RM...
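
To make the rack point concrete, here is a toy sketch of the (node, rack, *) request table for
the two-split example above. It is not the RM's scheduling code. The rack assignments (rack1,
rack2), the function names and the allocation policy are all assumptions made for illustration.

```python
# Hypothetical topology: the thread does not say which rack each node is on.
RACK_OF = {"Node11": "rack1", "Node12": "rack1",
           "Node21": "rack2", "Node22": "rack2"}


def build_requests(splits):
    """Build {location: container count} with node, rack and '*' entries."""
    table = {"*": 0}
    for hosts in splits:
        table["*"] += 1  # one container wanted per split, anywhere
        for rack in {RACK_OF[h] for h in hosts}:
            table[rack] = table.get(rack, 0) + 1  # one per rack holding the split
        for host in hosts:
            table[host] = table.get(host, 0) + 1  # one per replica host
    return table


def try_allocate(table, node):
    """Toy RM step: grant a container on `node` only while both the rack-level
    and the '*' counts are still outstanding (assumed policy, no off-rack fallback)."""
    rack = RACK_OF[node]
    if table.get("*", 0) <= 0 or table.get(rack, 0) <= 0:
        return None
    if table.get(node, 0) > 0:
        table[node] -= 1
        level = "node-local"
    else:
        level = "rack-local"
    table[rack] -= 1
    table["*"] -= 1
    return level


requests = build_requests([["Node11", "Node12"], ["Node21", "Node22"]])
first = try_allocate(requests, "Node11")   # rack1 count drops to 0
second = try_allocate(requests, "Node11")  # refused: rack1 has nothing outstanding
third = try_allocate(requests, "Node21")   # InputSplit2 keeps its locality
```

Under these assumptions the rack-level count of 1 stops a second container from landing on
Node11, so the remaining * capacity is still available for a node on the other rack.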

> A bit more complex request specifying the dependencies might be more effective.

At a very high cost - it's very expensive for the RM to track splits for each task across
nodes & racks. To the extent possible, our goal has been to push work to the AM and keep
the RM (and NM) really simple to scale & perform well.

> 
> > NM doesn't make any 'out' calls to anyone but the RM, else it would be a severe scalability
bottleneck.
> 
> There is already a one-way communication between the AM and NM for launching the containers.
The response from the NM can hold the list of completed containers from the previous call.
> 

Again, we want to keep the framework (RM/NM) really simple. So, the task can communicate
its status to the AM itself.

> > All interactions (RPCs) are authenticated. Also, there is a container token provided
by the RM (during allocation) which is verified by the NM during container launch.
> 
> So, a shared key has to be deployed manually on all the nodes for the NM?

No, it's automatically shared on startup between the daemons.
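
As a rough illustration of the idea (a token issued by the RM at allocation and checked by the
NM at launch), here is a generic HMAC sketch. The token layout, the function names and the way
the key is represented are assumptions for illustration only and do not describe the actual
container token format.

```python
import hashlib
import hmac


def issue_token(master_key, container_id, node):
    """RM side (hypothetical): sign the allocation with the shared key."""
    payload = f"{container_id}|{node}".encode()
    signature = hmac.new(master_key, payload, hashlib.sha256).digest()
    return payload, signature


def verify_token(master_key, payload, signature):
    """NM side (hypothetical): recompute the signature and compare in constant time."""
    expected = hmac.new(master_key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


# Stand-in for the key the daemons share on startup.
shared_key = b"k" * 32
payload, signature = issue_token(shared_key, "container_01", "Node11")
launch_allowed = verify_token(shared_key, payload, signature)
```

The point of the sketch is only that the NM needs nothing beyond the shared key to check a
token at launch time, so no extra call back to the RM is involved.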

Arun