hadoop-hdfs-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Harsh J <ha...@cloudera.com>
Subject Re: NM/AM interaction
Date Wed, 29 May 2013 02:24:14 GMT
All AuxiliaryServices are configured in yarn-site.xml with a service
ID and a class is associated to the defined service ID. For MR2, one
generally adds the two below properties, first defining the name
(Service ID), second defining the class to launch for the defined



As part of the interface, all AuxiliaryServices may submit back some
metadata that they would require clients to be aware of. For the
ShuffleHandler, the port is rather important, so it serializes it via
the getMeta() interface. [1]

As part of any startContainer(…) response from the NodeManager's
ContainerManager service, all metadata of all available auxiliary
services are shipped back as part of a successful response to a
container start request. This is a mapping based on (configured
service ID -> metadata) for every Aux service configured and currently
running. [2]

A client, such as MR2, receives this batch of metadata and
deserializes whatever it is looking for [3] using the service ID name
string it is aware of [4].

[1] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java#L195
and https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java#L328
[2] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java#L502
[3] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java#L165
[4] - https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java#L148

On Wed, May 29, 2013 at 4:13 AM, John Lilley <john.lilley@redpoint.net> wrote:
> I was reading from the HortonWorks blog:
> “How MapReduce shuffle takes advantage of NM’s Auxiliary-services
> The Shuffle functionality required to run a MapReduce (MR) application is
> implemented as an Auxiliary Service. This service starts up a Netty Web
> Server, and knows how to handle MR specific shuffle requests from Reduce
> tasks. The MR AM specifies the service id for the shuffle service, along
> with security tokens that may be required. The NM provides the AM with the
> port on which the shuffle service is running which is passed onto the Reduce
> tasks.”
> How does the AM get the service ID and the port?
> Thanks,
> John

Harsh J

View raw message