hadoop-mapreduce-issues mailing list archives

From "Avner BenHanoch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5329) APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
Date Thu, 29 Aug 2013 15:02:02 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753691#comment-13753691 ]

Avner BenHanoch commented on MAPREDUCE-5329:

Hi Siddharth,

I'm assuming you're planning to write a different auxiliary service running in the NodeManager
as your alternate shuffle provider?

Does this need any node specific configuration? 
No. The Configuration object that any AbstractService gets in its init() method is enough.

How will the consumers find out when and where to fetch data from? 
The consumers will fetch completion events from the AM, the same way the vanilla ShuffleConsumer does.

The new ShuffleConsumer can fetch completion events from the AM - which will have a URL constructed
based on the existing ShuffleProvider. Is this information sufficient for your consumer to
fetch the data?
Yes. The URL doesn't contain the path to the MOF, but it contains the values of <jobid, mapid>.
The consumer will extract <hostname, jobid, mapid> from the URL.  It will contact the
hostname over RDMA and ask it for the segment for <jobid, mapid>. This is all a provider needs
for finding the MOF on one of its local disks (in addition to the mapping jobid->userid
that comes from APPLICATION_INIT messages) *and this is exactly what the vanilla provider
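
To make the extraction step concrete, here is a minimal sketch of how a consumer might pull <hostname, jobid, mapid> out of a completion-event URL. It assumes the URL carries the ids as query parameters named "job" and "map"; the class name, the sample host, port, and ids are all hypothetical and only for illustration, not taken from any actual provider.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: class and field names are hypothetical.
class ShuffleUrlSketch {

    /** The three values the consumer needs before contacting the provider. */
    static final class MapOutputRef {
        final String host;
        final String jobId;
        final String mapId;

        MapOutputRef(String host, String jobId, String mapId) {
            this.host = host;
            this.jobId = jobId;
            this.mapId = mapId;
        }
    }

    /**
     * Extracts host, job id and map id from a shuffle URL.
     * Assumption: the ids are passed as "job" and "map" query parameters.
     */
    static MapOutputRef parse(String url) {
        URI uri = URI.create(url);
        Map<String, String> params = new HashMap<>();
        String query = uri.getRawQuery();
        if (query != null) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    params.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        String jobId = params.get("job");
        String mapId = params.get("map");
        if (uri.getHost() == null || jobId == null || mapId == null) {
            throw new IllegalArgumentException("not a shuffle URL: " + url);
        }
        return new MapOutputRef(uri.getHost(), jobId, mapId);
    }

    public static void main(String[] args) {
        // Made-up URL; host, port and ids are placeholders.
        MapOutputRef ref = parse("http://node17.example.com:8080/mapOutput"
                + "?job=job_1377788400000_0001&reduce=0"
                + "&map=attempt_1377788400000_0001_m_000003_0");
        System.out.println(ref.host + " " + ref.jobId + " " + ref.mapId);
    }
}
```

Note that the URL gives no user name and no file path, which is why the provider still needs the jobid->userid mapping from APPLICATION_INIT to locate the MOF.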

If it is, to get your provider initialized by the NM (i.e. the APPLICATION_INIT event), you'll
need to modify the _serviceData_ in the startContainer call. That is a map containing the
serviceId for the auxiliary service and data that it needs for initialization. In case of
the current ShuffleProvider - this is ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID and the token.
You just need to add another entry to this map for your own Provider. If it's using the same
token, send that as the payload.
Great! I would like to do that! 
If I understand correctly, startContainer gets the ContainerLaunchContext that is created in createCommonContainerLaunchContext,
and this is the right place for adding entries to the serviceData map.
I think the code for that should be similar to what I wrote above in my 24/Jun/13 comment.
*Please let me know if I can continue based on this code.*
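
For reference, a rough sketch of what adding an entry to the serviceData map could look like. This is not the code from the earlier comment and it does not touch the real Hadoop classes: the map is a plain Map<String, ByteBuffer>, and both service ids below are hypothetical stand-ins (the first one plays the role of ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: ids and method names are hypothetical.
class ServiceDataSketch {

    // Stand-in for ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID.
    static final String BUILTIN_SHUFFLE_ID = "builtin.shuffle";
    // Made-up id for a third-party provider.
    static final String CUSTOM_PROVIDER_ID = "custom.rdma.shuffle";

    /**
     * Builds the serviceData map: one entry for the built-in shuffle
     * service plus one entry per additional auxiliary service, all
     * carrying the same token as payload.
     */
    static Map<String, ByteBuffer> buildServiceData(ByteBuffer token,
                                                    String... extraServiceIds) {
        Map<String, ByteBuffer> serviceData = new HashMap<>();
        // duplicate() gives each entry its own position/limit over the same bytes.
        serviceData.put(BUILTIN_SHUFFLE_ID, token.duplicate());
        for (String id : extraServiceIds) {
            serviceData.put(id, token.duplicate());
        }
        return serviceData;
    }

    public static void main(String[] args) {
        ByteBuffer token = ByteBuffer.wrap(
                "job-token-bytes".getBytes(StandardCharsets.UTF_8));
        Map<String, ByteBuffer> serviceData =
                buildServiceData(token, CUSTOM_PROVIDER_ID);
        System.out.println(serviceData.keySet());
    }
}
```

Each service id present in the map would then be one more auxiliary service that receives APPLICATION_INIT with the token as its payload.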

If not, the changes are likely to be more extensive.
I prefer to go for the simple case :-)


but if you think it's independent you should just create a jira with details
Thanks.  I’ll create another JIRA with details for supporting multiple aux-services with
different ports and link it to this issue.

I want to add that I really appreciate all your efforts on this matter,

> APPLICATION_INIT is never sent to AuxServices other than the builtin ShuffleHandler
> -----------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5329
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5329
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am
>    Affects Versions: 2.0.4-alpha
>            Reporter: Avner BenHanoch
> APPLICATION_INIT is never sent to AuxServices other than the built-in ShuffleHandler.
 This means that 3rd party ShuffleProvider(s) will not be able to function, because APPLICATION_INIT
enables the AuxiliaryService to map jobId->userId. This is needed for properly finding
the MOFs of a job per reducers' requests.
> NOTE: The built-in ShuffleHandler does get APPLICATION_INIT events due to a hard-coded
expression in the Hadoop code. The current TaskAttemptImpl.java code explicitly calls serviceData.put
(ShuffleHandler.MAPREDUCE_SHUFFLE_SERVICEID, ...) and ignores any additional AuxiliaryService.
As a result, only the built-in ShuffleHandler will get APPLICATION_INIT events.  Any 3rd party
AuxiliaryService will never get APPLICATION_INIT events.
> I think a solution can be in one of two ways:
> 1. Change TaskAttemptImpl.java to loop over all Auxiliary Services and register each of
them, by calling serviceData.put (…) in a loop.
> 2. Change AuxServices.java similar to the fix in: MAPREDUCE-2668  "APPLICATION_STOP is
never sent to AuxServices".  This means that when the 'handle' method gets an APPLICATION_INIT
event, it will demultiplex it to all Aux Services regardless of the value in event.getServiceID().
> I prefer the 2nd solution.  I welcome any ideas.  I can provide the needed patch
for any option that people like.
> See [Pluggable Shuffle in Hadoop documentation|http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html]
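
As a rough illustration of the 2nd option in the description above (demultiplexing APPLICATION_INIT to every registered auxiliary service regardless of the service id in the event), here is a minimal self-contained sketch. The interface, the dispatcher, and all method names are hypothetical stand-ins and not the actual AuxServices code; the sample user and application id are made up.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: none of these types are the real Hadoop classes.
class AuxDemuxSketch {

    /** Hypothetical minimal view of an auxiliary service. */
    interface AuxService {
        void initApp(String user, String appId);
    }

    /** Records the appId->user mapping, as a shuffle provider would. */
    static class RecordingService implements AuxService {
        final Map<String, String> appToUser = new HashMap<>();

        @Override
        public void initApp(String user, String appId) {
            appToUser.put(appId, user);
        }
    }

    /** Stand-in for the component that routes events to services. */
    static class Dispatcher {
        private final Map<String, AuxService> services = new LinkedHashMap<>();

        void register(String serviceId, AuxService service) {
            services.put(serviceId, service);
        }

        /**
         * Fans the init event out to every registered service instead of
         * looking up a single service by id. Returns how many were notified.
         */
        int handleApplicationInit(String user, String appId) {
            int notified = 0;
            for (AuxService service : services.values()) {
                service.initApp(user, appId);
                notified++;
            }
            return notified;
        }
    }

    public static void main(String[] args) {
        Dispatcher dispatcher = new Dispatcher();
        RecordingService builtin = new RecordingService();
        RecordingService thirdParty = new RecordingService();
        dispatcher.register("builtin.shuffle", builtin);
        dispatcher.register("custom.rdma.shuffle", thirdParty);

        dispatcher.handleApplicationInit("alice", "application_0001");
        System.out.println(thirdParty.appToUser);
    }
}
```

The trade-off in this sketch is that every service sees every application's init event, whether or not the job asked for it, whereas the 1st option keeps delivery explicit per service id.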
