hadoop-mapreduce-issues mailing list archives

From "Avner BenHanoch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
Date Mon, 07 May 2012 09:14:03 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269467#comment-13269467 ]

Avner BenHanoch commented on MAPREDUCE-4049:

I am returning to Mariappan's last comment with more details:
Bottom line, I accept your design for the trunk! In the trunk, I don't need anything for ShuffleProvider.
For ShuffleConsumer, after your patch for the trunk is accepted, I can implement ReduceSortPlugin
and provide my implementation for Shuffle&Merge.
Still, there are a few minor things that I need added to your patch (if possible, I prefer that
you do it):
I would like the following classes to be public (all are in package org.apache.hadoop.mapreduce.task.reduce):
EventFetcher, ShuffleScheduler, MapOutput, ShuffleClientMetrics. The last class also requires
changing its constructor to be public.
I will be glad to know whether my requests can be included in your patch.
Also, I will be glad to know when your patch (including the above requests) will be integrated
into trunk.
After that, I would like to know whether this patch can be backported to hadoop-2.x
and hadoop-1.x (for 1.x I can supply my own original Shuffle patch without the Merge Plugin,
since it is a different system anyway).

> plugin for generic shuffle service
> ----------------------------------
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task, tasktracker
>    Affects Versions: 1.1.0, 1.0.3, 2.0.0, 3.0.0
>            Reporter: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>         Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, HADOOP-1.0.x.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, MAPREDUCE-4049-branch-1.0.2.patch, mapred-site.xml, mapred.diff, src.tgz, test.diff
> Support a generic shuffle service as a set of two plugins: ShuffleProvider & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on a shuffle plugin
> that performs shuffle over RDMA in fast networks (10gE, 40gE, or InfiniBand) instead of using
> the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable
> merge approach during the intermediate merges, and so achieve much better performance.
> # Satisfy MAPREDUCE-3060 - a generic shuffle service for avoiding the hidden dependency of
> NodeManager on a specific version of the mapreduce shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn
> University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested Top Level Design for both plugins (currently
> based on the 1.0 branch)
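
For readers unfamiliar with the plugin idea described in the issue, the following is a minimal sketch of how a consumer-side implementation could be chosen through configuration and instantiated by reflection. It is not taken from any of the attached patches: the interface ShuffleConsumer as written here, the classes HttpShuffleConsumer and RdmaShuffleConsumer, the method transport(), and the key example.shuffle.consumer.class are all hypothetical names invented for illustration. The sketch also hints at why the comment above asks for public classes and a public constructor: a plugin that lives outside the framework's package can only be created and used through types and constructors it is allowed to access.

```java
import java.util.Properties;

// Hypothetical sketch: none of these names come from the Hadoop code base.
class ShufflePluginDemo {
    // Assumed configuration key that names the consumer implementation.
    static final String CONSUMER_KEY = "example.shuffle.consumer.class";

    // Instantiates the configured consumer, falling back to the HTTP default.
    static ShuffleConsumer loadConsumer(Properties conf)
            throws ReflectiveOperationException {
        String className =
            conf.getProperty(CONSUMER_KEY, HttpShuffleConsumer.class.getName());
        // asSubclass rejects classes that do not implement the plugin interface.
        Class<? extends ShuffleConsumer> cls =
            Class.forName(className).asSubclass(ShuffleConsumer.class);
        // Requires an accessible no-argument constructor on the plugin class.
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        System.out.println(loadConsumer(conf).transport());

        conf.setProperty(CONSUMER_KEY, RdmaShuffleConsumer.class.getName());
        System.out.println(loadConsumer(conf).transport());
    }
}

// Hypothetical plugin contract for the reduce-side shuffle.
interface ShuffleConsumer {
    String transport();
}

// Stand-in for the existing HTTP-based shuffle.
class HttpShuffleConsumer implements ShuffleConsumer {
    public HttpShuffleConsumer() {}
    public String transport() { return "http"; }
}

// Stand-in for an alternative shuffle such as one running over RDMA.
class RdmaShuffleConsumer implements ShuffleConsumer {
    public RdmaShuffleConsumer() {}
    public String transport() { return "rdma"; }
}
```

With no key set, the loader returns the default implementation; setting the key switches the implementation without changing the calling code, which is the decoupling the issue is asking for.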

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

