hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Avner BenHanoch (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
Date Thu, 22 Mar 2012 15:28:22 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235637#comment-13235637
] 

Avner BenHanoch commented on MAPREDUCE-4049:
--------------------------------------------

# I just attached top level design documents for both plugins, based on current 1.0 branch.
# I want to have the described plugin-ability (desired with same interface) for all future
versions of Hadoop (as mentioned in the Target Version/s field).
# On the first phase, I am focusing on the existing 1.0 branch as I know it.  In parallel,
I'll try to learn what exists in 0.23 or in HDH.  Later, I'll be glad to do any adjustments
that may be needed for fulfilling the previous point (as much as possible).
# I welcome you and other parties, to let me know if there are branch(es) that particularly
interest you, so I'll give it priority.
                
> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, task, tasktracker
>    Affects Versions: 0.23.1, 1.0.1
>            Reporter: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>         Attachments: Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider
Plugin TLD.rtf
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on shuffle plugin
that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using
the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable
merge approach during the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of
NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn
University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I will soon attach document with suggested API for the plugin

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message