Date: Wed, 18 Apr 2012 17:26:42 +0000 (UTC)
From: "Mariappan Asokan (Commented) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <1644038589.1583.1334770002515.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1070495047.41402.1332332379648.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256720#comment-13256720 ]

Mariappan Asokan commented on MAPREDUCE-4049:
---------------------------------------------

Hi Avner,

Following are my additional comments and thoughts:

* Does someone need to implement {{ShuffleProviderPlugin}} and {{ShuffleConsumerPlugin}} as a pair? Or can one of them be implemented and the other left to the default implementation?
* For the trunk version, the shuffle provider will probably not require any changes. The shuffle consumer, on the other hand, has to be separated into two parts: one that deals with transferring map output data across the network and another that consumes the transferred data. If you look at the patch I posted in MAPREDUCE-2454, these two parts are abstracted as the {{ShuffleRunner}} and {{ShuffleCallback}} interfaces respectively. In the refactored code in that patch, the {{MergeManager}} class implements {{ShuffleCallback}} and the {{Shuffle}} class implements {{ShuffleRunner}}. I can probably enhance the {{ShuffleRunner}} interface as below, with an added {{initialize()}} method that takes all the arguments the current {{Shuffle}} constructor takes:
{code:title=ShuffleRunner.java}
public interface ShuffleRunner extends ExceptionReporter {
  // Takes the same arguments that the current Shuffle constructor takes.
  public void initialize(TaskAttemptID reduceId, JobConf jobConf,
                         TaskUmbilicalProtocol umbilical,
                         Reporter reporter,
                         Counters.Counter shuffledMapsCounter,
                         Counters.Counter reduceShuffleBytes,
                         Counters.Counter failedShuffleCounter,
                         TaskStatus status,
                         Progress copyPhase,
                         Task reduceTask);

  // Runs the shuffle; the consumer of the transferred data is the callback.
  public void run(ShuffleCallback shuffleCallback)
      throws IOException, InterruptedException;
}
{code}
* If you want to implement your own merge (using RDMA to get shuffled data as described in the technical paper attached to this Jira), you can implement the interface {{ReduceSortPlugin}} in addition to {{ShuffleRunner}}.
In the {{ReduceSortPlugin}}, your merge class can implement {{ShuffleCallback}}. Currently, since the merge and shuffle run in separate threads, the synchronization is done by the {{waitForResource()}} method in {{ShuffleCallback}}. If you are using RDMA to fetch map outputs, your merge will be in full control. It has to coordinate with your implementation of {{ShuffleRunner}} to fetch the specific mapper output you want in the merge.

If you have any questions, please let me know.

> plugin for generic shuffle service
> ----------------------------------
>
> Key: MAPREDUCE-4049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: performance, task, tasktracker
> Affects Versions: 1.0.3
> Reporter: Avner BenHanoch
> Labels: merge, plugin, rdma, shuffle
> Attachments: HADOOP-1.0.2.patch, HADOOP-1.0.x.patch, HADOOP-1.0.x.patch, Hadoop Shuffle Consumer Plugin TLD.rtf, Hadoop Shuffle Provider Plugin TLD.rtf, MAPREDUCE-4049-branch-1.0.2.patch, mapred-site.xml, mapred.diff, src.tgz, test.diff
>
>
> Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0).
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof.
> Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch)
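
As a rough illustration of the {{ShuffleRunner}} / {{ShuffleCallback}} split described in the comment above, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the interfaces {{SketchShuffleRunner}} and {{SketchShuffleCallback}}, the method {{mapOutputFetched()}}, and the classes {{InMemoryShuffleRunner}} and {{SimpleMergeManager}} are hypothetical and are not the signatures from the MAPREDUCE-2454 patch. Only the idea of a {{waitForResource()}} call used for synchronization comes from the comment. The demo runs in a single thread, whereas the real shuffle and merge run in separate threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Semaphore;

// Hypothetical, simplified stand-ins for the interfaces discussed above.
// They are NOT the signatures from the MAPREDUCE-2454 patch.
interface SketchShuffleCallback {
    // Blocks until the consumer has room for another map output.
    void waitForResource() throws InterruptedException;

    // Hands one fetched map output to the consumer (hypothetical method).
    void mapOutputFetched(String mapId, List<String> sortedKeys);
}

interface SketchShuffleRunner {
    // Transfers map outputs and reports each one through the callback.
    void run(SketchShuffleCallback callback) throws InterruptedException;
}

// Transfer side: "fetches" map outputs from an in-memory table
// instead of going over HTTP or RDMA.
class InMemoryShuffleRunner implements SketchShuffleRunner {
    private final Map<String, List<String>> mapOutputs;

    InMemoryShuffleRunner(Map<String, List<String>> mapOutputs) {
        this.mapOutputs = mapOutputs;
    }

    @Override
    public void run(SketchShuffleCallback callback) throws InterruptedException {
        for (Map.Entry<String, List<String>> e : mapOutputs.entrySet()) {
            callback.waitForResource(); // back-pressure from the merge side
            callback.mapOutputFetched(e.getKey(), e.getValue());
        }
    }
}

// Consumer side: allows at most `slots` segments to be in flight.
class SimpleMergeManager implements SketchShuffleCallback {
    private final Semaphore freeSlots;
    private final List<String> merged = new ArrayList<>();

    SimpleMergeManager(int slots) {
        this.freeSlots = new Semaphore(slots);
    }

    @Override
    public void waitForResource() throws InterruptedException {
        freeSlots.acquire();
    }

    @Override
    public synchronized void mapOutputFetched(String mapId, List<String> sortedKeys) {
        merged.addAll(sortedKeys);
        Collections.sort(merged); // stand-in for a real k-way merge
        freeSlots.release();      // segment consumed, slot is free again
    }

    synchronized List<String> mergedKeys() {
        return new ArrayList<>(merged);
    }
}

class ShuffleSketch {
    // Wires the two halves together and returns the merged keys.
    static List<String> runDemo() {
        Map<String, List<String>> outputs = new TreeMap<>();
        outputs.put("map_0", Arrays.asList("apple", "pear"));
        outputs.put("map_1", Arrays.asList("banana", "zebra"));
        SimpleMergeManager merge = new SimpleMergeManager(1);
        try {
            new InMemoryShuffleRunner(outputs).run(merge);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return merge.mergedKeys();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The point of the sketch is only the division of labor: the runner knows how to move data, the callback knows how to consume it, and either side could in principle be swapped out (for example, an RDMA-based runner) without touching the other.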