Subject: svn commit: r1443168 - in /hadoop/common/trunk/hadoop-mapreduce-project: CHANGES.txt hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm
Date: Wed, 06 Feb 2013 19:52:25 -0000
To: mapreduce-commits@hadoop.apache.org
From: tucu@apache.org

Author: tucu
Date: Wed Feb 6 19:52:25 2013
New Revision: 1443168

URL:
http://svn.apache.org/viewvc?rev=1443168&view=rev
Log:
MAPREDUCE-4977. Documentation for pluggable shuffle and pluggable sort. (tucu)

Added:
    hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm
Modified:
    hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt

Modified: hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt?rev=1443168&r1=1443167&r2=1443168&view=diff
==============================================================================
--- hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt (original)
+++ hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt Wed Feb 6 19:52:25 2013
@@ -230,6 +230,9 @@ Release 2.0.3-alpha - 2013-02-06
     MAPREDUCE-4971. Minor extensibility enhancements to Counters &
     FileOutputFormat. (Arun C Murthy via sseth)
 
+    MAPREDUCE-4977. Documentation for pluggable shuffle and pluggable sort.
+    (tucu)
+
   OPTIMIZATIONS
 
     MAPREDUCE-4893. Fixed MR ApplicationMaster to do optimal assignment of

Added: hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm
URL: http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm?rev=1443168&view=auto
==============================================================================
--- hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm (added)
+++ hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/site/apt/PluggableShuffleAndPluggableSort.apt.vm Wed Feb 6 19:52:25 2013
@@ -0,0 +1,96 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  Hadoop Map Reduce Next Generation-${project.version} - Pluggable Shuffle and Pluggable Sort
+  ---
+  ---
+  ${maven.build.timestamp}
+
+Hadoop MapReduce Next Generation - Pluggable Shuffle and Pluggable Sort
+
+  \[ {{{./index.html}Go Back}} \]
+
+* Introduction
+
+  The pluggable shuffle and pluggable sort capabilities allow replacing the
+  built-in shuffle and sort logic with alternate implementations. Example use
+  cases for this are: using an application protocol other than HTTP,
+  such as RDMA, for shuffling data from the Map nodes to the Reducer nodes; or
+  replacing the sort logic with custom algorithms that enable Hash aggregation
+  and Limit-N query.
+
+  <> The pluggable shuffle and pluggable sort capabilities are
+  experimental and unstable. This means the provided APIs may change and break
+  compatibility in future versions of Hadoop.
+
+* Implementing a Custom Shuffle and a Custom Sort
+
+  A custom shuffle implementation requires a
+  <<>>
+  implementation class running in the NodeManagers and a
+  <<>> implementation class
+  running in the Reducer tasks.
+
+  The default implementations provided by Hadoop can be used as references:
+
+    * <<>>
+
+    * <<>>
+
+  A custom sort implementation requires a <<>>
+  implementation class running in the Mapper tasks and (optionally, depending
+  on the sort implementation) a <<>>
+  implementation class running in the Reducer tasks.
+
+  The default implementations provided by Hadoop can be used as references:
+
+    * <<>>
+
+    * <<>>
+
+* Configuration
+
+  Except for the auxiliary service running in the NodeManagers serving the
+  shuffle (by default the <<>>), all the pluggable components
+  run in the job tasks. This means they can be configured on a per-job basis.
+  The auxiliary service servicing the Shuffle must be configured in the
+  NodeManagers configuration.
+
+** Job Configuration Properties (on a per-job basis):
+
+*--------------------------------------+---------------------+-----------------+
+| <> | <> | <> |
+*--------------------------------------+---------------------+-----------------+
+| <<>> | <<>> | The <<>> implementation to use |
+*--------------------------------------+---------------------+-----------------+
+| <<>> | <<>> | The <<>> implementation to use |
+*--------------------------------------+---------------------+-----------------+
+
+  These properties can also be set in the <<>> to change the default values for all jobs.
+
+** NodeManager Configuration properties, <<>> in all nodes:
+
+*--------------------------------------+---------------------+-----------------+
+| <> | <> | <> |
+*--------------------------------------+---------------------+-----------------+
+| <<>> | <<<...,mapreduce.shuffle>>> | The auxiliary service name |
+*--------------------------------------+---------------------+-----------------+
+| <<>> | <<>> | The auxiliary service class to use |
+*--------------------------------------+---------------------+-----------------+
+
+  <> If setting an auxiliary service in addition to the default
+  <<>> service, then a new service key should be added to the
+  <<>> property, for example <<>>.
+  Then the property defining the corresponding class must be
+  <<>>.
+  
\ No newline at end of file
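
The split between per-job properties and NodeManager properties described in the added document can be sketched in plain Java. This is an illustration only and not part of the commit: it uses `java.util` collections instead of Hadoop's own configuration classes so that it runs without Hadoop on the classpath. The property names are assumptions recalled from the Hadoop 2.x documentation (they are not legible in the text above and may differ between releases), and the class name `PluggableShuffleSortConfigSketch`, the service name `my_shuffle`, and the `com.example.*` implementation classes are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PluggableShuffleSortConfigSketch {

    // ASSUMPTION: property names recalled from Hadoop 2.x documentation,
    // not taken from the commit text; verify them against your release.
    static final String SHUFFLE_CONSUMER_KEY =
            "mapreduce.job.reduce.shuffle.consumer.plugin.class";
    static final String OUTPUT_COLLECTOR_KEY =
            "mapreduce.job.map.output.collector.class";
    static final String AUX_SERVICES_KEY = "yarn.nodemanager.aux-services";

    /** Per-job overrides: these components run inside the job's tasks. */
    static Map<String, String> perJobOverrides(String shuffleConsumerClass,
                                               String outputCollectorClass) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(SHUFFLE_CONSUMER_KEY, shuffleConsumerClass);
        props.put(OUTPUT_COLLECTOR_KEY, outputCollectorClass);
        return props;
    }

    /** Adds a service name to the comma-separated aux-services list once. */
    static String appendAuxService(String existing, String serviceName) {
        List<String> names = new ArrayList<>();
        if (existing != null) {
            for (String part : existing.split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        if (!names.contains(serviceName)) {
            names.add(serviceName);
        }
        return String.join(",", names);
    }

    /** Key of the property that names the class for one aux service. */
    static String auxServiceClassKey(String serviceName) {
        return AUX_SERVICES_KEY + "." + serviceName + ".class";
    }

    public static void main(String[] args) {
        // Hypothetical implementation classes, for illustration only.
        Map<String, String> job = perJobOverrides(
                "com.example.MyShuffleConsumer", "com.example.MyOutputCollector");
        job.forEach((k, v) -> System.out.println(k + " = " + v));

        String services = appendAuxService("mapreduce.shuffle", "my_shuffle");
        System.out.println(AUX_SERVICES_KEY + " = " + services);
        System.out.println(auxServiceClassKey("my_shuffle")
                + " = com.example.MyShuffleHandler");
    }
}
```

In a real deployment the first map would go into the job configuration (or the site-wide MapReduce defaults), while the last two properties belong in the NodeManager configuration on every node, as the document states.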