Return-Path: X-Original-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id A03629BF4 for ; Mon, 26 Mar 2012 06:30:55 +0000 (UTC) Received: (qmail 45480 invoked by uid 500); 26 Mar 2012 06:30:55 -0000 Delivered-To: apmail-hadoop-mapreduce-issues-archive@hadoop.apache.org Received: (qmail 45424 invoked by uid 500); 26 Mar 2012 06:30:55 -0000 Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: mapreduce-issues@hadoop.apache.org Delivered-To: mailing list mapreduce-issues@hadoop.apache.org Received: (qmail 45399 invoked by uid 99); 26 Mar 2012 06:30:55 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Mar 2012 06:30:55 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 26 Mar 2012 06:30:53 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id B7F6F346233 for ; Mon, 26 Mar 2012 06:30:33 +0000 (UTC) Date: Mon, 26 Mar 2012 06:30:33 +0000 (UTC) From: "Arun C Murthy (Commented) (JIRA)" To: mapreduce-issues@hadoop.apache.org Message-ID: <1487470762.16294.1332743433755.JavaMail.tomcat@hel.zones.apache.org> In-Reply-To: <509801353.35624.1332230385964.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Commented] (MAPREDUCE-4039) Sort Avoidance MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/MAPREDUCE-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238138#comment-13238138 ] Arun C Murthy commented on MAPREDUCE-4039: ------------------------------------------ Anty - I spent a bit of time thinking of this. Some early thoughts... seems to me that we are better of doing a bit of surgery on the MR runtime before we do this. We could consider making the MapOutputBuffer pluggable for a start so we can split the 'full sort' and the 'hash sort' implementations. Similarly, we could implement a pluggable Shuffle to not block until outputs of all maps are not available. This way we can cleanly layer the necessary features. Thoughts? > Sort Avoidance > -------------- > > Key: MAPREDUCE-4039 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4039 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: mrv2 > Affects Versions: 0.23.2 > Reporter: anty.rao > Assignee: anty > Priority: Minor > Fix For: 0.23.2 > > Attachments: MAPREDUCE-4039-branch-0.23.2.patch, MAPREDUCE-4039-branch-0.23.2.patch > > > Inspired by [Tenzing|http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//pubs/archive/37200.pdf], in 5.1 MapReduce Enhanceemtns: > {quote}*Sort Avoidance*. Certain operators such as hash join > and hash aggregation require shuffling, but not sorting. The > MapReduce API was enhanced to automatically turn off > sorting for these operations. When sorting is turned off, the > mapper feeds data to the reducer which directly passes the > data to the Reduce() function bypassing the intermediate > sorting step. This makes many SQL operators significantly > more ecient.{quote} > There are a lot of applications which need aggregation only, not sorting.Using sorting to achieve aggregation is costly and inefficient. Without sorting, up application can make use of hash table or hash map to do aggregation efficiently.But application should bear in mind that reduce memory is limited, itself is committed to manage memory of reduce, guard against out of memory. Map-side combiner is not supported, you can also do hash aggregation in map side as a workaround. > the following is the main points of sort avoidance implementation > # add a configuration parameter ??mapreduce.sort.avoidance??, boolean type, to turn on/off sort avoidance workflow.Two type of workflow are coexist together. > # key/value pairs emitted by map function is sorted by partition only, using a more efficient sorting algorithm: counting sort. > # map-side merge, use a kind of byte merge, which just concatenate bytes from generated spills, read in bytes, write out bytes, without overhead of key/value serialization/deserailization, comparison, which current version incurs. > # reduce can start up as soon as there is any map output available, in contrast to sort workflow which must wait until all map outputs are fetched and merged. > # map output in memory can be directly consumed by reduce.When reduce can't catch up with the speed of incoming map outputs, in-memory merge thread will kick in, merging in-memory map outputs onto disk. > # sequentially read in on-disk files to feed reduce, in contrast to currently implementation which read multiple files concurrently, result in many disk seek. Map output in memory take precedence over on disk files in feeding reduce function. > I have already implement this feature based on hadoop CDH3U3 and done some performance evaluation, you can reference to [https://github.com/hanborq/hadoop] for details. Now,I'm willing to port it into yarn. Welcome for commenting. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira