Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Delivered-To: mailing list mapreduce-issues@hadoop.apache.org
Date: Wed, 9 Aug 2017 22:43:00 +0000 (UTC)
From: "Ravi Prakash (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-6923) Optimize MapReduce Shuffle I/O for small partitions
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated MAPREDUCE-6923:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0-beta1
                   2.9.0
           Status: Resolved  (was: Patch Available)

Committed to branch-2 and trunk. Thanks a lot for your contribution, Robert! Good luck with your research; I hope to hear back from you when you publish, and I look forward to more valuable contributions from you.
> Optimize MapReduce Shuffle I/O for small partitions
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-6923
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6923
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>         Environment: Observed in Hadoop 2.7.3 and above (judging from the source code of later versions), and Ubuntu 16.04.
>            Reporter: Robert Schmidtke
>            Assignee: Robert Schmidtke
>             Fix For: 2.9.0, 3.0.0-beta1
>
>      Attachments: MAPREDUCE-6923.00.patch, MAPREDUCE-6923.01.patch
>
>
> When a job configuration results in small partitions being read by each reducer from each mapper (e.g. 65 kilobytes as in my setup: a [TeraSort|https://github.com/apache/hadoop/blob/branch-2.7.3/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/main/java/org/apache/hadoop/examples/terasort/TeraSort.java] of 256 gigabytes using 2048 mappers and 2048 reducers), and transferTo is disabled via
> {code:xml}
> <property>
>   <name>mapreduce.shuffle.transferTo.allowed</name>
>   <value>false</value>
> </property>
> {code}
> then the default setting of
> {code:xml}
> <property>
>   <name>mapreduce.shuffle.transfer.buffer.size</name>
>   <value>131072</value>
> </property>
> {code}
> results in almost 100% read overhead during the shuffle in YARN, because for every 65K that are needed, 128K are read.
> I propose a fix in [FadvisedFileRegion.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java#L114] as follows:
> {code:java}
> ByteBuffer byteBuffer = ByteBuffer.allocate(Math.min(this.shuffleBufferSize,
>     trans > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) trans));
> {code}
> e.g. [here|https://github.com/apache/hadoop/compare/branch-2.7.3...robert-schmidtke:adaptive-shuffle-buffer]. This sets the shuffle buffer size to the minimum of the buffer size specified in the configuration (128K by default) and the actual partition size (65K on average in my setup). In my benchmarks this reduced the read overhead in YARN from about 100% (255 additional gigabytes, as described above) down to about 18% (about 45 additional gigabytes). The runtime of the job remained the same in my setup.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org
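
For context, below is a minimal, self-contained sketch of the buffer-capping idea behind the patch. It is not the actual FadvisedFileRegion implementation: the class and method names (AdaptiveShuffleCopy, copyPartition) are invented for illustration, and only the ByteBuffer.allocate(Math.min(...)) line is taken from the patch itself.

{code:java}
// Illustrative sketch only -- not the actual Hadoop shuffle code.
// Shows the idea from MAPREDUCE-6923: cap the copy buffer at the number of
// bytes actually left to transfer, so a ~65K partition no longer forces a
// 128K read. The names AdaptiveShuffleCopy and copyPartition are made up.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class AdaptiveShuffleCopy {

  private final int shuffleBufferSize; // e.g. 131072 from the configuration

  public AdaptiveShuffleCopy(int shuffleBufferSize) {
    this.shuffleBufferSize = shuffleBufferSize;
  }

  /** Copies count bytes starting at position from fileChannel to target. */
  long copyPartition(FileChannel fileChannel, WritableByteChannel target,
                     long position, long count) throws IOException {
    long trans = count;
    // Key line from the patch: never allocate more than is left to transfer.
    ByteBuffer byteBuffer = ByteBuffer.allocate(Math.min(this.shuffleBufferSize,
        trans > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) trans));

    long pos = position;
    long written = 0;
    while (trans > 0) {
      // Shrink the read window for the final chunk as well.
      if (trans < byteBuffer.capacity()) {
        byteBuffer.limit((int) trans);
      }
      int read = fileChannel.read(byteBuffer, pos);
      if (read <= 0) {
        break;
      }
      byteBuffer.flip();
      while (byteBuffer.hasRemaining()) {
        written += target.write(byteBuffer);
      }
      byteBuffer.clear();
      pos += read;
      trans -= read;
    }
    return written;
  }
}
{code}

With the default 131072-byte buffer and partitions averaging 65K, the allocation shrinks to the partition size, which is what brings the reported read overhead down from roughly 100% to roughly 18% in the benchmarks described above.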