Date: Mon, 24 Mar 2014 19:12:48 +0000 (UTC)
From: "Chris Nauroth (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5791:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.4.0
                   3.0.0
           Status: Resolved  (was: Patch Available)

I committed this to trunk, branch-2 and branch-2.4. Nikola, thank you for reporting the issue and contributing a patch.

> Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5791
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>             Fix For: 3.0.0, 2.4.0
>
>         Attachments: MAPREDUCE-5791.patch, MAPREDUCE-5791.patch, MAPREDUCE-5791.patch
>
>
> The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the transferTo method of a FileChannel to transfer data from a disk to a socket. This is slow on Windows, slower than on Linux. The reason is that the java.nio transferTo method issues 32K IO requests all the time. On Windows, these 32K transfers are not optimal, and we don't get the best performance from the underlying IO subsystem. To achieve better performance when reading from the drives, we need to read data in bigger chunks, 512K for example.
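
For readers of the archive, the idea in the description (read from disk in bigger chunks instead of relying on FileChannel.transferTo) can be sketched roughly as below. This is only an illustration, not the committed patch: the class name ChunkedTransfer, the method transfer and the constant CHUNK_SIZE are hypothetical, and the 512K value is simply the example figure from the report. It also assumes a blocking target channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not part of Hadoop: copies a file region to a channel
// through an intermediate buffer so that each disk read is one large request.
final class ChunkedTransfer {
    // Assumed chunk size; the report only suggests "512K for example".
    static final int CHUNK_SIZE = 512 * 1024;

    // Copies up to `count` bytes starting at `position` from `src` to `target`,
    // reading at most `chunkSize` bytes per iteration.
    // Returns the number of bytes actually transferred.
    static long transfer(FileChannel src, long position, long count,
                         WritableByteChannel target, int chunkSize)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        long pos = position;
        long remaining = count;
        while (remaining > 0) {
            buf.clear();
            if (remaining < buf.capacity()) {
                buf.limit((int) remaining); // last, partial chunk
            }
            // A positional read may return fewer bytes than requested,
            // so keep reading until the chunk is full or EOF is reached.
            int read = 0;
            while (buf.hasRemaining()) {
                int n = src.read(buf, pos + read);
                if (n < 0) {
                    break; // end of file
                }
                read += n;
            }
            if (read == 0) {
                break; // nothing left to copy
            }
            buf.flip();
            // Assumes a blocking channel; a non-blocking socket would need
            // to wait for writability instead of looping here.
            while (buf.hasRemaining()) {
                target.write(buf);
            }
            pos += read;
            remaining -= read;
        }
        return count - remaining;
    }
}
```

A caller would open the file with FileChannel.open(path, StandardOpenOption.READ) and pass the socket channel as target. The trade-off is an extra copy through user space in exchange for larger disk requests, so whether it helps on a given platform would need to be measured.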