hadoop-mapreduce-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
Date Mon, 24 Mar 2014 19:06:50 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945548#comment-13945548 ]

Hudson commented on MAPREDUCE-5791:
-----------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5389 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5389/])
MAPREDUCE-5791. Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not
read disks efficiently. Contributed by Nikola Vujic. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1580994)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/FadvisedFileRegion.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/main/java/org/apache/hadoop/mapred/ShuffleHandler.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle/src/test/java/org/apache/hadoop/mapred/TestFadvisedFileRegion.java


> Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks
efficiently
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5791
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>         Attachments: MAPREDUCE-5791.patch, MAPREDUCE-5791.patch, MAPREDUCE-5791.patch
>
>
> The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the transferTo
method of a FileChannel to transfer data from disk to a socket. This performs poorly
on Windows, slower than on Linux. The reason is that the java.nio transferTo method
issues 32K IO requests all the time. On Windows, these 32K transfers are not optimal, and
we don't get the best performance from the underlying IO subsystem. To achieve better
performance when reading from the drives, we need to read data in bigger chunks, 512K for
example.
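
The idea described above (reading from the drive in bigger chunks rather than relying on FileChannel.transferTo) can be illustrated roughly as follows. This is only a minimal sketch and not the code from the attached patch: the class name ChunkedTransfer, the method transferInChunks and its parameters are hypothetical, and it assumes the target channel is in blocking mode. It uses only standard java.nio calls (positional FileChannel.read and WritableByteChannel.write).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not part of Hadoop: copies a file region to a channel
// by issuing reads of up to chunkSize bytes (for example 512 * 1024).
final class ChunkedTransfer {

    private ChunkedTransfer() {
    }

    static long transferInChunks(FileChannel source, long position, long count,
                                 WritableByteChannel target, int chunkSize)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(chunkSize);
        long remaining = count;
        long filePos = position;

        while (remaining > 0) {
            // Never ask for more bytes than the caller requested.
            buffer.clear();
            buffer.limit((int) Math.min(remaining, (long) chunkSize));

            // Fill the buffer with positional reads; stop early at end of file.
            while (buffer.hasRemaining()) {
                int n = source.read(buffer, filePos);
                if (n < 0) {
                    break;
                }
                filePos += n;
            }

            if (buffer.position() == 0) {
                break; // nothing left to read: end of file
            }

            // Drain the whole chunk to the target (assumed to be blocking).
            buffer.flip();
            int filled = buffer.remaining();
            while (buffer.hasRemaining()) {
                target.write(buffer);
            }
            remaining -= filled;
        }
        return count - remaining; // number of bytes actually transferred
    }
}
```

A caller would pass the chunk size explicitly, for instance transferInChunks(fileChannel, 0, length, socketChannel, 512 * 1024). The trade-off is that the data passes through a user-space buffer instead of the zero-copy path that transferTo may use, so whether this is faster depends on the platform and IO subsystem.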



--
This message was sent by Atlassian JIRA
(v6.2#6252)
