hadoop-mapreduce-issues mailing list archives

From "Nikola Vujic (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
Date Tue, 11 Mar 2014 14:08:01 GMT
Nikola Vujic created MAPREDUCE-5791:
---------------------------------------

             Summary: Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
                 Key: MAPREDUCE-5791
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
             Project: Hadoop Map/Reduce
          Issue Type: Bug
            Reporter: Nikola Vujic
            Assignee: Nikola Vujic


The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the transferTo method
of a FileChannel to transfer data from a disk to a socket. This performs slowly on Windows,
slower than on Linux. The reason is that the java.nio transferTo implementation issues 32K
IO requests throughout the transfer. On Windows, these 32K transfers are not optimal and we don't get
the best performance from the underlying IO subsystem. To achieve better performance
when reading from the drives, we need to read data in bigger chunks, 512K for example.
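
As a rough illustration of the idea above (not the actual patch for this issue), reading in bigger chunks could be sketched as follows: the data is read from the FileChannel into a user-space buffer of a chosen size and then written to the target channel, instead of calling FileChannel.transferTo. The class name ChunkedTransfer, the method transfer and the constant CHUNK_SIZE are hypothetical names introduced for this example only, and the sketch assumes a blocking target channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class ChunkedTransfer {
    // 512K, the example chunk size mentioned in the report.
    static final int CHUNK_SIZE = 512 * 1024;

    /**
     * Copies up to count bytes from src, starting at position, to target,
     * issuing reads of at most chunkSize bytes each.
     * Returns the number of bytes actually copied (less than count at EOF).
     * Assumes target is in blocking mode; a non-blocking target would make
     * the inner write loop spin.
     */
    static long transfer(FileChannel src, long position, long count,
                         WritableByteChannel target, int chunkSize)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        long transferred = 0;
        while (transferred < count) {
            buf.clear();
            long remaining = count - transferred;
            if (remaining < buf.capacity()) {
                // Do not read past the requested range.
                buf.limit((int) remaining);
            }
            // Positional read: does not move the channel's own position.
            int n = src.read(buf, position + transferred);
            if (n <= 0) {
                break; // end of file reached
            }
            buf.flip();
            while (buf.hasRemaining()) {
                target.write(buf);
            }
            transferred += n;
        }
        return transferred;
    }
}
```

A caller would pass ChunkedTransfer.CHUNK_SIZE as the last argument to get 512K reads. Unlike transferTo, this approach copies the data through a user-space buffer, so whether it is actually faster on a given platform would need to be measured.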



--
This message was sent by Atlassian JIRA
(v6.2#6252)
