hadoop-mapreduce-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
Date Thu, 13 Mar 2014 19:17:45 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933916#comment-13933916 ]

Chris Nauroth commented on MAPREDUCE-5791:
------------------------------------------

I think I found the root cause of the problem.  The JDK does not really implement a zero-copy
transfer on Windows.  I checked the source code for OpenJDK 6, 7 and 8, and they all look
like this in FileChannelImpl.c:

{code}
JNIEXPORT jlong JNICALL
Java_sun_nio_ch_FileChannelImpl_transferTo0(JNIEnv *env, jobject this,
                                            jint srcFD,
                                            jlong position, jlong count,
                                            jint dstFD)
{
    return IOS_UNSUPPORTED;
}
{code}

On Linux, this function delegates to the {{sendfile}} syscall.  It's a shame that an equivalent
isn't implemented in the Windows JDK, because it's theoretically possible to do a zero-copy
transfer on Windows using {{TransmitFile}}:

http://msdn.microsoft.com/en-us/library/windows/desktop/ms740565(v=vs.85).aspx

I think it's fine to proceed with this buffer-copying patch, but I also wonder if we'd see
even better performance if we could figure out a JNI call to {{TransmitFile}}.
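
For illustration only, the buffer-copying approach could look roughly like the sketch below. This is not the code from the attached patch: the class name {{ChunkedTransfer}}, the method signature and the 512K default are assumptions made for the example (512K is simply the figure the issue description mentions), and it assumes the target channel is in blocking mode.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not part of Hadoop or of the attached patch.
final class ChunkedTransfer {
  // Assumed default; the issue description suggests 512K as an example.
  static final int DEFAULT_CHUNK_SIZE = 512 * 1024;

  /**
   * Copies up to count bytes from src, starting at position, into target,
   * issuing reads of at most chunkSize bytes instead of relying on
   * FileChannel.transferTo. Returns the number of bytes copied.
   */
  static long transferTo(FileChannel src, long position, long count,
      WritableByteChannel target, int chunkSize) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(chunkSize);
    long transferred = 0;
    while (transferred < count) {
      buf.clear();
      long remaining = count - transferred;
      if (remaining < buf.capacity()) {
        // Do not read past the requested region.
        buf.limit((int) remaining);
      }
      // Positional read: does not change the channel's own position.
      int n = src.read(buf, position + transferred);
      if (n <= 0) {
        break; // end of file reached before count bytes were available
      }
      buf.flip();
      // Assumes a blocking target; a non-blocking socket would need
      // the caller to handle partial writes instead of looping here.
      while (buf.hasRemaining()) {
        target.write(buf);
      }
      transferred += n;
    }
    return transferred;
  }
}
```

Unlike a {{TransmitFile}}-based path, this still copies every byte through a user-space buffer, so it only addresses the IO request size, not the missing zero-copy transfer.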

I'll review the patch in more detail later.  From a quick glance, it looked like there were
a few cases of indentation using 4 spaces instead of 2 spaces (the project standard).

> Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5791
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>         Attachments: MAPREDUCE-5791.patch
>
>
> The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the transferTo
> method of a FileChannel to transfer data from disk to a socket. This performs slowly
> on Windows, slower than on Linux. The reason is that the java.nio transferTo method
> always issues 32K IO requests. On Windows, these 32K transfers are not optimal and
> we don't get the best performance from the underlying IO subsystem. To achieve better
> performance when reading from the drives, we need to read data in bigger chunks, 512K for
> example.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
