hadoop-mapreduce-issues mailing list archives

From "Nikola Vujic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5791) Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
Date Mon, 24 Mar 2014 17:44:45 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945407#comment-13945407 ]

Nikola Vujic commented on MAPREDUCE-5791:

Hi [~cnauroth],

I have applied all fixes except for the if-else in {{FadvisedFileRegion}}. The edge case is
reading the last chunk of data from a file. {{customShuffleTransfer}} must read {{actualCount}}
bytes from a file, starting from {{this.position}}. This is done in the while loop, and the
{{trans}} variable tracks the number of bytes remaining. {{fileChannel.read}} returns the
number of bytes read. For the last chunk of data this number can be higher than the number of
bytes still needed. In that case we cannot use {{Buffer#flip}}, because it sets the limit to
the current position and would therefore include the extra bytes.

For example, let's suppose that we have a 128-byte buffer and that we want to read 200 bytes
starting at position 1000 in a file (file size bigger than 1256 bytes). At least two iterations
of the while loop will be done:
1. Iteration 1: {{fileChannel.read(byteBuffer, 1000+0)}} => 128 bytes are read => all 128 bytes are needed => target.write
2. Iteration 2: {{fileChannel.read(byteBuffer, 1000+128)}} => 128 bytes are read because the file is big enough, but only the first 72 bytes are needed => {{byteBuffer.limit(72)}} => target.write

In the else block we don't set the limit to the current position but to a number lower than
the current position. Updating the local {{position}} variable is needed in order to read data
starting from the proper position in the next iterations of the loop. Does it make sense?
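
To make the example above concrete, here is a minimal, self-contained sketch of such a read loop. It is not the code from the patch: the class name {{ChunkedTransferSketch}}, the method {{transfer}} and its {{bufferSize}} parameter are hypothetical names chosen for illustration, and only the {{trans}} / {{position}} bookkeeping and the if-else on the read size follow the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkedTransferSketch {

    /**
     * Copies up to count bytes from fileChannel, starting at startPosition,
     * into target, reading through a buffer of bufferSize bytes.
     * Returns the number of bytes actually written.
     * (Hypothetical helper; an illustration, not the Hadoop implementation.)
     */
    public static long transfer(FileChannel fileChannel, WritableByteChannel target,
                                long startPosition, long count, int bufferSize)
            throws IOException {
        ByteBuffer byteBuffer = ByteBuffer.allocate(bufferSize);
        long trans = count;             // bytes still to be transferred
        long position = startPosition;  // local file position for the next read
        int readSize;

        while (trans > 0 && (readSize = fileChannel.read(byteBuffer, position)) > 0) {
            if (readSize < trans) {
                // Every byte that was read is needed, so flip() is safe here:
                // it sets limit = current position, position = 0.
                trans -= readSize;
                position += readSize;
                byteBuffer.flip();
            } else {
                // Last chunk: the read may have returned more than we need.
                // Set the limit to the remaining count, not to the current position.
                byteBuffer.limit((int) trans);
                byteBuffer.position(0);
                position += trans;
                trans = 0;
            }
            // A single write() call is not guaranteed to drain the buffer.
            while (byteBuffer.hasRemaining()) {
                target.write(byteBuffer);
            }
            byteBuffer.clear();
        }
        return count - trans;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("chunked-transfer", ".bin");
        try {
            byte[] data = new byte[2000];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(file, data);

            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ);
                 WritableByteChannel out = Channels.newChannel(sink)) {
                // Same numbers as the example: 200 bytes from position 1000, 128-byte buffer.
                long copied = transfer(in, out, 1000, 200, 128);
                System.out.println(copied + " bytes copied, sink holds " + sink.size());
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With the numbers from the example, the second iteration takes the else branch and writes only the first 72 bytes of the buffer, so exactly 200 bytes reach the target.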

Regarding the resource leak in the test, I applied the change you suggested and did the same
with the {{fileRegion}} in order to eliminate one try block.

I changed {{customShuffleTransferCornerCases}} to private. It was public.

> Shuffle phase is slow in Windows - FadviseFileRegion::transferTo does not read disks efficiently
> ------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-5791
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5791
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>         Attachments: MAPREDUCE-5791.patch, MAPREDUCE-5791.patch
> The transferTo method in org.apache.hadoop.mapred.FadvisedFileRegion uses the transferTo
> method of a FileChannel to transfer data from a disk to a socket. This is slow on Windows,
> slower than on Linux. The reason is that the java.nio transferTo method issues 32K IO
> requests all the time. On Windows, these 32K transfers are not optimal and we don't get the
> best performance from the underlying IO subsystem. In order to achieve better performance
> when reading from the drives, we need to read data in bigger chunks, 512K for

This message was sent by Atlassian JIRA
