hadoop-mapreduce-issues mailing list archives

From "Hudson (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3252) MR2: Map tasks rewrite data once even if output fits in sort buffer
Date Tue, 25 Oct 2011 13:18:33 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13135016#comment-13135016 ]

Hudson commented on MAPREDUCE-3252:
-----------------------------------

Integrated in Hadoop-Mapreduce-trunk #871 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/871/])
    MAPREDUCE-3252. Fix map tasks to not rewrite data an extra time when map output fits in spill buffer. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1188424
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/MapTask.java

                
> MR2: Map tasks rewrite data once even if output fits in sort buffer
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3252
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3252
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, task
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.23.0
>
>         Attachments: mr-3252.txt, mr-3252.txt
>
>
> I found that, even if the output of a map task fits entirely in its sort buffer, the task rewrote the output in full rather than just renaming the first spill into place. This is due to RawLocalFileSystem.rename() falling back to a copy if renameTo() fails. The first rename attempt failed because nothing had called mkdir for the output directory yet.
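
The failure mode described above can be reproduced with plain java.io.File, without any Hadoop classes. The following is a minimal sketch, not the committed patch: the class RenameSketch and both method names are hypothetical, and the copy fallback only approximates the behavior the description attributes to RawLocalFileSystem.rename(). It assumes a platform where File.renameTo() returns false when the destination's parent directory does not exist, which is typical for local file systems.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

public class RenameSketch {

    /**
     * Hypothetical stand-in for a rename that silently falls back to
     * copy-and-delete when renameTo() fails. Returns which path was taken.
     */
    static String renameWithCopyFallback(File src, File dst) throws IOException {
        if (src.renameTo(dst)) {
            return "renamed";
        }
        // Fallback path: every byte of the spill is written a second time.
        // Assumption: the fallback creates the missing parent directory itself.
        File parent = dst.getParentFile();
        if (parent != null) {
            parent.mkdirs();
        }
        Files.copy(src.toPath(), dst.toPath(), StandardCopyOption.REPLACE_EXISTING);
        Files.delete(src.toPath());
        return "copied";
    }

    /**
     * Creates the destination directory first, then renames, and fails loudly
     * instead of falling back to a copy.
     */
    static void renameIntoPlace(File src, File dst) throws IOException {
        File parent = dst.getParentFile();
        if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
            throw new IOException("Unable to create directory " + parent);
        }
        if (!src.renameTo(dst)) {
            throw new IOException("Unable to rename " + src + " to " + dst);
        }
    }

    public static void main(String[] args) throws IOException {
        File work = Files.createTempDirectory("spill-demo").toFile();

        // Case 1: output directory missing, so the rename is expected to fail
        // and the data gets rewritten by the fallback.
        File spill0 = new File(work, "spill0.out");
        Files.write(spill0.toPath(), new byte[] {1, 2, 3});
        File out0 = new File(new File(work, "output0"), "file.out");
        System.out.println(renameWithCopyFallback(spill0, out0));

        // Case 2: directory created up front, so a plain rename suffices.
        File spill1 = new File(work, "spill1.out");
        Files.write(spill1.toPath(), new byte[] {1, 2, 3});
        File out1 = new File(new File(work, "output1"), "file.out");
        renameIntoPlace(spill1, out1);
        System.out.println(out1.length());
    }
}
```

Throwing on a failed rename, rather than copying, keeps an unexpected rewrite from going unnoticed. Whether the actual change to MapTask.java takes this exact approach is not stated in this message.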

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
