hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4933) MR1 final merge asks for length of file it just wrote before flushing it
Date Thu, 10 Jan 2013 22:28:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550509#comment-13550509 ]

Robert Joseph Evans commented on MAPREDUCE-4933:
------------------------------------------------

Can we really lose map data? If so, this is a Blocker, not a Major.
                
> MR1 final merge asks for length of file it just wrote before flushing it
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4933
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4933
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, task
>    Affects Versions: 1.1.1
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-4933-branch-1.patch
>
>
> createKVIterator in ReduceTask contains the following code:
> {code}
>           try {
>             Merger.writeFile(rIter, writer, reporter, job);
>             addToMapOutputFilesOnDisk(fs.getFileStatus(outputPath));
>           } catch (Exception e) {
>             if (null != outputPath) {
>               fs.delete(outputPath, true);
>             }
>             throw new IOException("Final merge failed", e);
>           } finally {
>             if (null != writer) {
>               writer.close();
>             }
>           }
> {code}
> Merger#writeFile() does not close the file after writing it, so when fs.getFileStatus()
> is called on it, it may not return the correct length.  This causes bad accounting further
> down the line, which can lead to map output data being lost.
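
The ordering problem described above can be reproduced outside Hadoop. The following is a minimal sketch using only java.io and java.nio, not the ReduceTask code and not the attached patch: the class name FlushBeforeLength, the method writeAndMeasure, and the temp-file prefix are all hypothetical. It assumes a buffered writer whose buffer is larger than the payload, so nothing reaches the file until the stream is closed, and it compares the length reported before and after the close.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class FlushBeforeLength {

    /**
     * Writes payloadBytes zero bytes through a buffered stream and returns
     * {length seen before close, length seen after close}.
     * Hypothetical helper for illustration only; not part of Hadoop.
     */
    static long[] writeAndMeasure(int payloadBytes) throws IOException {
        Path path = Files.createTempFile("final-merge", ".out");
        try {
            long before;
            // The buffer is one byte larger than the payload, so the write
            // below stays in memory until the stream is flushed or closed.
            try (OutputStream out = new BufferedOutputStream(
                    Files.newOutputStream(path), payloadBytes + 1)) {
                out.write(new byte[payloadBytes]);
                // Analogous to asking for the file status before the writer is closed.
                before = Files.size(path);
            }
            // The stream is closed here, so the buffered bytes are on disk.
            long after = Files.size(path);
            return new long[] {before, after};
        } finally {
            Files.deleteIfExists(path);
        }
    }

    public static void main(String[] args) throws IOException {
        long[] sizes = writeAndMeasure(1024);
        System.out.println("before close: " + sizes[0] + ", after close: " + sizes[1]);
    }
}
```

With a 1024-byte payload, the length read before the close is 0 and the length read after it is 1024. The general remedy this suggests is to close (or at least flush) the writer before asking for the length; whether the attached patch takes exactly that approach is not stated in this message.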

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
