hadoop-mapreduce-issues mailing list archives

From "Siqi Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6275) Potential race condition of mutiple file system rename in FileOutputCommiter v2
Date Mon, 16 Mar 2015 21:36:38 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364000#comment-14364000 ]

Siqi Li commented on MAPREDUCE-6275:
------------------------------------

In FileOutputCommitter v2, multiple reducers may call commitTask at the same time. If the
reducers' outputs contain subdirectories with the same name, the concurrent commits can corrupt
the layout of the final output directory.

For example, suppose reducerA has output Dir1/File1 and reducerB has output Dir1/File2. Since the
final output directory doesn't have Dir1 yet, both reducers may call rename at the same time.
Whichever rename runs second finds that Dir1 now exists, so its source directory is moved inside
it. This would end up with a weird output directory structure like Dir1/Dir1/File1 or
Dir1/Dir1/File2.
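
The interleaving described above can be replayed deterministically on a local file system. This
is only a rough sketch, not the actual FileOutputCommitter code: the class RenameRaceSketch and
the helper renameLikeHdfs are hypothetical, and the helper merely assumes that a rename onto an
existing directory moves the source inside it. Both "reducers" check for Dir1 before either one
renames, which is the window the race needs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class RenameRaceSketch {

    // Hypothetical stand-in for a rename with the assumed semantics:
    // if dst is an existing directory, src is moved *into* dst.
    static void renameLikeHdfs(Path src, Path dst) throws IOException {
        if (Files.isDirectory(dst)) {
            Files.move(src, dst.resolve(src.getFileName()));
        } else {
            Files.move(src, dst);
        }
    }

    // Replays the bad interleaving step by step and returns the job output directory.
    static Path simulate() throws IOException {
        Path root = Files.createTempDirectory("commit-race");
        Path out = Files.createDirectory(root.resolve("out"));

        // Task attempt directories for reducerA and reducerB.
        Path dirA = Files.createDirectories(root.resolve("taskA").resolve("Dir1"));
        Path dirB = Files.createDirectories(root.resolve("taskB").resolve("Dir1"));
        Files.createFile(dirA.resolve("File1"));
        Files.createFile(dirB.resolve("File2"));

        Path target = out.resolve("Dir1");

        // Both reducers look at the final output before either has renamed.
        boolean aSeesDir1 = Files.exists(target);
        boolean bSeesDir1 = Files.exists(target);

        // Each acts on its stale check and takes the "just rename" path.
        if (!aSeesDir1) {
            renameLikeHdfs(dirA, target); // creates out/Dir1 containing File1
        }
        if (!bSeesDir1) {
            renameLikeHdfs(dirB, target); // out/Dir1 exists now, so Dir1 is nested
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path out = simulate();
        try (Stream<Path> paths = Files.walk(out)) {
            paths.filter(Files::isRegularFile)
                 .forEach(p -> System.out.println(out.relativize(p)));
        }
    }
}
```

With reducerA's rename applied first, the output directory holds Dir1/File1 and Dir1/Dir1/File2;
swapping the order of the two renames nests File1 instead.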

Therefore, we should remove the FileSystem.rename call from the FileOutputCommitter#mergePaths method.

> Potential race condition of mutiple file system rename in FileOutputCommiter v2
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6275
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6275
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Siqi Li
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
