hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4568) Throw "early" exception when duplicate files or archives are found in distributed cache
Date Wed, 10 Oct 2012 14:49:02 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473271#comment-13473271 ]

Robert Joseph Evans commented on MAPREDUCE-4568:

Adding a true duplicate (the exact same file, multiple times) to the dist cache will not result
in an error under YARN.  The MR client will just dedupe them before submitting the request
to YARN.  The issue arises when different files both map to the same key in
the dist cache map (the key is the name of the symlink created in the working directory of
the task/container).  That is where it will throw an exception under 2.0.
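
To make the distinction concrete, here is a minimal sketch of the two cases. It is not
Hadoop code: the class DistCacheLinkCheck, its methods, and the naming rule (URI fragment
if present, otherwise the last path component) are assumptions made for illustration only.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the dist cache map; not part of Hadoop.
final class DistCacheLinkCheck {

    // Assumed naming rule: the symlink name is the URI fragment if one is
    // given, otherwise the last component of the path.
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Builds the symlink-name -> file map. Exact duplicates collapse to one
    // entry; two different files with the same symlink name are rejected.
    static Map<String, URI> resolve(List<URI> cacheFiles) {
        Map<String, URI> byLink = new LinkedHashMap<>();
        for (URI uri : cacheFiles) {
            String name = linkName(uri);
            URI previous = byLink.putIfAbsent(name, uri);
            if (previous != null && !previous.equals(uri)) {
                throw new IllegalArgumentException(String.format(
                    "%s and %s both map to symlink name '%s'", previous, uri, name));
            }
        }
        return byLink;
    }

    public static void main(String[] args) {
        // True duplicate: the same file twice is deduped without error.
        System.out.println(resolve(List.of(
            URI.create("hdfs://nn/a/lib.jar"),
            URI.create("hdfs://nn/a/lib.jar"))).keySet());
        // Different files, same symlink name: this call throws.
        resolve(List.of(
            URI.create("hdfs://nn/a/lib.jar"),
            URI.create("hdfs://nn/b/lib.jar")));
    }
}
```

In this sketch the first call keeps a single lib.jar entry, while the second fails because
/a/lib.jar and /b/lib.jar would need the same symlink in the working directory.
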
> Throw "early" exception when duplicate files or archives are found in distributed cache
> ---------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-4568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Arun C Murthy
> According to #MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are found in cacheFiles
> or cacheArchives. The exception is thrown during job submission.
> This JIRA is to throw the exception "early", when the duplicate is first added to the Distributed
> Cache through addCacheFile or addFileToClassPath.
> It will help the client decide whether to fail fast or continue w/o the duplicated
> Alternatively, Hadoop could provide a knob where the user chooses whether to throw an error
> (the coming behavior) or silently ignore the duplicate (the old behavior).
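
The proposal in the issue description (check at add time, with a knob) could look roughly
like the sketch below. Everything here is hypothetical: EarlyCheckCache, the OnConflict
enum, and this addCacheFile are illustrative stand-ins, not Hadoop's API, and the IGNORE
branch assumes "silently ignore" means the first entry wins.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical add-time check; does not reflect Hadoop's actual classes.
final class EarlyCheckCache {

    // The "knob": fail on a conflicting entry, or silently drop it.
    enum OnConflict { FAIL, IGNORE }

    private final Map<String, URI> byLink = new LinkedHashMap<>();
    private final OnConflict policy;

    EarlyCheckCache(OnConflict policy) {
        this.policy = policy;
    }

    // Assumed naming rule: URI fragment if present, else last path component.
    static String linkName(URI uri) {
        String fragment = uri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Returns true if the file is registered (or was already registered as an
    // exact duplicate), false if it was ignored because of a conflict.
    boolean addCacheFile(URI uri) {
        String name = linkName(uri);
        URI previous = byLink.putIfAbsent(name, uri);
        if (previous == null || previous.equals(uri)) {
            return true;
        }
        if (policy == OnConflict.FAIL) {
            // Fail at add time instead of at job submission.
            throw new IllegalArgumentException(String.format(
                "%s conflicts with %s for symlink name '%s'", uri, previous, name));
        }
        return false; // IGNORE: assumed to keep the first entry
    }

    int size() {
        return byLink.size();
    }
}
```

With OnConflict.FAIL the caller sees the problem on the offending add call and can choose
to fail fast; with OnConflict.IGNORE the conflicting file is dropped and the job proceeds.
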
