hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4568) Throw "early" exception when duplicate files or archives are found in distributed cache
Date Wed, 10 Oct 2012 15:05:03 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473277#comment-13473277 ]

Jason Lowe commented on MAPREDUCE-4568:
---------------------------------------

bq. In addition, it would be better if there were a way of checking whether some file has
already been added to the DC.

Would adding an interface that lets the client query the contents of the DC before job
submission be sufficient?  This seems like a reasonable enhancement that doesn't overlap with
existing interfaces.  Or do you think it's still a requirement to throw early when an add
causes a collision?  Throwing would require a new interface for adding to the DC, one that
overlaps with existing functionality and adds to the pile of APIs we already have for adding
things to the DC.
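
To make the query idea concrete, here is a minimal sketch of what the client-side check could look like. CacheView and its methods are hypothetical names invented for illustration, not Hadoop classes, and treating two entries as the same when their normalized URIs are equal is an assumption; the rule Hadoop actually applies may differ.

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical client-side view of the distributed cache; not a Hadoop class.
class CacheView {
    // Assumption: entries are identified by their normalized URI.
    private final Set<URI> entries = new LinkedHashSet<>();

    // Records an entry the way an existing, non-throwing add call would.
    void add(URI uri) {
        entries.add(uri.normalize());
    }

    // The proposed query: has this resource already been added?
    boolean contains(URI uri) {
        return entries.contains(uri.normalize());
    }
}
```

With a query like this, the client decides for itself whether to skip the add or abort, and none of the existing add methods has to change its signature.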
                
> Throw "early" exception when duplicate files or archives are found in distributed cache
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4568
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Mohammad Kamrul Islam
>            Assignee: Arun C Murthy
>
> According to #MAPREDUCE-4549, Hadoop 2.x throws an exception if duplicates are found in
> cacheFiles or cacheArchives. The exception is thrown during job submission.
> This JIRA is to throw the exception "early", when a duplicate is first added to the
> Distributed Cache through addCacheFile or addFileToClassPath.
> This will help the client decide whether to fail fast or continue without the duplicated
> entries.
> Alternatively, Hadoop could provide a knob that lets the user choose whether to throw an
> error (the coming behavior) or silently ignore duplicates (the old behavior).
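
The "knob" alternative in the description can be sketched roughly as follows. DuplicatePolicy and CheckedCache are hypothetical names made up for this illustration and do not exist in Hadoop; the sketch also assumes that a duplicate means an equal URI, which may not match the check Hadoop performs at job submission.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical knob; neither this enum nor the class below exists in Hadoop.
enum DuplicatePolicy { FAIL, IGNORE }

class CheckedCache {
    private final DuplicatePolicy policy;
    private final Set<URI> seen = new HashSet<>();
    private final List<URI> files = new ArrayList<>();

    CheckedCache(DuplicatePolicy policy) {
        this.policy = policy;
    }

    // Returns true if the entry was added, false if a duplicate was
    // skipped under IGNORE. Under FAIL a duplicate throws at add time.
    boolean addCacheFile(URI uri) {
        if (!seen.add(uri)) {
            if (policy == DuplicatePolicy.FAIL) {
                throw new IllegalArgumentException("Duplicate cache entry: " + uri);
            }
            return false;
        }
        files.add(uri);
        return true;
    }

    // Defensive copy of the entries accepted so far, in insertion order.
    List<URI> getFiles() {
        return new ArrayList<>(files);
    }
}
```

Under FAIL the client sees the problem at the call that introduced the duplicate instead of at job submission; under IGNORE the old silent behavior is kept, and the boolean return value still tells the caller that the entry was skipped.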

