hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1641) Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives
Date Mon, 26 Apr 2010 04:28:32 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860802#action_12860802 ]

Amareshwari Sriramadasu commented on MAPREDUCE-1641:

The following code change in JobClient does not look correct:
@@ -767,6 +766,9 @@ public class JobClient extends Configured implements MRConstants, Tool
                (new Path("file:///" +  binaryTokenFilename), jobCopy);

+          // First we check whether the cached archives and files are legal.
+          TrackerDistributedCacheManager.validate(jobCopy);
           copyAndConfigureFiles(jobCopy, submitJobDir);

copyAndConfigureFiles adds the files and archives given through the command-line options
-files, -archives, and -libjars. Because the patch calls validate before copyAndConfigureFiles,
those entries are never validated. The validate call should come after the call to
copyAndConfigureFiles.
As the patch stands, a test that adds the same file through both the -files and -archives
options would fail, because the duplicate would go undetected.
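
To illustrate the ordering problem, here is a minimal standalone sketch. It does not use the
Hadoop classes: a plain Map stands in for the job configuration, and CacheValidationOrder,
duplicates, and addCommandLineEntries are hypothetical names. addCommandLineEntries only
mimics the effect of copyAndConfigureFiles described above; it is not the real method.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Standalone sketch: a plain Map stands in for the job configuration.
class CacheValidationOrder {
    static final String FILES = "mapred.cache.files";
    static final String ARCHIVES = "mapred.cache.archives";

    // Parses a comma-separated URI list; a missing or empty value yields an empty set.
    static Set<URI> parse(String value) {
        Set<URI> uris = new LinkedHashSet<>();
        if (value == null) {
            return uris;
        }
        for (String s : value.split(",")) {
            if (!s.trim().isEmpty()) {
                uris.add(URI.create(s.trim()));
            }
        }
        return uris;
    }

    // Returns every URI that appears in both the files list and the archives list.
    static List<URI> duplicates(Map<String, String> conf) {
        Set<URI> files = parse(conf.get(FILES));
        List<URI> dups = new ArrayList<>();
        for (URI archive : parse(conf.get(ARCHIVES))) {
            if (files.contains(archive)) {
                dups.add(archive);
            }
        }
        return dups;
    }

    // Hypothetical stand-in for the effect of copyAndConfigureFiles: appends
    // command-line entries to the two configuration lists.
    static void addCommandLineEntries(Map<String, String> conf, String files, String archives) {
        conf.merge(FILES, files, (old, add) -> old + "," + add);
        conf.merge(ARCHIVES, archives, (old, add) -> old + "," + add);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Validating first sees nothing, because the entries are not in conf yet.
        System.out.println("before: " + duplicates(conf));
        addCommandLineEntries(conf, "hdfs://nn/lib/a.jar", "hdfs://nn/lib/a.jar");
        // Validating afterwards finds the URI that is in both lists.
        System.out.println("after: " + duplicates(conf));
    }
}
```

In this sketch the check only catches the duplicate when it runs after the command-line
entries have been added, which is the ordering suggested for validate in JobClient.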

> Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives
> ------------------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-1641
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1641
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distributed-cache
>            Reporter: Amareshwari Sriramadasu
>            Assignee: Dick King
>             Fix For: 0.22.0
>         Attachments: BZ-3539321--off-0-20-101--2010-04-20.patch, duped-files-archives--off-0-20-101--2010-04-21.patch,
> The behavior of mapred.cache.files and mapred.cache.archives differs during localization
> in the following way:
> If a jar file is added to mapred.cache.files, it will be localized on the TaskTracker
> under a unique path.
> If a jar file is added to mapred.cache.archives, it will be localized under a unique
> path in a directory named after the jar file, and will be unarchived under the same directory.
> If the same jar file is passed for both configurations, the behavior is undefined. Thus
> the job submission should fail.
> Currently, since the distributed cache processes files before archives, the jar file will
> just be localized and not unarchived.

