hadoop-mapreduce-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1697) Document the behavior of -file option in streaming
Date Tue, 13 Apr 2010 05:04:50 GMT
Document the behavior of -file option in streaming

                 Key: MAPREDUCE-1697
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1697
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: contrib/streaming, documentation
    Affects Versions: 0.20.1
            Reporter: Amareshwari Sriramadasu

The behavior of the -file option in streaming is not documented anywhere.
The behavior of -file is as follows:
1) All files passed through the -file option are packaged into job.jar.
2) If the -file option is used for .class or .jar files, they are unjarred on the tasktracker and
placed in ${mapred.local.dir}/taskTracker/jobcache/job_ID/jars/classes or /lib, respectively.
Symlinks to the classes and lib directories are created in the cwd of the task. The symlinks
are named "classes" and "lib", so the file names of the .class or .jar files do not appear in the
cwd of the task.
Paths to these files are automatically added to the classpath. The tricky part is that the Hadoop
framework can pick up the .class or .jar files through the classpath, but the actual mapper script
cannot. If you'd like to access these .class or .jar files inside the script, do something like
"java -cp 'lib/*:classes'" (quoted so that the shell does not expand the wildcard).
3) If the -file option is used for files other than .class or .jar (e.g., .txt or .pl), these files
are unjarred into ${mapred.local.dir}/taskTracker/jobcache/job_ID/jars/. Symlinks to these
files are created in the cwd of the task. The symlinks have the same names as the files.
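
The three rules above can be summarized in a small Python sketch. It is only an illustration of the layout described in this issue, not code from Hadoop: the function names and the job_dir parameter (standing for ${mapred.local.dir}/taskTracker/jobcache/job_ID) are hypothetical, and it assumes a .class file lands directly under classes/ and a .jar file directly under lib/.

```python
import posixpath

def _subdir_for(filename):
    """Pick the subdirectory of jars/ implied by the file extension."""
    ext = posixpath.splitext(filename)[1]
    if ext == ".class":
        return "classes"
    if ext == ".jar":
        return "lib"
    return ""  # every other file is unjarred directly into jars/

def unjarred_path(filename, job_dir):
    """Where a file shipped with -file is assumed to be unpacked.

    job_dir is a hypothetical stand-in for
    ${mapred.local.dir}/taskTracker/jobcache/job_ID.
    """
    name = posixpath.basename(filename)
    return posixpath.join(job_dir, "jars", _subdir_for(name), name)

def path_from_cwd(filename):
    """Relative path a streaming script would use from the task cwd.

    .class and .jar files are reached through the "classes" and "lib"
    symlinks; every other file has a symlink with its own name.
    """
    name = posixpath.basename(filename)
    return posixpath.join(_subdir_for(name), name)

def java_command(main_class):
    """Argument list for starting java from inside the mapper script.

    "lib/*" is a JVM classpath wildcard that matches the shipped jars;
    "classes" is added as a plain directory of .class files.
    """
    return ["java", "-cp", "lib/*:classes", main_class]
```

For example, under these assumptions path_from_cwd("dict.txt") gives "dict.txt", while path_from_cwd("util.jar") gives "lib/util.jar". Passing the argument list from java_command to subprocess avoids shell expansion of the wildcard, so no extra quoting is needed there.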


