hadoop-mapreduce-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (MAPREDUCE-440) -archives option in JobConf doesn't support symlink for an uncompressed archive directory
Date Fri, 07 May 2010 05:25:48 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu resolved MAPREDUCE-440.

    Resolution: Duplicate

Fixed by MAPREDUCE-787

> -archives option in JobConf doesn't support symlink for an uncompressed archive directory
> -----------------------------------------------------------------------------------------
>                 Key: MAPREDUCE-440
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-440
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Derek Wollenstein
>            Priority: Minor
> According to http://hadoop.apache.org/core/docs/r0.19.1/streaming.html#Large+files+and+archives+in+Hadoop+Streaming,
> it should be possible to have an archive uncompressed into the working directory of a job
> with a given alias. The documentation there says:
> "The -archives option allows you to copy jars locally to the cwd of tasks and automatically
> unjar the files. For example:
> -archives hdfs://host:fs_port/user/testfile.jar#testlink3
> In the example above, a symlink testlink3 is created in the current working directory
> of tasks. This symlink points to the directory that stores the unjarred contents of the uploaded
> jar file."
> This feature currently breaks because the entire string, including the alias, is validated
> as a filename by the GenericOptionsParser.
> I've pasted a stack trace (with modified filenames/hosts) below.
> java.io.FileNotFoundException: File hdfs://host:fs_port/user/testfile.jar#testlink3 does not exist.
>         at org.apache.hadoop.util.GenericOptionsParser.validateFiles(GenericOptionsParser.java:319)
>         at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:247)
>         at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:345)
>         at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:136)
>         at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:121)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:59)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:32)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> This breaks a number of jobs that worked with the cacheArchives option in hadoop streaming.
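
The failure described above comes from treating the whole argument, "#testlink3" included, as the file name. A minimal sketch of the idea of separating the alias from the path before any existence check follows. It is illustrative only: the class and method names are hypothetical, the port 8020 stands in for the "fs_port" placeholder, and it is neither the code in GenericOptionsParser nor the change made under MAPREDUCE-787.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative sketch only; hypothetical names, not Hadoop code.
class ArchiveUriSketch {

    // Returns the argument without its "#alias" fragment, i.e. the part
    // that would be checked for existence on the file system.
    static String stripFragment(String spec) {
        URI uri = URI.create(spec);
        try {
            // Rebuild the URI from scheme and scheme-specific part, dropping the fragment.
            return new URI(uri.getScheme(), uri.getSchemeSpecificPart(), null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the symlink alias after '#', or null when none was given.
    static String linkName(String spec) {
        return URI.create(spec).getFragment();
    }

    public static void main(String[] args) {
        // Port 8020 is an assumed value replacing the "fs_port" placeholder.
        String spec = "hdfs://host:8020/user/testfile.jar#testlink3";
        System.out.println("validate: " + stripFragment(spec));
        System.out.println("symlink:  " + linkName(spec));
    }
}
```

With the two parts separated, only the fragment-free URI would be passed to a file-existence check, and the fragment would be kept as the symlink name.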

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.