hadoop-mapreduce-issues mailing list archives

From "Devaraj K (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-4076) Stream job fails with ZipException when use yarn jar command
Date Thu, 29 Mar 2012 12:22:28 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated MAPREDUCE-4076:

    Status: Patch Available  (was: Open)

When the 'yarn jar' command is used, RunJar.java creates the temp directory, if it does not
already exist, from the configuration property "hadoop.tmp.dir". The value it reads from the
conf object is ${hadoop.home.dir}/hadoop-${user.name}. These vars are not replaced with
system properties because the 'hadoop.home.dir' system property is not available, so the temp
dir is created in the current dir under that literal name
(i.e. ${hadoop.home.dir}/hadoop-${user.name}).
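
The substitution behaviour described above can be sketched in plain Java. This is a
simplified stand-in, not Hadoop's actual Configuration code: the class name, the method
name, the regular expression and the pass limit are all assumptions made for illustration.
It assumes that expansion stops at the first placeholder with no matching system property,
which would be consistent with this report, where ${user.name} is also left unexpanded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; all names here are hypothetical.
final class VarSubstitutionSketch {
    // Matches one ${name} placeholder.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Expands ${name} placeholders from system properties, left to right.
    // Assumption: stops at the first placeholder that has no matching
    // property and returns the string as it stands at that point.
    static String substitute(String value) {
        String current = value;
        // Arbitrary cap on the number of passes, chosen for this sketch.
        for (int pass = 0; pass < 20; pass++) {
            Matcher m = VAR.matcher(current);
            if (!m.find()) {
                return current; // nothing left to expand
            }
            String resolved = System.getProperty(m.group(1));
            if (resolved == null) {
                return current; // unbound variable: the rest stays literal
            }
            current = current.substring(0, m.start()) + resolved
                    + current.substring(m.end());
        }
        return current;
    }

    public static void main(String[] args) {
        String raw = "${hadoop.home.dir}/hadoop-${user.name}";

        // Property missing, as reported for 'yarn jar'.
        System.clearProperty("hadoop.home.dir");
        System.out.println(substitute(raw));

        // Property present, as reported for 'hadoop jar'.
        // "/opt/hadoop" is a made-up value.
        System.setProperty("hadoop.home.dir", "/opt/hadoop");
        System.out.println(substitute(raw));
    }
}
```

With the property missing, the sketch returns the input unchanged, so a directory created
from that value would carry the literal ${...} name.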

StreamJob unjars the classes into the directory current-dir/${hadoop.home.dir}/hadoop-${user.name},
and then looks up "org/apache/hadoop/streaming/StreamJob.class" in the classpath. Because of
the special chars in the directory name, the path it gets back is
current-dir/$%7Bhadoop.home.dir%7D/hadoop-$%7Buser.name%7D/hadoop-unjar8421477351848586067/.
Merging from this path into the job jar file then fails.
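
The %7B and %7D in that path are the percent-encoded forms of '{' and '}', which is what
the JDK produces when a file path is turned into a URI. The sketch below reproduces the
encoding with java.io.File and java.net.URI only. It does not use StreamJob's own lookup
code, and the class and method names are hypothetical.

```java
import java.io.File;
import java.net.URI;

// Illustrative sketch only; names are hypothetical and not taken from Hadoop.
final class EncodedPathSketch {
    // Returns the path component as it appears in the file: URI.
    // Characters that are illegal in a URI, such as '{' and '}', are
    // percent-encoded here.
    static String rawUriPath(File dir) {
        return dir.toURI().getRawPath();
    }

    // Converts the URI back into a File, decoding escapes such as %7B.
    static File decoded(URI uri) {
        return new File(uri);
    }

    public static void main(String[] args) {
        // Relative directory with the literal, unexpanded name from the report.
        File dir = new File("${hadoop.home.dir}/hadoop-${user.name}");
        URI uri = dir.toURI();

        // Encoded form: a different string from the real directory name.
        System.out.println(rawUriPath(dir));

        // Decoded form: the directory name as it exists on disk.
        System.out.println(decoded(uri).getPath());
    }
}
```

If the encoded string is used directly as a filesystem path, it names a directory that
does not exist. That would be consistent with the merge finding no entries, although the
report does not spell out that step.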

With 'hadoop jar', the property is read as $HADOOP_HOME/hadoop-username, because the vars
are replaced with the 'hadoop.home.dir' and 'user.name' properties. The temp dir is created
properly, the same dir is used for the later steps, and the job works fine.

I have attached a patch that addresses the above problem by adding the hadoop.home.dir
system property in the yarn file.
> Stream job fails with ZipException when use yarn jar command
> ------------------------------------------------------------
>                 Key: MAPREDUCE-4076
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4076
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.1
>            Reporter: Devaraj K
>            Assignee: Devaraj K
>            Priority: Critical
>         Attachments: MAPREDUCE-4076.patch
> Stream job fails with ZipException when the yarn jar command is used, and executes
> successfully with the hadoop jar command.
> {code:xml}
> linux-f330:/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin # ./yarn jar ../share/hadoop/tools/lib/hadoop-streaming-0.24.0-SNAPSHOT.jar -input /hadoop -output /test/output/1 -mapper cat -reducer wc
> packageJobJar: [] [/home/devaraj/hadoop/trunk/hadoop-0.24.0-SNAPSHOT/bin/$%7Bhadoop.home.dir%7D/hadoop-$%7Buser.name%7D/hadoop-unjar4241129353499211360/] /tmp/streamjob7683981905208294893.jar tmpDir=null
> Exception in thread "main" java.io.IOException: java.util.zip.ZipException: ZIP file must have at least one entry
>         at org.apache.hadoop.streaming.JarBuilder.merge(JarBuilder.java:82)
>         at org.apache.hadoop.streaming.StreamJob.packageJobJar(StreamJob.java:707)
>         at org.apache.hadoop.streaming.StreamJob.setJobConf(StreamJob.java:948)
>         at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:127)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>         at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:200)
> {code}

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira