hadoop-common-issues mailing list archives

From "Dieter Plaetinck (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-7220) documentation lists options in wrong order
Date Tue, 12 Apr 2011 08:56:05 GMT
documentation lists options in wrong order

                 Key: HADOOP-7220
                 URL: https://issues.apache.org/jira/browse/HADOOP-7220
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Dieter Plaetinck
            Priority: Minor

On http://hadoop.apache.org/common/docs/r0.20.2/streaming.html, various examples use -D flags.

I noticed that if you invoke hadoop this way, with the -D flags after the streaming options, it doesn't work:

dplaetin@n-0:/usr/local/hadoop/bin$ ./hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
    -file /proj/Search/wall/experiment/ \
    -mapper './build-models.py --mapper' \
    -reducer './build-models.py --reducer' \
    -input sim-input -output sim-output \
    -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
    -D mapred.text.key.comparator.options=-k1,2n
11/04/12 10:39:28 ERROR streaming.StreamJob: Unrecognized option: -D

Usage: $HADOOP_HOME/bin/hadoop jar \
          $HADOOP_HOME/hadoop-streaming.jar [options]
  -input    <path>     DFS input file(s) for the Map step
  -output   <path>     DFS output directory for the Reduce step
  -mapper   <cmd|JavaClassName>      The streaming command to run
  -combiner <JavaClassName> Combiner has to be a Java class
  -reducer  <cmd|JavaClassName>      The streaming command to run
  -file     <file>     File/dir to be shipped in the Job jar file
  -inputformat TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName Optional.
  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
  -partitioner JavaClassName  Optional.
  -numReduceTasks <num>  Optional.
  -inputreader <spec>  Optional.
  -cmdenv   <n>=<v>    Optional. Pass env.var to streaming commands
  -mapdebug <path>  Optional. To run this script when a map task fails 
  -reducedebug <path>  Optional. To run this script when a reduce task fails 

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info

Streaming Job Failed!

I could only make it work by moving the -D flags to the front (right after the streaming.jar
part). Presumably, because -D is a generic option, it has to come before the streaming-specific options.
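
For reference, a sketch of the reordered invocation, following the "[genericOptions] [commandOptions]" syntax from the usage text above. The paths and property values are copied from the failing command; the streaming_jar variable name is only for illustration. The snippet just assembles and prints the command (a dry run), so it can be checked without submitting a job.

```shell
# Jar path as used in the failing command above.
streaming_jar=/usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar

# Generic options (-D) go right after the jar, before any streaming option.
set -- ./hadoop jar "$streaming_jar" \
    -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
    -D mapred.text.key.comparator.options=-k1,2n \
    -file /proj/Search/wall/experiment/ \
    -mapper './build-models.py --mapper' \
    -reducer './build-models.py --reducer' \
    -input sim-input \
    -output sim-output

# Dry run: print the assembled command line instead of executing it.
printf '%s\n' "$*"

# To actually submit the job from /usr/local/hadoop/bin, run: "$@"
```

The only change from the failing command is the position of the two -D arguments; everything else is left as it was.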
