hadoop-common-dev mailing list archives

From "Michel Tourn (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-191) add hadoopStreaming to src/contrib
Date Tue, 16 May 2006 20:58:08 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-191?page=all ]

Michel Tourn updated HADOOP-191:
--------------------------------

    Attachment: streaming.3.patch

This patch depends on the LargeUTF8 patch:  http://issues.apache.org/jira/browse/HADOOP-136


Added a few more configurable options.

michel@cdev2004> bin/hadoop jar build/hadoop-streaming.jar -info
Usage: $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar [options]
Options:
  -input    <path>     DFS input file(s) for the Map step
  -output   <path>     DFS output directory for the Reduce step
  -mapper   <cmd>      The streaming command to run
  -combiner <cmd>      Not implemented. But you can pipe the mapper output
  -reducer  <cmd>      The streaming command to run
  -file     <file>     File/dir to be shipped in the Job jar file
  -cluster  <name>     Default uses hadoop-default.xml and hadoop-site.xml
  -config   <file>     Optional. One or more paths to xml config files
  -dfs      <h:p>      Optional. Override DFS configuration
  -jt       <h:p>      Optional. Override JobTracker configuration
  -inputreader <spec>  Optional.
  -jobconf  <n>=<v>    Optional.
  -cmdenv   <n>=<v>    Optional. Pass env.var to streaming commands
  -verbose

For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
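
As a rough illustration of how the options above might fit together, here is a dry-run sketch that only prints a command line and does not run Hadoop. The flags are taken from the usage text; the helper name streaming_cmd, the example paths, and the choice of cat and uniq as commands are assumptions made for this example. The message does not say how arguments to a mapper or reducer command should be quoted, so the sketch sticks to argument-free commands.

```shell
#!/bin/sh
# Dry-run sketch: prints a streaming command line instead of executing it.
# The flags come from the usage text; the paths are hypothetical.
streaming_cmd() {
  # $1 = DFS input path, $2 = DFS output directory (made-up examples)
  printf '%s %s %s %s %s\n' \
    'bin/hadoop jar build/hadoop-streaming.jar' \
    "-input $1" "-output $2" \
    '-mapper cat' '-reducer uniq'
}

streaming_cmd /user/example/in /user/example/out
```

Printing the command first makes it easy to review the option list before submitting a real job.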


> add hadoopStreaming to src/contrib
> ----------------------------------
>
>          Key: HADOOP-191
>          URL: http://issues.apache.org/jira/browse/HADOOP-191
>      Project: Hadoop
>         Type: New Feature

>     Reporter: Michel Tourn
>     Assignee: Doug Cutting
>      Fix For: 0.2
>  Attachments: streaming.2.patch, streaming.3.patch, streaming.patch
>
> This is a patch that adds a src/contrib/hadoopStreaming directory to the source tree.
> hadoopStreaming is a bridge to run non-Java code as Map/Reduce tasks.
> The unit test TestStreaming runs the Unix tools tr (as Map) and uniq (as Reduce).
> To test the patch:
> Merge the patch.
> The only existing file that is modified is trunk/build.xml
> trunk>ant deploy-contrib
> trunk>bin/hadoopStreaming : should show usage message
> trunk>ant test-contrib    : should run one test successfully
> To add src/contrib/someOtherProject:
> edit src/contrib/build.xml
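
The tr-as-Map and uniq-as-Reduce idea from the description can be tried locally with an ordinary pipeline, with no Hadoop involved. This is only a sketch of the data flow: the function name simulate_stream, the tr arguments, and the use of sort as a stand-in for the framework's grouping between Map and Reduce are assumptions for illustration, not what TestStreaming actually runs.

```shell
#!/bin/sh
# Local stand-in for the map -> sort -> reduce flow; no Hadoop involved.
# The tr arguments and the sort step are assumptions for illustration.
simulate_stream() {
  tr ' ' '\n' |   # "map": emit one word per line
  sort |          # stand-in for grouping between Map and Reduce
  uniq            # "reduce": collapse adjacent duplicate lines
}

printf 'b a b\na c\n' | simulate_stream
```

Because both steps read standard input and write standard output, the same commands can be checked on a small sample file before being handed to the streaming jar.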

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

