hadoop-mapreduce-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-2028) streaming should support MultiFileInputFormat
Date Wed, 01 Dec 2010 21:15:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965849#action_12965849 ]

Allen Wittenauer commented on MAPREDUCE-2028:
---------------------------------------------

Actually, what should probably happen is that MultiFileWordCount's "MyInputFormat" and "MultiFileLineRecordReader"
should get promoted out of examples and officially into the mapred(uce) APIs.

The following appears to implement exactly what we streaming users want and need:

$HADOOP_HOME/bin/hadoop  \
        jar \
        `ls $HADOOP_HOME/contrib/streaming/hadoop-*-streaming.jar` \
        -libjars `ls $HADOOP_HOME/hadoop-*-examples.jar` \
        -inputformat org.apache.hadoop.examples.MultiFileWordCount\$MyInputFormat \
        -inputreader org.apache.hadoop.examples.MultiFileWordCount\$MultiFileLineRecordReader \
        ....
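
For illustration only, here is a dry-run sketch of how the elided options might be filled in. It prints the command rather than running it, so it can be inspected without a cluster. The function name streaming_cmd, the /opt/hadoop fallback, the input and output paths, and the /bin/cat mapper are all hypothetical placeholders; only the jar locations and class names come from the command above, and -input, -output and -mapper are the usual streaming options.

```shell
#!/bin/sh
# Dry-run sketch: print a streaming invocation instead of executing it.
# Input/output paths and the mapper are hypothetical placeholders.
streaming_cmd() {
    hadoop_home=$1   # Hadoop install directory
    input=$2         # hypothetical HDFS input directory
    output=$3        # hypothetical HDFS output directory
    mapper=$4        # hypothetical mapper executable

    examples='org.apache.hadoop.examples.MultiFileWordCount'

    # Class names are printed inside single quotes so the "$" in the
    # nested class name survives if the line is pasted into a shell.
    printf '%s ' \
        "$hadoop_home/bin/hadoop" jar \
        "$hadoop_home/contrib/streaming/hadoop-*-streaming.jar" \
        -libjars "$hadoop_home/hadoop-*-examples.jar" \
        -inputformat "'$examples\$MyInputFormat'" \
        -inputreader "'$examples\$MultiFileLineRecordReader'" \
        -input "$input" -output "$output" -mapper "$mapper"
    printf '\n'
}

# Example call with assumed paths; adjust for a real cluster.
streaming_cmd "${HADOOP_HOME:-/opt/hadoop}" /data/in /data/out /bin/cat
```

Printing the command first makes it easy to check the quoting of the nested class names before submitting a real job.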


> streaming should support MultiFileInputFormat
> ---------------------------------------------
>
>                 Key: MAPREDUCE-2028
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2028
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.2
>            Reporter: Allen Wittenauer
>             Fix For: 0.21.1, 0.22.0
>
>
> There should be a way to call MultiFileInputFormat from streaming without having to write Java code...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

