hadoop-common-dev mailing list archives

From Steve Gao <steve....@yahoo.com>
Subject [Help needed] Is there a way to know the input filename at Hadoop Streaming?
Date Thu, 23 Oct 2008 17:48:11 GMT
Sorry for the email. Thanks for any help or hint.

    I am using Hadoop Streaming. The input consists of multiple files.
    Is there a way to get the current filename in the mapper?

    For example:
    $HADOOP_HOME/bin/hadoop  \
    jar $HADOOP_HOME/hadoop-streaming.jar \
        -input file1 \
        -input file2 \
        -output myOutputDir \
        -mapper mapper \
        -reducer reducer

    In mapper:
    while (<STDIN>) {
      # How can I tell whether the current line is from file1 or file2?
    }
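
One possible approach, offered as a sketch rather than a confirmed answer: Hadoop Streaming passes job configuration values to the mapper process as environment variables, with the dots in each name replaced by underscores. Under that assumption, the name of the file behind the current input split should be readable from `map_input_file` (older releases) or `mapreduce_map_input_file` (newer releases). The function names `current_input_file` and `tag_lines` and the `unknown` fallback are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Sketch of a Streaming mapper helper that tags each line with its source file.
# Assumption: Streaming exports job configuration as environment variables,
# so map.input.file is visible as map_input_file (older releases) or
# mapreduce_map_input_file (newer releases).

# Print the current input file name, or "unknown" if neither variable is set.
current_input_file() {
    printf '%s\n' "${mapreduce_map_input_file:-${map_input_file:-unknown}}"
}

# Read lines from stdin and emit "<input file>TAB<line>" for each one.
tag_lines() {
    src=$(current_input_file)
    while IFS= read -r line; do
        printf '%s\t%s\n' "$src" "$line"
    done
}
```

To use this as the mapper, the script would end with a call to `tag_lines`. In a Perl mapper like the one above, the same value should be available through `%ENV`, for example `$ENV{map_input_file}`, under the same assumption.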
