hadoop-common-user mailing list archives

From Runping Qi <runp...@yahoo-inc.com>
Subject Re: Is there a way to know the input filename at Hadoop Streaming?
Date Sun, 26 Oct 2008 21:18:45 GMT

Each mapper works on only one file split, which is either from file1 or
file2 in your case. So the value for map.input.file gives you the exact
information you need.
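
As a rough sketch of how a streaming mapper might read that value: Hadoop
Streaming passes jobconf properties to the mapper process as environment
variables, with non-alphanumeric characters replaced by underscores, so
map.input.file shows up as map_input_file (newer Hadoop releases name the
property mapreduce.map.input.file instead). The helper names input_file_name
and tag_lines, and the choice to prefix each line with the file's base name,
are illustrative assumptions and not part of the original exchange.

```python
import os
import sys


def input_file_name(environ=os.environ):
    """Return the input file path for this map task, or None if it is not set."""
    # Streaming exports jobconf keys with '.' replaced by '_'.
    # The second key is the name used by newer Hadoop releases.
    for key in ("map_input_file", "mapreduce_map_input_file"):
        value = environ.get(key)
        if value:
            return value
    return None


def tag_lines(lines, path):
    """Yield each input line prefixed with the base name of its source file."""
    base = os.path.basename(path) if path else "unknown"
    for line in lines:
        yield base + "\t" + line.rstrip("\n")


if __name__ == "__main__":
    # Every line this mapper sees comes from one split, hence from one file,
    # so the file name only needs to be looked up once per task.
    for tagged in tag_lines(sys.stdin, input_file_name()):
        print(tagged)
```

Since a map task never mixes lines from file1 and file2, the lookup happens
once at startup rather than per line.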


Runping
 


On 10/23/08 11:09 AM, "Steve Gao" <steve.gao@yahoo.com> wrote:

> Thanks, Amogh. But my case is slightly different. The command line inputs are
> 2 files: file1 and file2. I need to tell in the mapper which line is from
> which file:
> #In mapper
> while (<STDIN>){
>   # how to tell whether the current line is from file1 or file2?
> }
> 
> The jobconf's map.input.file param does not help in this case
> because file1 and file2 are both inputs.
> 
> -Steve
> 
> --- On Thu, 10/23/08, Amogh Vasekar <vasekar@yahoo-inc.com> wrote:
> From: Amogh Vasekar <vasekar@yahoo-inc.com>
> Subject: RE: Is there a way to know the input filename at Hadoop Streaming?
> To: steve.gao@yahoo.com
> Date: Thursday, October 23, 2008, 12:11 AM
> 
> I haven't personally worked with streaming, but I guess the jobconf's
> map.input.file param should do it for you.
> -----Original Message-----
> From: Steve Gao [mailto:steve.gao@yahoo.com]
> Sent: Thursday, October 23, 2008 7:26 AM
> To: core-user@hadoop.apache.org
> Cc: core-dev@hadoop.apache.org
> Subject: Is there a way to know the input filename at Hadoop Streaming?
> 
> I am using Hadoop Streaming. The input consists of multiple files.
> Is there a way to get the current filename in mapper?
> 
> For example:
> $HADOOP_HOME/bin/hadoop  \
> jar $HADOOP_HOME/hadoop-streaming.jar \
>     -input file1 \
>     -input file2 \
>     -output myOutputDir \
>     -mapper mapper \
>     -reducer reducer
> 
> In mapper:
> while (<STDIN>){
>   # how to tell whether the current line is from file1 or file2?
> }

