hadoop-mapreduce-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: i can't get the file name in map program
Date Fri, 01 Apr 2011 19:52:00 GMT
Hello,

(Inline reply.)

On Fri, Apr 1, 2011 at 8:35 PM, ranjith k <ranjith42k@gmail.com> wrote:
> Hello.
> I am new to Hadoop MapReduce programming. I need to write a MapReduce
> program. I have an input folder containing 10 documents in text
> format. My aim is to write a MapReduce program that reads each text file and
> produces a separate word count for each file. My input split is each
> line, so the map function is called for each line of text. But I need the file
> name in the map function. How can I get the file name in my map function?

This is covered in the docs, in the Map/Reduce Tutorial itself.
Have a look at the table right below the section this link points to:
http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html#Task+JVM+Reuse
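
The relevant entry in that table is map.input.file, which holds the path of
the file the current map task is reading. With the 0.20 org.apache.hadoop.mapred
API, a mapper would typically read it in configure(JobConf) via
job.get("map.input.file"). A rough sketch of what to do with the value follows:
strip it down to the file name and prefix each word with it, so the counts stay
separate per file. This is plain Java so that it runs without Hadoop on the
classpath; the class and method names are made up for illustration, and the
Hadoop call appears only in a comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Hadoop. */
public class FileTaggedKeys {

    // In an old-API (org.apache.hadoop.mapred) Mapper the path would come from:
    //   public void configure(JobConf job) { inputFile = job.get("map.input.file"); }
    // Here it is passed in as a plain String instead.

    /** Returns the last path component, e.g. "doc1.txt" for "hdfs://nn/in/doc1.txt". */
    public static String baseName(String mapInputFile) {
        int slash = mapInputFile.lastIndexOf('/');
        return slash < 0 ? mapInputFile : mapInputFile.substring(slash + 1);
    }

    /** Builds one "fileName<TAB>word" key per whitespace-separated token in the line. */
    public static List<String> keysFor(String mapInputFile, String line) {
        String file = baseName(mapInputFile);
        List<String> keys = new ArrayList<String>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                keys.add(file + "\t" + word);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        // In the real mapper, each key would be emitted with a count of 1
        // and the reducer would sum the counts per key.
        for (String key : keysFor("hdfs://namenode/input/doc1.txt", "hello hadoop hello")) {
            System.out.println(key + "\t1");
        }
    }
}
```

Since the file name is part of the key, an ordinary summing reducer then
produces one count per (file, word) pair without further changes.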

> Similarly, I need to write the output for each file separately. Is that
> possible?

You can get some control over output file naming using the
MultipleOutputs class.
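
One thing to watch for, as far as I recall: in 0.20 the class lives in
org.apache.hadoop.mapred.lib, and named outputs may contain only letters and
digits, so a file name like doc1.txt cannot be used as-is. Named outputs also
have to be declared in the driver before the job starts, which means the input
file names need to be known up front. A small plain-Java sketch of deriving a
usable name follows; the helper is hypothetical, and the MultipleOutputs calls
are shown only as comments so that the sketch runs without Hadoop.

```java
/** Hypothetical helper for illustration; not part of Hadoop. */
public class NamedOutputs {

    // Assumed usage with org.apache.hadoop.mapred.lib.MultipleOutputs (old API):
    //   driver:      MultipleOutputs.addNamedOutput(conf, name,
    //                    TextOutputFormat.class, Text.class, IntWritable.class);
    //   configure(): mos = new MultipleOutputs(conf);
    //   reduce():    mos.getCollector(name, reporter).collect(word, count);
    //   close():     mos.close();

    /** Drops every character that is not an ASCII letter or digit. */
    public static String namedOutputFor(String fileName) {
        StringBuilder sb = new StringBuilder();
        for (char c : fileName.toCharArray()) {
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);
            }
        }
        if (sb.length() == 0) {
            throw new IllegalArgumentException("no usable characters in: " + fileName);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(namedOutputFor("doc1.txt"));
    }
}
```

Note that two different file names can collapse to the same named output
(for example a.txt and a-txt), so check for collisions if your inputs are
named that way.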

> My Hadoop version is 0.20.2.

-- 
Harsh J
http://harshj.com
