hadoop-hdfs-user mailing list archives

From Mike Spreitzer <mspre...@us.ibm.com>
Subject Re: How to make a MapReduce job with no input?
Date Thu, 28 Feb 2013 21:25:04 GMT
On closer inspection, I see that of my two tasks, the first processes 1
input record and the second processes 0 input records.  So I think this
solution is correct.  But perhaps it is not the most direct way to get the
job done?
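One more direct approach (a sketch, not something in Hadoop itself) is to skip the dummy input file entirely and give the job a custom InputFormat that produces exactly one split holding exactly one record. The class and names below (OneRecordInputFormat, EmptySplit) are hypothetical; only the org.apache.hadoop.mapred interfaces they implement come from Hadoop 1.0:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Hypothetical helper, not part of Hadoop: an InputFormat that yields
// exactly one split containing exactly one dummy record, so the job
// runs exactly one map task with no real input.
public class OneRecordInputFormat
    implements InputFormat<NullWritable, NullWritable> {

  // A split that carries no data; it exists only so one map task is scheduled.
  public static class EmptySplit implements InputSplit {
    public long getLength() { return 0; }
    public String[] getLocations() { return new String[0]; }
    public void readFields(DataInput in) throws IOException { }
    public void write(DataOutput out) throws IOException { }
  }

  public InputSplit[] getSplits(JobConf conf, int numSplits) {
    return new InputSplit[] { new EmptySplit() };  // ignore the numSplits hint
  }

  public RecordReader<NullWritable, NullWritable> getRecordReader(
      InputSplit split, JobConf conf, Reporter reporter) {
    return new RecordReader<NullWritable, NullWritable>() {
      private boolean done = false;

      // Hand the mapper a single dummy record, then report end-of-input.
      public boolean next(NullWritable key, NullWritable value) {
        if (done) return false;
        done = true;
        return true;
      }
      public NullWritable createKey() { return NullWritable.get(); }
      public NullWritable createValue() { return NullWritable.get(); }
      public long getPos() { return 0; }
      public float getProgress() { return done ? 1.0f : 0.0f; }
      public void close() { }
    };
  }
}
```

With this you would call conf.setInputFormat(OneRecordInputFormat.class) and set no input paths at all; MyMapper.map(..) still reads its real input from the JobConf as before.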

From:   Mike Spreitzer/Watson/IBM@IBMUS
To:     user@hadoop.apache.org
Date:   02/28/2013 04:18 PM
Subject:        How to make a MapReduce job with no input?

I am using the mapred API of Hadoop 1.0.  I want to make a job that does 
not really depend on any input (the job conf supplies all the info needed 
by the Mapper).  What is a good way to do this? 

What I have done so far is write a job in which MyMapper.configure(..) 
reads all the real input from the JobConf, and MyMapper.map(..) ignores 
the given key and value, writing the output implied by the JobConf.  I set 
the InputFormat to TextInputFormat and the input paths to be a list of one 
filename; the named file contains one line of text (the word "one"), 
terminated by a newline.  When I run this job (on Linux, hadoop-1.0.0), I 
find it has two map tasks: one reads the first two bytes of my 
non-input file, and the other reads the last two bytes of it! 
How can I make a job with just one map task? 
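The two-byte splits are likely because FileInputFormat's getSplits divides the file to meet the job's requested map count (mapred.map.tasks defaults to 2), so a 4-byte file becomes two 2-byte splits. If the dummy-file approach is kept, one possible fix is to hint a single map task; the class name and the path "dummy-input.txt" below are placeholders for illustration:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

// Hypothetical setup class, just to show the relevant JobConf calls.
public class SingleMapConf {
  public static JobConf build() {
    JobConf conf = new JobConf();
    conf.setInputFormat(TextInputFormat.class);
    FileInputFormat.setInputPaths(conf, new Path("dummy-input.txt"));
    // setNumMapTasks is only a hint to the framework, but
    // FileInputFormat.getSplits uses it as the goal when computing
    // split sizes, so one tiny file should stay in a single split.
    conf.setNumMapTasks(1);
    return conf;
  }
}
```

Another option along the same lines is to subclass TextInputFormat and override isSplitable(..) to return false, which forces the whole file into one split regardless of the map-count hint.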

