hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: How to make a MapReduce job with no input?
Date Fri, 01 Mar 2013 04:15:23 GMT
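The default number of map tasks is 2 (mapred.map.tasks in
mapred-default.xml), which explains your 2-map run even for one line
of text. The old-API FileInputFormat treats that number as a hint when
sizing splits, so your 4-byte file is cut into two 2-byte splits.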
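
To see where the two 2-byte splits come from, here is a rough sketch of
the split arithmetic as the old-API FileInputFormat in Hadoop 1.x appears
to perform it. It is plain Java with no Hadoop dependency; the class
SplitSketch, its method names, and the 64 MB block size are assumptions
made for illustration, not Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: it mirrors the split-size
// arithmetic of the old-API FileInputFormat as understood from Hadoop 1.x.
final class SplitSketch {
    // A final split may run up to 10% past splitSize before another is cut.
    static final double SPLIT_SLOP = 1.1;

    // numSplits is the map-task hint (mapred.map.tasks, default 2).
    static long splitSize(long totalSize, int numSplits, long minSize, long blockSize) {
        long goalSize = totalSize / (numSplits == 0 ? 1 : numSplits);
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Returns {offset, length} pairs for a single input file.
    static List<long[]> splits(long fileLength, int numSplits, long minSize, long blockSize) {
        long size = splitSize(fileLength, numSplits, minSize, blockSize);
        List<long[]> out = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / size > SPLIT_SLOP) {
            out.add(new long[] {fileLength - remaining, size});
            remaining -= size;
        }
        if (remaining != 0) {
            out.add(new long[] {fileLength - remaining, remaining});
        }
        return out;
    }

    public static void main(String[] args) {
        long block = 64L * 1024 * 1024; // assumed block size
        // "one\n" is 4 bytes; compare the default hint (2) with a hint of 1.
        for (int hint : new int[] {2, 1}) {
            StringBuilder sb = new StringBuilder("hint=" + hint + ":");
            for (long[] s : splits(4, hint, 1, block)) {
                sb.append(" [offset=").append(s[0])
                  .append(", length=").append(s[1]).append("]");
            }
            System.out.println(sb);
        }
    }
}
```

Under these assumptions a hint of 2 gives a goal size of 4 / 2 = 2 bytes,
hence two splits, while a hint of 1 gives a single 4-byte split. So
lowering the hint, for example with JobConf.setNumMapTasks(1), should
yield one map task for a file this small. It remains only a hint: the
InputFormat decides the actual number of splits.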

For running with no inputs, take a look at Sleep Job's EmptySplits
technique on trunk:
(~line 70)

On Fri, Mar 1, 2013 at 2:46 AM, Mike Spreitzer <mspreitz@us.ibm.com> wrote:
> I am using the mapred API of Hadoop 1.0.  I want to make a job that does not
> really depend on any input (the job conf supplies all the info needed in
> Mapper).  What is a good way to do this?
> What I have done so far is write a job in which MyMapper.configure(..) reads
> all the real input from the JobConf, and MyMapper.map(..) ignores the given
> key and value, writing the output implied by the JobConf.  I set the
> InputFormat to TextInputFormat and the input paths to be a list of one
> filename; the named file contains one line of text (the word "one"),
> terminated by a newline.  When I run this job (on Linux, hadoop-1.0.0), I
> find it has two map tasks --- one reads the first two bytes of my non-input
> file, and the other reads the last two bytes of my non-input file!  How can I
> make a job with just one map task?
> Thanks,
> Mike

Harsh J
