hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Mapper Only Job, Without Input or Output Path
Date Thu, 15 Mar 2012 21:59:30 GMT
Deepak,

You can read the SleepJob example implementation, which does exactly this.

It can be found in a source tarball at:
$HADOOP_HOME/src/examples/org/apache/hadoop/examples/SleepJob.java

It is also available online at:
http://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.1/src/examples/org/apache/hadoop/examples/SleepJob.java

To run it, follow the instructions you get when you run:

$ hadoop jar $HADOOP_HOME/hadoop-examples.jar sleep
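
For reference, here is a minimal sketch of the same pattern SleepJob uses: a custom InputFormat that hands out a configurable number of empty splits (one map task per split), no reducers, and NullOutputFormat so no output path is needed. This is not the SleepJob code itself. The class names, the "noinput.num.maps" key, and the empty map() body are placeholders I made up for illustration, and it assumes the new (org.apache.hadoop.mapreduce) API with the Hadoop jars on the classpath.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class NoInputJob {
  // Hypothetical configuration key holding the desired number of map tasks.
  static final String NUM_MAPS = "noinput.num.maps";

  // A split that carries no data; it exists only so that a map task is launched.
  public static class EmptySplit extends InputSplit implements Writable {
    @Override public long getLength() { return 0L; }
    @Override public String[] getLocations() { return new String[0]; }
    @Override public void write(DataOutput out) throws IOException { }
    @Override public void readFields(DataInput in) throws IOException { }
  }

  // Produces NUM_MAPS empty splits; each split yields one null key/value record.
  public static class DummyInputFormat extends InputFormat<NullWritable, NullWritable> {
    @Override
    public List<InputSplit> getSplits(JobContext context) {
      int n = context.getConfiguration().getInt(NUM_MAPS, 1);
      List<InputSplit> splits = new ArrayList<InputSplit>();
      for (int i = 0; i < n; i++) {
        splits.add(new EmptySplit());
      }
      return splits;
    }

    @Override
    public RecordReader<NullWritable, NullWritable> createRecordReader(
        InputSplit split, TaskAttemptContext context) {
      return new RecordReader<NullWritable, NullWritable>() {
        private boolean done = false;
        @Override public void initialize(InputSplit s, TaskAttemptContext c) { }
        // Return exactly one record so that map() is called once per task.
        @Override public boolean nextKeyValue() {
          if (done) return false;
          done = true;
          return true;
        }
        @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
        @Override public NullWritable getCurrentValue() { return NullWritable.get(); }
        @Override public float getProgress() { return done ? 1.0f : 0.0f; }
        @Override public void close() { }
      };
    }
  }

  // Placeholder mapper: the local-file read, transformation, and HDFS write go here.
  public static class LocalFileMapper
      extends Mapper<NullWritable, NullWritable, NullWritable, NullWritable> {
    @Override
    protected void map(NullWritable key, NullWritable value, Context context)
        throws IOException, InterruptedException {
      // Intentionally empty in this sketch.
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt(NUM_MAPS, Integer.parseInt(args[0])); // number of mappers from argv
    Job job = new Job(conf, "no-input mapper-only job");
    job.setJarByClass(NoInputJob.class);
    job.setInputFormatClass(DummyInputFormat.class);
    job.setMapperClass(LocalFileMapper.class);
    job.setNumReduceTasks(0);                         // mapper only
    job.setOutputFormatClass(NullOutputFormat.class); // no output path required
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(NullWritable.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that the number of splits only controls how many map tasks run. It does not by itself guarantee one task per node, since task placement is up to the scheduler and getLocations() is at most a locality hint.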

On Thu, Mar 15, 2012 at 10:27 PM, Deepak Nettem <deepaknettem@gmail.com> wrote:
> Hi,
>
> I have a use case - I have files lying on the local disk of every node on
> my cluster. I want to write a Mapper only MapReduce job that reads the file
> off the local disk on every machine, applies some transformation and writes
> to HDFS.
>
> Specifically,
>
> 1. The Job shouldn't have any input/output paths, and should use null key/value pairs.
> 2. Mapper Only
> 3. I want to be able to control the number of Mappers, depending on the
> size of my cluster.
>
> What's the best way to do this? I would appreciate any example code.
>
> Deepak



-- 
Harsh J
