hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Mapper Only Job, Without Input or Output Path
Date Thu, 15 Mar 2012 21:59:30 GMT

You may read the SleepJob example implementation which does exactly this.

It can be found inside a source tar-ball's

Or available online at

To run it, follow the instructions you get when you run:

$ hadoop jar $HADOOP_HOME/hadoop-examples.jar sleep
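
For reference, here is a minimal sketch of the general pattern (not the SleepJob source itself): a custom InputFormat hands out empty splits, so the number of splits sets the number of map tasks, and NullOutputFormat removes the need for an output path. It assumes the new org.apache.hadoop.mapreduce API and a Hadoop version that provides Job.getInstance (on older releases, new Job(conf, name) plays the same role). The names NoInputJob, EmptySplit, DummyInputFormat, LocalFileMapper and the config key example.num.mappers are hypothetical, chosen only for illustration.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class NoInputJob {

  // A split that carries no data; it exists only so that one map task is scheduled.
  public static class EmptySplit extends InputSplit implements Writable {
    @Override public long getLength() { return 0L; }
    @Override public String[] getLocations() { return new String[0]; }
    @Override public void write(DataOutput out) throws IOException { }
    @Override public void readFields(DataInput in) throws IOException { }
  }

  // Produces N empty splits, where N comes from the (hypothetical) key example.num.mappers.
  public static class DummyInputFormat extends InputFormat<NullWritable, NullWritable> {
    static List<InputSplit> makeSplits(int n) {
      List<InputSplit> splits = new ArrayList<InputSplit>();
      for (int i = 0; i < n; i++) {
        splits.add(new EmptySplit());
      }
      return splits;
    }

    @Override
    public List<InputSplit> getSplits(JobContext context) {
      return makeSplits(context.getConfiguration().getInt("example.num.mappers", 1));
    }

    @Override
    public RecordReader<NullWritable, NullWritable> createRecordReader(
        InputSplit split, TaskAttemptContext context) {
      // Emits exactly one (null, null) record so that map() is called once per task.
      return new RecordReader<NullWritable, NullWritable>() {
        private boolean done = false;
        @Override public void initialize(InputSplit s, TaskAttemptContext c) { }
        @Override public boolean nextKeyValue() {
          if (done) { return false; }
          done = true;
          return true;
        }
        @Override public NullWritable getCurrentKey() { return NullWritable.get(); }
        @Override public NullWritable getCurrentValue() { return NullWritable.get(); }
        @Override public float getProgress() { return done ? 1.0f : 0.0f; }
        @Override public void close() { }
      };
    }
  }

  public static class LocalFileMapper
      extends Mapper<NullWritable, NullWritable, NullWritable, NullWritable> {
    @Override
    protected void map(NullWritable key, NullWritable value, Context context)
        throws IOException, InterruptedException {
      // Placeholder: read the node-local file here, transform it, and write the
      // result to HDFS, e.g. through FileSystem.get(context.getConfiguration()).
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("example.num.mappers", args.length > 0 ? Integer.parseInt(args[0]) : 1);

    Job job = Job.getInstance(conf, "no-input-job");
    job.setJarByClass(NoInputJob.class);
    job.setInputFormatClass(DummyInputFormat.class);   // no input path needed
    job.setMapperClass(LocalFileMapper.class);
    job.setNumReduceTasks(0);                          // map-only
    job.setOutputFormatClass(NullOutputFormat.class);  // no output path needed
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(NullWritable.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note that this controls how many map tasks run, not which node each one runs on; the empty splits carry no location hints, so the scheduler is free to place several tasks on one node and none on another.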

On Thu, Mar 15, 2012 at 10:27 PM, Deepak Nettem <deepaknettem@gmail.com> wrote:
> Hi,
> I have a use case - I have files lying on the local disk of every node on
> my cluster. I want to write a Mapper only MapReduce job that reads the file
> off the local disk on every machine, applies some transformation and writes
> to HDFS.
> Specifically,
> 1. The Job shouldn't have any input/output paths, and should use null key/value pairs.
> 2. Mapper Only
> 3. I want to be able to control the number of Mappers, depending on the
> size of my cluster.
> What's the best way to do this? I would appreciate any example code.
> Deepak

Harsh J
