hadoop-common-user mailing list archives

From Deepak Nettem <deepaknet...@gmail.com>
Subject Mapper Only Job, Without Input or Output Path
Date Thu, 15 Mar 2012 16:57:50 GMT
Hi,

I have a use case: there are files on the local disk of every node in my
cluster. I want to write a mapper-only MapReduce job that reads those files
off the local disk on every machine, applies some transformation, and writes
the result to HDFS.

Specifically,

1. The job shouldn't have any input/output paths, and should use null
key/value pairs.
2. It should be mapper-only (no reducers).
3. I want to be able to control the number of mappers, depending on the
size of my cluster.

What's the best way to do this? I would appreciate any example code.
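
The only fallback I can think of relaxes point 1: give the job a tiny dummy
input with one line per desired mapper, read it with NLineInputFormat (one
line per split), call setNumReduceTasks(0), and use NullOutputFormat so no
output path is needed. Below is a minimal, untested-on-a-cluster sketch of
just the control-file part in plain Java. The class name MapperControlFile
and the file name mappers.txt are made up for illustration and are not part
of Hadoop.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): builds a small control file
// with one line per desired map task.
final class MapperControlFile {

    // Returns one line per mapper; each line is just the task index.
    static List<String> lines(int numMappers) {
        if (numMappers < 1) {
            throw new IllegalArgumentException("numMappers must be >= 1");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < numMappers; i++) {
            out.add(Integer.toString(i));
        }
        return out;
    }

    // Writes the control file to the local file system. It would still
    // have to be copied to HDFS before being used as the job input.
    static Path write(Path target, int numMappers) throws IOException {
        return Files.write(target, lines(numMappers), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default of 4 mappers when no argument is given.
        int numMappers = args.length > 0 ? Integer.parseInt(args[0]) : 4;
        Path file = write(Paths.get("mappers.txt"), numMappers);
        System.out.println("Wrote " + numMappers + " lines to " + file);
    }
}
```

Each mapper would then ignore its input line and read from the local disk of
whatever node it runs on. As far as I can tell, nothing here guarantees that
every node gets exactly one map task, so some local files could be read twice
and others not at all.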

Deepak
