hadoop-common-user mailing list archives

From Harsh J <qwertyman...@gmail.com>
Subject Re: Passing commandline arguments to Mapper class
Date Sun, 03 Oct 2010 04:56:34 GMT
Or you could also set it in code using the Configuration object: call
conf.set("my.fourth.arg", value) inside the main()/driver code, and then
inside the mapper use the context object to retrieve it, as in
context.getConfiguration().get("my.fourth.arg"), since the conf was
imprinted onto the submitted job.
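
For instance, a minimal sketch against the new (org.apache.hadoop.mapreduce)
API, assuming the leftover <Size> value has been collected into other_args as
in the code below; the fourthArg field name is just illustrative:

// Driver side: set the value on the Configuration *before* constructing the
// Job, so it is baked into the job that gets submitted.
Configuration conf = new Configuration();
conf.set("my.fourth.arg", other_args.get(2)); // the leftover <Size> value
Job job = new Job(conf, "Size program");

// Mapper side: read it back once in setup(), then use it in map().
public static class MapClass extends Mapper<Object, Text, Text, IntWritable> {
  private String fourthArg;

  @Override
  protected void setup(Context context) {
    fourthArg = context.getConfiguration().get("my.fourth.arg");
  }
}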

With the stable API, it's a matter of overriding the void configure(JobConf
job) method inside the mapper class.
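
And a corresponding sketch for the stable (org.apache.hadoop.mapred) API,
again with an illustrative field name:

public static class MapClass extends MapReduceBase
    implements Mapper<Object, Text, Text, IntWritable> {
  private String fourthArg;

  @Override
  public void configure(JobConf job) {
    // JobConf extends Configuration, so the property set by the driver is here
    fourthArg = job.get("my.fourth.arg");
  }

  public void map(Object key, Text value,
      OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    // fourthArg is available here
  }
}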

On Oct 3, 2010 5:05 AM, "coder22" <gaur.vbagga@gmail.com> wrote:


I need to pass command-line arguments to the Mapper class.


import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class Size extends Configured implements Tool {

  public static class MapClass extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {

      // need to access the fourth argument passed on the command line

    }
  }


  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      // code for reduce class
    }
  }

  static int printUsage() {
    System.out.println("Size [-r <reduces>] <input> <output> <Size>");
    ToolRunner.printGenericCommandUsage(System.out);
    return -1;
  }

  public int run(String[] args) throws Exception {

    // use the Configuration handed in by ToolRunner
    Configuration conf = getConf();
    Job job = new Job(conf, "Size program");

   job.setJarByClass(Size.class);
   job.setMapperClass(MapClass.class);
   job.setCombinerClass(Reduce.class);
   job.setReducerClass(Reduce.class);

   // the keys are words (strings)
   job.setOutputKeyClass(Text.class);
   // the values are counts (ints)
   job.setOutputValueClass(IntWritable.class);


   List<String> other_args = new ArrayList<String>();
   for(int i=0; i < args.length; ++i) {
     try {

       if ("-r".equals(args[i])) {
         job.setNumReduceTasks(Integer.parseInt(args[++i]));
       } else {
         other_args.add(args[i]);
       }
     } catch (NumberFormatException except) {
       System.out.println("ERROR: Integer expected instead of " + args[i]);
       return printUsage();
     } catch (ArrayIndexOutOfBoundsException except) {
       System.out.println("ERROR: Required parameter missing from " +
           args[i-1]);
       return printUsage();
     }
   }
   // Make sure there are exactly 3 parameters left.
   if (other_args.size() != 3) {
     System.out.println("ERROR: Wrong number of parameters: " +
other_args.size() + " instead of 3.");
     return printUsage();
   }
   FileInputFormat.addInputPath(job, new Path(other_args.get(0)));
   FileOutputFormat.setOutputPath(job, new Path(other_args.get(1)));

    // The fourth argument (<Size>) is now in other_args.get(2).
    // How do I pass it from here, or from anywhere else, to the mapper?

    return job.waitForCompletion(true) ? 0 : 1;
 }

 public static void main(String[] args) throws Exception {
   int res = ToolRunner.run(new Configuration(), new Size(), args);
   System.exit(res);
 }
}


I read somewhere about setting it on the JobConf, but I couldn't understand
how. Can anybody tell me how to pass the command-line argument to the Mapper
class?

Thanks a lot
--
View this message in context:
http://old.nabble.com/Passing-commandline-arguments-to-Mapper-class-tp29868846p29868846.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
