hadoop-common-user mailing list archives

From "Stevens, Keith D." <steven...@llnl.gov>
Subject Re: maprd vs mapreduce api
Date Fri, 05 Aug 2011 22:42:38 GMT
The Mapper and Reducer classes in org.apache.hadoop.mapreduce implement the identity function
by default, so you should be able to just do

job.setMapperClass(org.apache.hadoop.mapreduce.Mapper.class);
job.setReducerClass(org.apache.hadoop.mapreduce.Reducer.class);

without having to implement your own no-op classes.
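To see why this works: in the new API, the base Mapper's default map() simply calls context.write(key, value), i.e. the identity function. Here is a standalone toy model of that pass-through behavior (plain Java, no Hadoop dependency; the IdentityMapper and IdentityDemo classes here are illustrative, not Hadoop's):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;

// Toy model (NOT Hadoop code) of the new-API Mapper's default map():
// it forwards each (key, value) pair to the context unchanged.
class IdentityMapper<K, V> {
    public void map(K key, V value, List<Entry<K, V>> context) {
        // Pass the pair through unchanged, like context.write(key, value).
        context.add(new SimpleEntry<>(key, value));
    }
}

public class IdentityDemo {
    public static void main(String[] args) {
        List<Entry<String, Integer>> output = new ArrayList<>();
        IdentityMapper<String, Integer> mapper = new IdentityMapper<>();
        mapper.map("hello", 1, output);
        mapper.map("world", 2, output);
        // The output is exactly the input, pair for pair.
        System.out.println(output);  // [hello=1, world=2]
    }
}
```

Because the default map() and reduce() are identities, subclassing only to override nothing (as in the NOOP classes below) adds no behavior over the base classes themselves.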

I recommend reading the javadoc for the differences between the old and new APIs. For example,
http://hadoop.apache.org/common/docs/r0.20.2/api/index.html describes the behavior of Mapper
in the new API and its dual use as the identity mapper.
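Incidentally, the ClassCastException in the trace quoted below has a separate cause: TextInputFormat supplies (LongWritable byte offset, Text line) pairs, so a mapper declared as Mapper<Text, IntWritable, ...> compiles fine but fails at runtime when the framework hands it a LongWritable key. Java's type erasure means the mismatch only surfaces as a runtime cast failure inside map(). A plain-Java sketch of the same failure mode (the Demo class and makeMapper method are illustrative, not Hadoop code):

```java
import java.util.function.BiConsumer;

public class Demo {
    // A "mapper" that declares String keys, analogous to Mapper<Text, ...>.
    static BiConsumer<Object, Object> makeMapper() {
        BiConsumer<String, Integer> m = (k, v) -> System.out.println(k.length());
        // Unchecked cast: erasure lets this compile, just as the framework
        // invokes map() without statically checking the declared key type.
        @SuppressWarnings("unchecked")
        BiConsumer<Object, Object> erased = (BiConsumer<Object, Object>) (BiConsumer<?, ?>) m;
        return erased;
    }

    public static void main(String[] args) {
        try {
            // The "input format" hands over a Long key instead of a String,
            // like LongWritable arriving where Text was declared.
            makeMapper().accept(42L, 1);
        } catch (ClassCastException e) {
            System.out.println("caught ClassCastException");
        }
    }
}
```

The fix for the quoted code is to declare the mapper's input types to match TextInputFormat, e.g. Mapper<LongWritable, Text, ...>, rather than re-emitting the key as Text.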

Cheers,
--Keith

On Aug 5, 2011, at 1:15 PM, garpinc wrote:

> 
> I was following this tutorial on version 0.19.1
> 
> http://v-lad.org/Tutorials/Hadoop/23%20-%20create%20the%20project.html
> 
> I however wanted to use the latest API version, 0.20.2.
> 
> The original code in tutorial had following lines
> conf.setMapperClass(org.apache.hadoop.mapred.lib.IdentityMapper.class);
> conf.setReducerClass(org.apache.hadoop.mapred.lib.IdentityReducer.class);
> 
> both Identity classes are deprecated, so it seemed the solution was to create a
> mapper and reducer as follows:
> public static class NOOPMapper 
>      extends Mapper<Text, IntWritable, Text, IntWritable>{
> 
> 
>   public void map(Text key, IntWritable value, Context context
>                   ) throws IOException, InterruptedException {
> 
>       context.write(key, value);
> 
>   }
> }
> 
> public static class NOOPReducer 
>      extends Reducer<Text,IntWritable,Text,IntWritable> {
>   private IntWritable result = new IntWritable();
> 
>   public void reduce(Text key, Iterable<IntWritable> values, 
>                      Context context
>                      ) throws IOException, InterruptedException {
>     context.write(key, result);
>   }
> }
> 
> 
> And then with code:
> 		Configuration conf = new Configuration();
> 		Job job = new Job(conf, "testdriver");
> 
> 		job.setOutputKeyClass(Text.class);
> 		job.setOutputValueClass(IntWritable.class);
> 
> 		job.setInputFormatClass(TextInputFormat.class);
> 		job.setOutputFormatClass(TextOutputFormat.class);
> 
> 		FileInputFormat.addInputPath(job, new Path("In"));
> 		FileOutputFormat.setOutputPath(job, new Path("Out"));
> 
> 		job.setMapperClass(NOOPMapper.class);
> 		job.setReducerClass(NOOPReducer.class);
> 
> 		job.waitForCompletion(true);
> 
> 
> However, I get this error:
> java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
> cast to org.apache.hadoop.io.Text
> 	at TestDriver$NOOPMapper.map(TestDriver.java:1)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 11/08/01 16:41:01 INFO mapred.JobClient:  map 0% reduce 0%
> 11/08/01 16:41:01 INFO mapred.JobClient: Job complete: job_local_0001
> 11/08/01 16:41:01 INFO mapred.JobClient: Counters: 0
> 
> 
> 
> Can anyone tell me what I need for this to work?
> 
> Attached is the full code:
> http://old.nabble.com/file/p32174859/TestDriver.java TestDriver.java 
> -- 
> View this message in context: http://old.nabble.com/maprd-vs-mapreduce-api-tp32174859p32174859.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
> 

