hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Adding MapperBase and ReduceBase
Date Fri, 10 Feb 2006 18:28:52 GMT
Owen O'Malley wrote:
> Looking over the examples with Michel's addition of close, I'd like to 
> suggest creating abstract classes MapperBase and ReducerBase that 
> implement Mapper and Reducer interfaces respectively and have empty 
> configure and close methods.
> By providing the default methods, developers will only have to implement 
> map or reduce unless they need the additional functionality.
> Thoughts?
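A minimal self-contained sketch of the MapperBase idea Owen proposes (hypothetical; Mapper and JobConf below are simplified stand-ins, not the real Hadoop types):

```java
// Stand-in for org.apache.hadoop.mapred.JobConf (simplified for illustration).
interface JobConf {}

// Stand-in for the Mapper interface with configure/close lifecycle methods.
interface Mapper {
  void map(String key, String value, StringBuilder out);
  void configure(JobConf conf);
  void close();
}

// The proposed base class: empty configure() and close() so that
// subclasses only have to implement map().
abstract class MapperBase implements Mapper {
  public void configure(JobConf conf) {}
  public void close() {}
}

public class MapperBaseDemo {
  public static void main(String[] args) {
    StringBuilder out = new StringBuilder();
    Mapper m = new MapperBase() {
      public void map(String key, String value, StringBuilder o) {
        o.append(key).append('=').append(value);
      }
    };
    m.configure(null);  // inherited no-op default
    m.map("k", "v", out);
    m.close();          // inherited no-op default
    System.out.println(out);  // prints "k=v"
  }
}
```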

There are cases where I've used a single class to implement both map() 
and reduce().  For these, a base class that implements Closeable and 
JobConfigurable would be better than a MapperBase and ReducerBase.  It 
could also extend Configured, implementing Configurable.  We might call 
it JobConfigured:

public abstract class JobConfigured
   extends Configured
   implements Closeable, JobConfigurable {

   public JobConfigured() { super(null); }

   public JobConfigured(Configuration conf) { super(conf); }

   public void configure(JobConf conf) { setConf(conf); }

   public void close() {}
}

Then one can define mapred applications as simply as:

public class MyMapredApplication extends JobConfigured
   implements Mapper, Reducer {
   public void map(...) { ... }
   public void reduce(...) { ... }

   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     JobConf job = new JobConf(conf, MyMapredApplication.class);
     job.setMapperClass(MyMapredApplication.class);
     job.setReducerClass(MyMapredApplication.class);
     ...
   }
}

It would be nice to even remove the need for the calls to setMapper() 
and setReducer() above, i.e., to have JobConf default the mapper, 
reducer, etc. to things that are implemented by the class passed to its 
constructor.
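One way that defaulting could work, sketched with simplified stand-in types (hypothetical, not the actual Hadoop JobConf API): the constructor inspects the job class and registers it as the mapper and/or reducer whenever it implements the corresponding interface.

```java
// Simplified stand-ins for the Mapper/Reducer interfaces.
interface Mapper {}
interface Reducer {}

// Hypothetical sketch of a JobConf that defaults its mapper/reducer
// from the job class passed to its constructor.
class JobConfSketch {
  Class<?> mapperClass;
  Class<?> reducerClass;

  JobConfSketch(Class<?> jobClass) {
    // Default the mapper/reducer to the job class itself when it
    // implements the corresponding interface.
    if (Mapper.class.isAssignableFrom(jobClass)) mapperClass = jobClass;
    if (Reducer.class.isAssignableFrom(jobClass)) reducerClass = jobClass;
  }
}

public class DefaultingDemo implements Mapper, Reducer {
  public static void main(String[] args) {
    JobConfSketch job = new JobConfSketch(DefaultingDemo.class);
    System.out.println(job.mapperClass == DefaultingDemo.class);   // true
    System.out.println(job.reducerClass == DefaultingDemo.class);  // true
  }
}
```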

