hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Adding MapperBase and ReduceBase
Date Mon, 13 Feb 2006 19:31:01 GMT
Owen O'Malley wrote:
> Is the Closable interface useful? How about a little renaming and 
> simplifying to do:
> 
> public interface UserTask  extends Configurable {
>    void close();
> }
> 
> public class UserTaskBase extends Configured implements UserTask {
>    ... default methods ...
> }
> 
> public interface Mapper extends UserTask {
>   void map(...);
> }
> 
> public interface Reducer extends UserTask {
>   void reduce(...);
> }
> 
> public class WordCount extends UserTaskBase implements Mapper, Reducer {
>   public void map(...)
>   public void reduce(...)
> 
>   public static void main(...)
> }

I like this.
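
For concreteness, here is a rough compilable rendering of that hierarchy. Java requires the extends clause to come before implements, so the class declarations are reordered accordingly. Everything prefixed with Demo is a hypothetical stand-in rather than a real Hadoop type, and the map/reduce signatures are placeholders, since the proposal leaves them as (...).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for a configuration object (assumption, not Hadoop's class).
class DemoConfiguration {
    private final Map<String, String> values = new HashMap<>();
    void set(String key, String value) { values.put(key, value); }
    String get(String key) { return values.get(key); }
}

// Stand-in for the Configurable interface named in the proposal.
interface DemoConfigurable {
    void setConf(DemoConfiguration conf);
    DemoConfiguration getConf();
}

// Stand-in for the Configured base class: it just stores the configuration.
class DemoConfigured implements DemoConfigurable {
    private DemoConfiguration conf;
    public void setConf(DemoConfiguration conf) { this.conf = conf; }
    public DemoConfiguration getConf() { return conf; }
}

// The proposed UserTask: configurable, plus a close() hook.
interface DemoUserTask extends DemoConfigurable {
    void close();
}

// Base class supplying default methods; note extends comes before implements.
class DemoUserTaskBase extends DemoConfigured implements DemoUserTask {
    public void close() { /* default: nothing to release */ }
}

// Placeholder signatures; the original proposal elides the parameters.
interface DemoMapper extends DemoUserTask {
    List<String> map(String line);
}

interface DemoReducer extends DemoUserTask {
    int reduce(String word, List<Integer> counts);
}

// One class acting as both mapper and reducer, inheriting the defaults.
class DemoWordCount extends DemoUserTaskBase implements DemoMapper, DemoReducer {
    public List<String> map(String line) {
        // Split the line into whitespace-separated words.
        return Arrays.asList(line.trim().split("\\s+"));
    }

    public int reduce(String word, List<Integer> counts) {
        // Sum the partial counts collected for one word.
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }
}
```

The point of the base class is that DemoWordCount only has to write map and reduce; configuration storage and the no-op close() are inherited.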

> When looking through the code, the auto configuration in 
> JobConf.newInstance is pretty confusing. Reading through the code, it 
> looks like the Reducer objects are configured twice.

What's confusing?  The method in SVN looks simple to me.  HADOOP-29 
would make this more complicated, but I'm not convinced that is 
required.  Can you elaborate?

>> It would be nice to even remove the need for the calls to setMapper() 
>> and setReducer() above, i.e., to have JobConf default the mapper, 
>> reducer, etc. to things that are implemented by the class passed to 
>> its constructor.
> 
> Which constructor is doing this? The JobConfigured?

No, the JobConf.  One could construct a JobConf, as before, with 
JobConf(Configuration,UserTask), but, in addition to using the UserTask 
to determine the default jar, it could also use the UserTask to 
determine the default mapper, reducer, etc.
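
As a rough illustration of that defaulting, the sketch below shows one way a job configuration could inspect the class passed to its constructor and adopt it as the mapper and/or reducer when it implements the corresponding interface. All of the Sketch types are hypothetical stand-ins, not the JobConf in SVN, and the real implementation may well choose a different mechanism.

```java
import java.util.Iterator;

// Hypothetical stand-ins for the interfaces discussed in this thread;
// these are NOT the real Hadoop types, and the signatures are placeholders.
interface SketchUserTask {
    void close();
}

interface SketchMapper extends SketchUserTask {
    void map(String key, String value);
}

interface SketchReducer extends SketchUserTask {
    void reduce(String key, Iterator<String> values);
}

// Hypothetical job configuration that infers defaults from the application class.
class SketchJobConf {
    private Class<? extends SketchMapper> mapperClass;   // null means "not set"
    private Class<? extends SketchReducer> reducerClass; // null means "not set"

    SketchJobConf(Class<?> appClass) {
        // Use the application class as the default mapper if it implements the mapper interface.
        if (SketchMapper.class.isAssignableFrom(appClass)) {
            mapperClass = appClass.asSubclass(SketchMapper.class);
        }
        // Likewise for the reducer.
        if (SketchReducer.class.isAssignableFrom(appClass)) {
            reducerClass = appClass.asSubclass(SketchReducer.class);
        }
    }

    // Explicit setters still override whatever was inferred.
    void setMapperClass(Class<? extends SketchMapper> c) { mapperClass = c; }
    void setReducerClass(Class<? extends SketchReducer> c) { reducerClass = c; }

    Class<? extends SketchMapper> getMapperClass() { return mapperClass; }
    Class<? extends SketchReducer> getReducerClass() { return reducerClass; }
}

// An application class that serves as both mapper and reducer.
class SketchApp implements SketchMapper, SketchReducer {
    public void map(String key, String value) { }
    public void reduce(String key, Iterator<String> values) { }
    public void close() { }
}

// An application class that only maps; no reducer default is inferred for it.
class SketchMapOnly implements SketchMapper {
    public void map(String key, String value) { }
    public void close() { }
}
```

With this arrangement, new SketchJobConf(SketchApp.class) ends up with SketchApp as both mapper and reducer, while a map-only class leaves the reducer unset for the framework or the user to fill in.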

So a user application could be written as simply as:

public class MyApp extends UserTaskBase implements Mapper, Reducer {
   public void map(...) { ... }
   public void reduce(...) { ... }

   public static void main(String[] args) throws Exception {
     Configuration conf = new Configuration();
     JobConf job = new JobConf(conf, MyApp.class);
     job.setInputDir(args[0]);
     job.setOutputDir(args[1]);
     JobClient.run(job);
   }
}

Does that make sense?

Doug
