hadoop-common-dev mailing list archives

From "Owen O'Malley" <o...@yahoo-inc.com>
Subject Re: Adding MapperBase and ReduceBase
Date Fri, 10 Feb 2006 23:33:26 GMT

On Feb 10, 2006, at 10:28 AM, Doug Cutting wrote:

> Owen O'Malley wrote:
>> Looking over the examples with Michel's addition of close, I'd like 
>> to suggest creating abstract classes MapperBase and ReducerBase that 
>> implement Mapper and Reducer interfaces respectively and have empty 
>> configure and close methods.
>> By providing the default methods, developers will only have to 
>> implement map or reduce unless they need the additional 
>> functionality.
>> Thoughts?
> There are cases where I've used a single class to implement both map() 
> and reduce().  For these a base class that implements Closeable and 
> JobConfigurable would be better than a MapperBase and ReducerBase.  It 
> could also extend Configured, implementing Configurable.  We might 
> call it JobConfigured:

Is the Closeable interface useful? How about a little renaming and 
simplifying:

public interface UserTask extends Configurable {
    void close();
}

public class UserTaskBase extends Configured implements UserTask {
    ... default methods ...
}

public interface Mapper extends UserTask {
    void map(...);
}

public interface Reducer extends UserTask {
    void reduce(...);
}

public class WordCount extends UserTaskBase implements Mapper, Reducer {
    public void map(...)
    public void reduce(...)

    public static void main(...)
}
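
To make the shape of that hierarchy concrete, here is a minimal compilable sketch. It is only an illustration: Configurable and Configured are local stand-ins rather than the real Hadoop classes, java.util.Properties stands in for the configuration object, and the map and reduce signatures are simplified assumptions (the real ones take more arguments).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class UserTaskSketch {
    // Local stand-in for the real Configurable interface (assumption).
    public interface Configurable {
        void setConf(Properties conf);
        Properties getConf();
    }

    // Local stand-in for the real Configured base class (assumption).
    public static class Configured implements Configurable {
        private Properties conf;
        public void setConf(Properties conf) { this.conf = conf; }
        public Properties getConf() { return conf; }
    }

    public interface UserTask extends Configurable {
        void close();
    }

    // Supplies the empty default so user classes only write map or reduce.
    public static class UserTaskBase extends Configured implements UserTask {
        public void close() { }
    }

    // Simplified signature (assumption): counts go straight into a map.
    public interface Mapper extends UserTask {
        void map(String line, Map<String, Integer> out);
    }

    // Simplified signature (assumption): returns the combined value.
    public interface Reducer extends UserTask {
        int reduce(String key, Iterable<Integer> values);
    }

    // One class implementing both roles, as in Doug's use case.
    public static class WordCount extends UserTaskBase implements Mapper, Reducer {
        public void map(String line, Map<String, Integer> out) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.merge(word, 1, Integer::sum);
                }
            }
        }

        public int reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        WordCount wc = new WordCount();
        wc.setConf(new Properties());   // inherited from Configured
        Map<String, Integer> counts = new HashMap<>();
        wc.map("to be or not to be", counts);
        System.out.println(counts);
        System.out.println(wc.reduce("to", Arrays.asList(1, 1)));
        wc.close();                     // inherited no-op from UserTaskBase
    }
}
```

WordCount never mentions configure-style methods or close; it picks both up from UserTaskBase, which is the point of the base class.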

The auto-configuration in JobConf.newInstance is pretty confusing. 
Reading through the code, it looks like the Reducer objects are 
configured twice.

> It would be nice to even remove the need for the calls to setMapper() 
> and setReducer() above, i.e., to have JobConf default the mapper, 
> reducer, etc. to things that are implemented by the class passed to 
> its constructor.

Which constructor would do this? JobConfigured's? I'd be worried 
about the different contexts that the JobConfs are created in. In 
particular, the only place these defaults could meaningfully be set is 
the JobConf in the driver process, which doesn't have any Mapper or 
Reducer objects instantiated.
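
For illustration only, here is one way such defaulting could work in the driver without any instances: inspect the Class object and record its name. Everything in this sketch is hypothetical, including the defaultsFor helper and the "sketch.*" property keys; it is not how JobConf behaves.

```java
import java.util.Properties;

public class DriverConfSketch {
    // Marker stand-ins for the real interfaces (assumption).
    public interface Mapper { }
    public interface Reducer { }

    public static class WordCount implements Mapper, Reducer { }
    public static class MapOnly implements Mapper { }

    // Hypothetical helper: derive defaults from the class itself.
    // No Mapper or Reducer object is ever constructed here.
    public static Properties defaultsFor(Class<?> jobClass) {
        Properties conf = new Properties();
        if (Mapper.class.isAssignableFrom(jobClass)) {
            conf.setProperty("sketch.mapper.class", jobClass.getName());
        }
        if (Reducer.class.isAssignableFrom(jobClass)) {
            conf.setProperty("sketch.reducer.class", jobClass.getName());
        }
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(defaultsFor(WordCount.class));
        System.out.println(defaultsFor(MapOnly.class));
    }
}
```

This only covers the driver side. It does not address the other contexts in which a JobConf gets created.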

-- Owen
