hadoop-mapreduce-dev mailing list archives

From Owen O'Malley <omal...@apache.org>
Subject Re: Ideas for dynamic change reducer task number ?
Date Mon, 23 Nov 2009 17:19:53 GMT

On Nov 22, 2009, at 4:48 PM, Jeff Zhang wrote:

> My concern is that it is just like hard code to use conf.setNumReduceTasks
> on the configuration. It is not flexible, so my idea is that adding an
> interface to change the reducer number dynamically according the different
> size of input data set.

You misunderstand. I meant doing something like:

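public class MyInputFormat ....

   public InputSplit[] getSplits(JobConf conf, int numSplits) throws IOException {
      InputSplit[] result = ...;
      // compute total size of input, in bytes
      long size = 0;
      for (InputSplit split : result) {
         size += split.getLength();
      }
      // one reducer per 10 GB of input, but never fewer than 6
      conf.setNumReduceTasks((int) Math.max(6, size / (10L * 1024 * 1024 * 1024)));
      return result;
   }
}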

I haven't checked the code to make sure it will work, but I believe it  
will.
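
The sizing rule itself can be tried out without a cluster. The following is only a minimal sketch of that arithmetic, not tested against Hadoop: the class name ReducerSizing, the method reducersFor, and both constants are hypothetical, and "10G" is assumed to mean 10 GiB.

```java
public class ReducerSizing {
    // Assumption: "10G" in the snippet means 10 GiB of input per reducer.
    static final long BYTES_PER_REDUCER = 10L * 1024 * 1024 * 1024;
    // Lower bound taken from the max(6, ...) in the snippet.
    static final int MIN_REDUCERS = 6;

    /**
     * Returns max(MIN_REDUCERS, totalInputBytes / BYTES_PER_REDUCER),
     * using integer division and capping the result at Integer.MAX_VALUE.
     */
    static int reducersFor(long totalInputBytes) {
        if (totalInputBytes < 0) {
            throw new IllegalArgumentException("input size must be non-negative");
        }
        long bySize = totalInputBytes / BYTES_PER_REDUCER;
        return (int) Math.min(Integer.MAX_VALUE, Math.max(MIN_REDUCERS, bySize));
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        System.out.println(reducersFor(5 * gib));    // small input: falls back to the minimum
        System.out.println(reducersFor(1024 * gib)); // 1 TiB of input: 1024 / 10 with integer division
    }
}
```

Inside getSplits, the result of such a helper would be passed to conf.setNumReduceTasks, with the total size summed from the splits.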

-- Owen
