hadoop-mapreduce-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Control the number of Mappers
Date Thu, 25 Nov 2010 20:01:44 GMT
I wasn't asking how to configure the cluster so that it never runs more than
a certain number of Mappers simultaneously. Instead, I'd like to configure a
specific job to invoke exactly N Mappers, where N is the number of cores in
the cluster, regardless of the size of the data. This isn't critical if
it can't be done, but it could improve the performance of my job if it can.

Thanks
Shai
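
A rough sketch of the arithmetic behind the request above, not a confirmed
Hadoop recipe: if the total input size and the number of available map slots
are known, a split size can be chosen so that at most N splits result. The
class and method names (SplitSizing, splitSizeFor, splitCount) are
hypothetical, and the slot count is hard-coded here. On a real cluster it
would have to be queried (for example via the old-API
JobClient.getClusterStatus().getMaxMapTasks(), which reports map slots rather
than cores) and the result passed to the job as a minimum split size. Both
steps are assumptions to verify against your Hadoop version, and the actual
number of Mappers still depends on the InputFormat and on file boundaries.

```java
// Sketch only: choose a split size that yields at most N input splits.
// All names here are hypothetical; nothing below calls the Hadoop API.
final class SplitSizing {
    private SplitSizing() {}

    /** Smallest split size in bytes such that the input yields at most targetMappers splits. */
    static long splitSizeFor(long totalBytes, int targetMappers) {
        if (totalBytes < 0 || targetMappers <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and targetMappers > 0");
        }
        if (totalBytes == 0) {
            return 1L; // avoid a zero split size for empty input
        }
        // Ceiling division: round up so the last split absorbs the remainder.
        return (totalBytes + targetMappers - 1) / targetMappers;
    }

    /** Number of splits produced when one contiguous input is cut at splitSize bytes. */
    static long splitCount(long totalBytes, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be > 0");
        }
        return (totalBytes + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long totalBytes = 200L * 64 * 1024 * 1024; // assumed: 200 blocks of 64 MB each
        int mapSlots = 100;                        // assumed: would be queried from the cluster
        long splitSize = splitSizeFor(totalBytes, mapSlots);
        System.out.println("split size (bytes): " + splitSize);
        System.out.println("resulting splits:   " + splitCount(totalBytes, splitSize));
    }
}
```

This treats the input as one contiguous range. With many separate files, each
file is split on its own, so the real split count can exceed N.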

On Thu, Nov 25, 2010 at 9:55 PM, Niels Basjes <Niels@basjes.nl> wrote:

> Hi,
>
> 2010/11/25 Shai Erera <serera@gmail.com>:
> > Is there a way to make MapReduce create exactly N Mappers? More
> > specifically, if say my data can be split to 200 Mappers, and I have only
> > 100 cores, how can I ensure only 100 Mappers will be created? The number
> > of cores is not something I know in advance, so writing a special
> > InputFormat might be tricky, unless I can query Hadoop for the available
> > # of cores (in the entire cluster).
>
> You can configure on a node by node basis how many map and reduce
> tasks can be started by the task tracker on that node.
> This is done via the conf/mapred-site.xml using these two settings:
> mapred.tasktracker.{map|reduce}.tasks.maximum
>
> Have a look at this page for more information
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
>
> --
> Met vriendelijke groeten,
>
> Niels Basjes
>
