hadoop-common-dev mailing list archives

From Pavan Kulkarni <pavan.babu...@gmail.com>
Subject Re: Setting number of parallel Reducers and Mappers for optimal performance
Date Sat, 11 Aug 2012 04:17:56 GMT
Arun,

  Thanks a lot for your response.

I am running on a 16-core Xeon processor with 12 spindles, so running 12
mappers with 2G each and 6 reducers with 3G each might give me the best
performance. Also, is there a general formula to arrive at those numbers?
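To make the heuristic quoted further down the thread concrete, here is a minimal sketch of how one might combine it with a memory budget. The 0.75/0.50 CPU multipliers are the ones from the quoted tuning advice, and the per-task heap sizes (2G map, 3G reduce) and 4G OS reserve follow the numbers discussed in this thread; the function name and the proportional scale-down loop are my own hypothetical additions, not anything Hadoop itself provides.

```python
def slot_counts(cpus, total_ram_gb, map_heap_gb=2, reduce_heap_gb=3,
                os_reserve_gb=4):
    """Return (map_slots, reduce_slots) bounded by both CPU and memory.

    Hypothetical helper: starts from the CPU-based heuristic quoted in
    the thread, then trims slots until the task JVM heaps fit in RAM.
    """
    # CPU-based heuristic: (CPUS > 2) ? CPUS * 0.75 : 1 maps,
    #                      (CPUS > 2) ? CPUS * 0.50 : 1 reduces
    maps = int(cpus * 0.75) if cpus > 2 else 1
    reduces = int(cpus * 0.50) if cpus > 2 else 1

    # Memory left for task JVMs after reserving some for the OS/daemons
    usable_gb = total_ram_gb - os_reserve_gb

    # Trim the larger pool first until the heaps fit in usable memory
    while maps * map_heap_gb + reduces * reduce_heap_gb > usable_gb:
        if maps >= reduces:
            maps -= 1
        else:
            reduces -= 1
    return maps, reduces
```

For the 16-core, 48G box discussed here this yields a mix in the same ballpark as Arun's 10-12 maps and 6 reduces; the exact split still depends on how you want to slice your slots.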

On Fri, Aug 10, 2012 at 7:34 PM, Arun C Murthy <acm@hortonworks.com> wrote:

> Pavan,
>
> A very important factor is how much CPU and how many spindles you have...
>
> Your proposal for memory (44G in all) seems reasonable.
>
> However, if you have 12 spindles and sufficient CPU I'd do something like
> 10 or 12 maps of 2G each and 6 reduces with 3G/4G each depending on how you
> want to slice/dice your slots.
>
> Arun
>
> On Aug 10, 2012, at 1:24 PM, Pavan Kulkarni wrote:
>
> > Hi,
> >
> > I was trying to optimize Hadoop-1.0.2 performance by setting
> > *mapred.tasktracker.map.tasks.maximum
> > ,**mapred.tasktracker.reduce.tasks.maximum*
> > such that the entire memory is utilized. The tuning of these parameters is
> > given as (CPUS > 2) ? (CPUS * 0.50) : 1 for reduce and
> > (CPUS > 2) ? (CPUS * 0.75) : 1 for map.
> > I didn't quite get how they arrived at this suggestion. Isn't the setting
> > dependent on the main memory available?
> > For example, I had 48GB of memory and I split the parameters as 32 for
> > mappers, 12 for reducers, and the remaining 4 for the OS and other processes.
> > Please correct me if my assumption is wrong. Also, please suggest a way to
> > get optimal performance by setting these parameters. Thanks.
> >
> > --
> >
> > --With Regards
> > Pavan Kulkarni
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>


-- 

--With Regards
Pavan Kulkarni
