hadoop-common-dev mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Setting number of parallel Reducers and Mappers for optimal performance
Date Sun, 12 Aug 2012 04:37:22 GMT
Pavan,
 
On Aug 10, 2012, at 9:17 PM, Pavan Kulkarni wrote:

> Arun,
> 
>  Thanks a lot for your response.
> 
> I am running on a 16-core Xeon processor with 12 spindles. So running 12
> Mappers with 2G and 6 Reducers with 3G might give me the best
> performance.

Hmm... ok. You actually _may_ have enough CPU to drive a slightly higher number of tasks.
You could also measure that against 16 maps with 1.5G each and 6 reduces with 3G each.
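
To see how the two layouts compare on memory before measuring them, here is a rough sketch of the arithmetic. The SlotLayout class and its field names are made up for illustration, and it assumes the per-task figure is the task's whole footprint (it ignores JVM overhead beyond the heap and the DataNode/TaskTracker daemons). The 48G node size is taken from the original message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotLayout:
    """Hypothetical description of one TaskTracker slot configuration."""
    map_slots: int
    map_gb: float
    reduce_slots: int
    reduce_gb: float

    def total_gb(self) -> float:
        # Memory committed if every map and reduce slot is busy at once.
        return self.map_slots * self.map_gb + self.reduce_slots * self.reduce_gb

    def headroom_gb(self, node_ram_gb: float) -> float:
        # What is left for the OS and other processes (assumption: nothing
        # else is accounted for here).
        return node_ram_gb - self.total_gb()


NODE_RAM_GB = 48  # from the original message

layouts = {
    "12 maps x 2G, 6 reduces x 3G": SlotLayout(12, 2.0, 6, 3.0),
    "16 maps x 1.5G, 6 reduces x 3G": SlotLayout(16, 1.5, 6, 3.0),
}

for name, layout in layouts.items():
    used = layout.total_gb()
    left = layout.headroom_gb(NODE_RAM_GB)
    print(f"{name}: {used:.0f}G used, {left:.0f}G left")
```

Both layouts commit the same 42G, so a measured difference between them would come from CPU and disk contention rather than from the memory budget.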

> Also is there a general formula to arrive at those numbers?
> 

I'll think about it, but as with all systems, Hadoop performance is part experience,
part knowledge of the workload, part black magic and part simple measurement... *smile*
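
For reference, the CPU-based rule of thumb quoted in the original message can be written out as below. This is only a sketch of that quoted expression, not a recommendation: the function names are invented here, and truncating to a whole number of slots is an assumption, since the quoted text does not say how to round.

```python
def heuristic_slots(cpus: int, fraction: float) -> int:
    """(CPUS > 2) ? (CPUS * fraction) : 1, as quoted in the thread.

    Truncation to an integer is an assumption; the quoted rule does not
    specify rounding.
    """
    return int(cpus * fraction) if cpus > 2 else 1


def map_slots(cpus: int) -> int:
    # 0.75 is the map-side factor from the quoted rule.
    return heuristic_slots(cpus, 0.75)


def reduce_slots(cpus: int) -> int:
    # 0.50 is the reduce-side factor from the quoted rule.
    return heuristic_slots(cpus, 0.50)


if __name__ == "__main__":
    cores = 16  # the core count mentioned in this thread
    print(f"{cores} cores: {map_slots(cores)} map slots, "
          f"{reduce_slots(cores)} reduce slots")
```

The rule looks only at core count, so it says nothing about memory or spindles; those still have to be checked separately, as discussed above.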

Arun

> On Fri, Aug 10, 2012 at 7:34 PM, Arun C Murthy <acm@hortonworks.com> wrote:
> 
>> Pavan,
>> 
>> A very important factor is how much CPU and how many spindles you have...
>> 
>> Your proposal for memory (44G in all) seems reasonable.
>> 
>> However, if you have 12 spindles and sufficient CPU I'd do something like
>> 10 or 12 maps of 2G each and 6 reduces with 3G/4G each depending on how you
>> want to slice/dice your slots.
>> 
>> Arun
>> 
>> On Aug 10, 2012, at 1:24 PM, Pavan Kulkarni wrote:
>> 
>>> Hi,
>>> 
>>> I was trying to optimize Hadoop-1.0.2 performance by setting
>>> mapred.tasktracker.map.tasks.maximum and
>>> mapred.tasktracker.reduce.tasks.maximum
>>> such that the entire memory is utilized. The suggested tuning for these
>>> parameters is (CPUS > 2) ? (CPUS * 0.50) : 1 for reduce and
>>> (CPUS > 2) ? (CPUS * 0.75) : 1 for map.
>>> I didn't quite get how they arrived at this suggestion. Isn't the setting
>>> dependent on the main memory available?
>>> For example, I had 48GB of memory and I split it as 32GB for
>>> mappers, 12GB for reducers and the remaining 4GB for the OS and other processes.
>>> Please correct me if my assumption is wrong. Also, please suggest a way to get
>>> optimal performance by setting these parameters. Thanks.
>>> 
>>> --
>>> 
>>> --With Regards
>>> Pavan Kulkarni
>> 
>> --
>> Arun C. Murthy
>> Hortonworks Inc.
>> http://hortonworks.com/
>> 
>> 
>> 
> 
> 
> -- 
> 
> --With Regards
> Pavan Kulkarni

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


