hadoop-hdfs-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: Ideal number of mappers and reducers to increase performance
Date Fri, 01 Aug 2014 10:43:54 GMT
The mapred.tasktracker.* settings control the maximum number of map or
reduce tasks a tasktracker can run at once. They can differ across
machines: if you have multiple nodes, you can choose the values based on
each machine's configuration. If you set
mapred.tasktracker.map.tasks.maximum to 4, the tasktracker on that machine
will run at most 4 map tasks at any given point, and likewise for
mapred.tasktracker.reduce.tasks.maximum and reduce tasks.
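
As a rough sketch, the per-node limits from the question could be written
as follows. This assumes a Hadoop 1.x style mapred-site.xml on the
tasktracker machine (the file name is an assumption; the property names
and the value 4 come from this thread):

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml on one tasktracker node (assumed file name). -->
<configuration>
  <!-- Upper bound on map tasks this tasktracker runs at the same time. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Upper bound on reduce tasks this tasktracker runs at the same time. -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

Since these are per-tasktracker limits, a node with a different hardware
configuration can carry different values in its own copy of the file.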

The mapred.map.tasks and mapred.reduce.tasks settings are cluster-wide
defaults. They specify how many tasks (maps or reducers) a job gets by
default. A job can override them when it is submitted to the jobtracker,
or the client can override them itself.

You do not have to set mapred.map.tasks or mapred.reduce.tasks, since the
configuration already ships with defaults for them (2 for mapred.map.tasks
and 1 for mapred.reduce.tasks in mapred-default.xml).
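
If you do want different cluster-wide defaults, a minimal sketch of the
corresponding entries is shown here. The values 8 and 4 are illustrative
placeholders only, not recommendations, and the mapred-site.xml file name
is again an assumption:

```xml
<?xml version="1.0"?>
<!-- Cluster-wide per-job defaults (assumed to live in mapred-site.xml). -->
<!-- The values below are placeholders chosen only for illustration. -->
<configuration>
  <!-- Default number of map tasks per job. The framework treats this as -->
  <!-- a hint; the actual count follows from the job's input splits. -->
  <property>
    <name>mapred.map.tasks</name>
    <value>8</value>
  </property>
  <!-- Default number of reduce tasks per job, unless the job overrides it. -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>4</value>
  </property>
</configuration>
```

A job that sets its own values at submission time takes precedence over
these defaults.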




On Fri, Aug 1, 2014 at 4:06 PM, sindhu hosamane <sindhuht@gmail.com> wrote:

> Thanks a ton for your help, Harsh. I am a newbie in Hadoop.
> If I have set
> mapred.tasktracker.map.tasks.maximum = 4
> mapred.tasktracker.reduce.tasks.maximum = 4
> should I also set the values below:
> mapred.map.tasks and mapred.reduce.tasks?
> If yes, then what is the ideal value?
>
>
>
>
>
> On Fri, Aug 1, 2014 at 12:00 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> You can perhaps start with a generic 4+4 configuration (which matches
>> your cores), and tune your way upwards or downwards from there based
>> on your results.
>>
>> On Thu, Jul 31, 2014 at 8:35 PM, Sindhu Hosamane <sindhuht@gmail.com>
>> wrote:
>> > Hello friends,
>> >
>> > I am running my experiment on a server with 2 processors (4 cores
>> > each), that is, 8 cores in total.
>> > What would be the ideal values for mapred.tasktracker.map.tasks.maximum
>> > and mapred.tasktracker.reduce.tasks.maximum to get maximum performance?
>> > I am running Cascalog queries on data of size 280 MB.
>> > I have multiple datanodes running on the same machine.
>> >
>> > Your help is very much appreciated.
>> >
>> >
>> > Regards,
>> > sindhu
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>


-- 
Nitin Pawar
