hive-user mailing list archives

From Mohammad Tariq <>
Subject Re: Increasing map reduce tasks will increase the time of the cpu does this seem to be correct
Date Thu, 13 Dec 2012 10:50:32 GMT
Hello Imen,

      If you have a huge number of tasks, the overhead of creating and
managing the map and reduce tasks begins to dominate the total job execution
time. Also, more tasks means you need more free CPU slots. If no slot is free
on the node that holds the data block of interest, the task is scheduled on
some other node where free slots are available, and the block has to be read
over the network. This consumes time and also goes against the most basic
principle of Hadoop, i.e. data locality. So the number of maps and reduces
should be raised keeping all these factors in mind, otherwise you may face
performance issues.
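
As a rough illustration of the overhead argument, here is a toy model in
Python. It is only a sketch and not how Hadoop actually schedules or
accounts for tasks: the function names, the fixed per-task overhead, and
the numbers are all assumptions made up for the example.

```python
import math


def total_cpu_seconds(work_seconds, num_tasks, per_task_overhead_seconds):
    """Toy model: useful work plus a fixed startup/teardown cost per task.

    The fixed per-task overhead is an assumption for illustration only.
    """
    return work_seconds + num_tasks * per_task_overhead_seconds


def wall_clock_seconds(work_seconds, num_tasks, per_task_overhead_seconds, slots):
    """Toy model: tasks run in waves when there are more tasks than slots.

    Assumes the work is split evenly and every task in a wave finishes
    at the same time, which a real cluster will not guarantee.
    """
    waves = math.ceil(num_tasks / slots)
    per_task = work_seconds / num_tasks + per_task_overhead_seconds
    return waves * per_task


if __name__ == "__main__":
    # Same amount of work (600 s), hypothetical 2 s overhead per task, 10 slots.
    for tasks in (10, 100):
        cpu = total_cpu_seconds(600, tasks, 2)
        wall = wall_clock_seconds(600, tasks, 2, slots=10)
        print(tasks, "tasks: cpu =", cpu, "s, wall clock =", wall, "s")
```

In this model, going from 10 to 100 tasks on 10 slots raises the total CPU
time from 620 s to 800 s, and the wall-clock time also goes up because the
tasks have to run in 10 waves instead of one.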


    Mohammad Tariq

On Thu, Dec 13, 2012 at 4:11 PM, Nitin Pawar <> wrote:

> If the number of maps or reducers your job launches is more than the
> job queue/cluster capacity, CPU time will increase.
> On Dec 13, 2012 4:02 PM, "imen Megdiche" <> wrote:
>> Hello,
>> I am trying to increase the number of map and reduce tasks for a job, and
>> even for the same data size I noticed that the total CPU time increases,
>> but I thought it would decrease. MapReduce is known for its computing
>> performance, but I do not see this when I do these small tests.
>> What do you think about this issue?
