hadoop-common-user mailing list archives

From Andrés Durán <du...@tadium.es>
Subject Re: Question about reducers
Date Tue, 22 May 2012 13:58:10 GMT
Many thanks Harsh, I will try it.  :D

Best regards,
	Andrés Durán


On 22/05/2012, at 14:25, Harsh J wrote:

> A minor correction: CapacityScheduler doesn't seem to do multi-reducer
> assignments (or at least not in 1.x), but does do multi-map
> assignments. This is for the same reason as
> http://search-hadoop.com/m/KYv8JhkOHc1. FairScheduler in 1.x supports
> multi-map and multi-reducer assignments within a single heartbeat, which
> should work well on your single 32-task machine.
> 
> Do give it a try and let us know!
> 
> On Tue, May 22, 2012 at 5:51 PM, Harsh J <harsh@cloudera.com> wrote:
>> Hi,
>> 
>> This may be because, depending on your scheduler, only one Reducer may
>> be allocated per TT heartbeat. The reasoning behind this is
>> explained here: http://search-hadoop.com/m/KYv8JhkOHc1
>> 
>> You may have better results in 1.0.3 using an alternative scheduler
>> such as FairScheduler with multiple-assignments-per-heartbeat turned
>> on (See http://hadoop.apache.org/common/docs/current/fair_scheduler.html
>> and boolean property "mapred.fairscheduler.assignmultiple" to enable)
>> or via CapacityScheduler (See
>> http://hadoop.apache.org/common/docs/current/capacity_scheduler.html)
>> which does it as well (OOB).
>> 
>> On Tue, May 22, 2012 at 5:36 PM, Andrés Durán <duran@tadium.es> wrote:
>>> Hello,
>>> 
>>>        I'm working with Hadoop version 1.0.3, configured in pseudo-distributed mode.
>>> 
>>>        I have 128 reduce tasks, and the job runs on a local machine with 32 cores. The job works fine and is fast: it takes 1 hour and 30 minutes to finish. But when the job starts, the reducers move from the task queue to the running phase very slowly: it takes 7 minutes to get 32 tasks into the running phase. Why is task allocation so slow? Is it possible to adjust any setting in the JobTracker configuration to reduce this allocation time?
>>> 
>>>  Thanks to all!
>>> 
>>>  Best regards,
>>>        Andrés Durán
>> 
>> 
>> 
>> --
>> Harsh J
> 
> 
> 
> -- 
> Harsh J
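
For reference, a minimal sketch of the FairScheduler setup suggested in this thread, as it might appear in the JobTracker's mapred-site.xml on Hadoop 1.x. Only the property "mapred.fairscheduler.assignmultiple" is named in the thread; the scheduler class property and its value are assumptions based on the Hadoop 1.x FairScheduler documentation and should be checked against the installed release. The sketch also assumes the FairScheduler jar is on the JobTracker's classpath and that the JobTracker is restarted after the change.

```xml
<!-- mapred-site.xml fragment (sketch, Hadoop 1.x assumed) -->
<configuration>
  <!-- Assumption: switch the JobTracker from the default scheduler to FairScheduler -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <!-- From the thread: allow multiple task assignments per TaskTracker heartbeat -->
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>true</value>
  </property>
</configuration>
```

With a setup along these lines, a single TaskTracker with 32 reduce slots should be able to receive several reducers per heartbeat instead of one, which is the behaviour Harsh describes for FairScheduler in 1.x.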

