hadoop-common-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: capacity scheduler
Date Sun, 16 Oct 2011 15:46:09 GMT
You are welcome. *smile*

One of the greatest advantages of open-src s/w is that you can look at the code while scratching
your head in the corner - this way you gain better understanding of the system and we, the
project, will hopefully gain another valuable contributor... hint, hint. ;-)

Good luck.

Arun

On Oct 16, 2011, at 1:27 AM, patrick sang wrote:

> Hi Arun,
> 
> Your answer sheds extra bright light while I am scratching my head in the corner.
> A million thanks for the answer and the document. I will post back the result.
> 
> Thanks again,
> P
> 
> On Sat, Oct 15, 2011 at 10:32 PM, Arun C Murthy <acm@hortonworks.com> wrote:
>> 
>> Hi Patrick,
>> 
>> It's hard to diagnose CDH since I don't know what patch-sets they have for the
>> CapacityScheduler - afaik they only support FairScheduler, but that might have changed.
>> 
>> On Oct 15, 2011, at 4:45 PM, patrick sang wrote:
>> 
>>> 4. from webUI, scheduling  information of orange queue.
>>> 
>>> It said "Used capacity: 12 (100.0% of Capacity)"
>>> while next line said "Maximum capacity: 16 slots"
>>> So what's going on with the other 4 slots? Why are they not being used?
>>> 
>>> Is the capacity-scheduler supposed to start using extra slots until it hits
>>> the max capacity?
>>> (from the variable of
>>> mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity)
>>> (there are no other jobs at all in the cluster)
>>> 
>>> I am really thankful for reading up to this point.
>>> Truly hope someone can shed some light on this.
>>> 
>> 
>> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or upcoming 0.20.205
>> with security + append) you would still see this behaviour because you are hitting
>> 'user limits' where the CS will not allow a single user to take more than the queue
>> 'configured' capacity (12 slots here). You will need more than one user in the 'orange'
>> queue to go over the queue's capacity. This is to prevent a single user from hogging
>> the system's resources.
>> 
>> If you really want one user to acquire more resources in 'orange' queue, you need
>> to tweak mapred.capacity-scheduler.queue.orange.user-limit-factor.
>> 
>> More details here:
>> http://hadoop.apache.org/common/docs/stable/capacity_scheduler.html
>> 
>> Arun
>> 
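
The user-limit behaviour described above can be illustrated with a little arithmetic. This is only a rough sketch, not the CapacityScheduler's actual code: the function name single_user_slot_limit is made up for illustration, and it assumes a single active user whose slots are capped at (queue capacity x user-limit-factor), never exceeding the queue's maximum capacity. The numbers are the ones from the 'orange' queue in this thread (12 configured slots, 16 maximum), and it assumes the factor is 1 when not set explicitly.

```python
def single_user_slot_limit(queue_capacity_slots, user_limit_factor=1.0,
                           max_capacity_slots=None):
    """Rough model of how many slots one user may hold in a queue.

    Hypothetical helper for illustration only; the real scheduler also
    considers other users, minimum user limits, and cluster state.
    """
    # A single user may take up to capacity * user-limit-factor slots.
    limit = int(queue_capacity_slots * user_limit_factor)
    # The queue's maximum capacity (if set) is a hard ceiling.
    if max_capacity_slots is not None:
        limit = min(limit, max_capacity_slots)
    return limit


# 'orange' queue from the thread: 12 configured slots, 16 maximum.
print(single_user_slot_limit(12, 1.0, 16))  # factor of 1: stuck at 12 slots
print(single_user_slot_limit(12, 2.0, 16))  # factor of 2: capped by the maximum, 16
```

Under these assumptions, a factor of 1 reproduces the "Used capacity: 12 (100.0% of Capacity)" reading from the web UI, while a larger value of mapred.capacity-scheduler.queue.orange.user-limit-factor would let the single user grow into the remaining 4 slots.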

