hadoop-common-user mailing list archives

From Shouguo Li <the1plum...@gmail.com>
Subject Re: capacity scheduler
Date Thu, 20 Oct 2011 00:34:53 GMT
good thread. cleared up some of my confusion about the scheduler as well, ;)
thx!


On Wed, Oct 19, 2011 at 5:03 PM, patrick sang <silvianhadoop@gmail.com> wrote:

> >> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or
> upcoming 0.20.205 with security + append) you would still see this behaviour
> because you are hitting 'user limits' where the CS will not allow a single
> user to take more than the queue 'configured' capacity (12 slots here). You
> will need more than one user in the 'orange' queue to go over the queue's
> capacity. This is to prevent a single user from hogging the system's
> resources.
>
> >> If you really want one user to acquire more resources in 'orange' queue,
> you need to tweak mapred.capacity-scheduler.queue.orange.user-limit-factor.
>
> Arun, you're the man!!!
> It solves my issue exactly.
> Submitting jobs as another user allowed the queue to burst past its capacity.
> In my setup we currently have only one user for everything, and
> user-limit-factor definitely does work!!
>
> -------------
> Map tasks
> Capacity: 12 slots
> Maximum capacity: 32 slots
> Used capacity: 16 (133.3% of Capacity) <------
> Running tasks: 16
> Active users:
> User 'apps': 16 (100.0% of used capacity)
> -------------
>
> This is the configuration for orange queue.
> <!-- Queue: orange -->
>  <property>
>    <name>mapred.capacity-scheduler.queue.orange.capacity</name>
>    <value>40</value>
>  </property>
>  <property>
>    <name>mapred.capacity-scheduler.queue.orange.maximum-capacity</name>
>    <value>100</value>
>  </property>
>  <property>
>    <name>mapred.capacity-scheduler.queue.orange.supports-priority</name>
>    <value>true</value>
>  </property>
>  <property>
>    <name>mapred.capacity-scheduler.queue.orange.user-limit-factor</name>
>    <value>2</value>
>  </property>
>
> ---------------------------------------------------
>
> CDH3u0 does support CS, but there is one interesting and sad part
> that I want to mention here.
>
> This is the link that I followed from the CDH web site:
>
>
> http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0/capacity_scheduler.html
>
> It doesn't mention user-limit-factor anywhere on the page.
>
>
> >>  this way you gain better understanding of the system and we, the
> project, will hopefully gain another valuable contributor... hint, hint. ;-)
> ;-).. got the hint.
> As a Unix sysadmin I am pretty much at 0 on Java coding... lol, but not 0 in PHP/Perl.
> What can I do to contribute, and how can I start?
>
> Cheers,
> -P
>
>
>
> On Sun, Oct 16, 2011 at 8:46 AM, Arun C Murthy <acm@hortonworks.com>
> wrote:
> > You are welcome. *smile*
> >
> > One of the greatest advantages of open-src s/w is that you can look at
> the code while scratching your head in the corner - this way you gain better
> understanding of the system and we, the project, will hopefully gain another
> valuable contributor... hint, hint. ;-)
> >
> > Good luck.
> >
> > Arun
> >
> > On Oct 16, 2011, at 1:27 AM, patrick sang wrote:
> >
> >> Hi Arun,
> >>
> >> Your answer sheds extra bright light while I am scratching my head in the
> corner.
> >> A million thanks for the answer and the document. I will post back the result.
> >>
> >> Thanks again,
> >> P
> >>
> >> On Sat, Oct 15, 2011 at 10:32 PM, Arun C Murthy <acm@hortonworks.com>
> wrote:
> >>>
> >>> Hi Patrick,
> >>>
> >>> It's hard to diagnose CDH since I don't know what patch-sets they have
> for the CapacityScheduler - afaik they only support FairScheduler, but that
> might have changed.
> >>>
> >>> On Oct 15, 2011, at 4:45 PM, patrick sang wrote:
> >>>
> >>>> 4. from webUI, scheduling  information of orange queue.
> >>>>
> >>>> It said "Used capacity: 12 (100.0% of Capacity)"
> >>>> while next line said "Maximum capacity: 16 slots"
> >>>> So what's going on with the other 4 slots? Why are they not being used?
> >>>>
> >>>> Isn't the capacity scheduler supposed to start using extra slots until it
> >>>> hits the max capacity?
> >>>> (from the variable of
> >>>> mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity)
> >>>> (there are no other jobs at all in the cluster)
> >>>>
> >>>> I am really thankful for reading up to this point.
> >>>> Truly hope someone can shed some light on this.
> >>>>
> >>>
> >>> However, if you were using Apache Hadoop 0.20.203 or 0.20.204 (or
> upcoming 0.20.205 with security + append) you would still see this behaviour
> because you are hitting 'user limits' where the CS will not allow a single
> user to take more than the queue 'configured' capacity (12 slots here). You
> will need more than one user in the 'orange' queue  to go over the queue's
> capacity. This is to prevent a single user from hogging the system's
> resources.
> >>>
> >>> If you really want one user to acquire more resources in 'orange'
> queue, you need to tweak
> mapred.capacity-scheduler.queue.orange.user-limit-factor.
> >>>
> >>> More details here:
> >>> http://hadoop.apache.org/common/docs/stable/capacity_scheduler.html
> >>>
> >>> Arun
> >>>
> >
> >
>
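
The user-limit behaviour discussed in the quoted thread can be sketched as simple arithmetic. This is only an illustration, assuming the single-user cap is the queue's configured capacity (in slots) multiplied by user-limit-factor and then bounded by the queue's maximum capacity. The function name `single_user_slot_cap` is hypothetical, and the real CapacityScheduler also takes other settings and the number of active users into account.

```python
def single_user_slot_cap(capacity_slots, user_limit_factor, max_capacity_slots):
    """Rough upper bound on the slots one user may hold in a queue.

    Assumption (not taken from the scheduler source): the cap is
    capacity * user-limit-factor, never exceeding the queue's maximum capacity.
    """
    scaled = int(capacity_slots * user_limit_factor)
    return min(scaled, max_capacity_slots)


# Numbers from the thread: 'orange' has 12 slots of capacity, 32 slots maximum.
default_cap = single_user_slot_cap(12, 1, 32)  # user-limit-factor left at 1
tuned_cap = single_user_slot_cap(12, 2, 32)    # user-limit-factor set to 2
print(default_cap, tuned_cap)
```

Under these assumptions a lone user is held to 12 slots with a factor of 1 and may reach 24 with a factor of 2, which is consistent with user 'apps' running 16 tasks in the report above.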
