hadoop-mapreduce-user mailing list archives

From Laxman Ch <laxman....@gmail.com>
Subject Re: Concurrency control
Date Thu, 01 Oct 2015 09:33:40 GMT
Hi Naga,

Like most app-level configurations, the admin can configure the defaults,
which a user may want to override at the application level.

If this is at queue level, then all applications in a queue will have the
same limits. But the applications in a queue may not all have the same SLA,
and we may need to restrict them differently. That again requires splitting
queues further, which I feel is more overhead.
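To make the pattern concrete: an admin-side default plus a per-application
override might look like the sketch below. The property names here are
purely hypothetical (YARN has no such per-app concurrency limit today); a
user could then override the admin default at submission time with -D.

    <!-- Hypothetical admin default in yarn-site.xml; property names invented
         for illustration only, no such per-app limit exists in YARN. -->
    <property>
      <name>yarn.app.concurrent-vcores.limit</name>
      <value>50</value>
    </property>
    <property>
      <name>yarn.app.concurrent-memory-mb.limit</name>
      <value>76800</value> <!-- 75 GB -->
    </property>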


On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
garlanaganarasimha@huawei.com> wrote:

> Hi Laxman,
>
> I understand it would ideally be better if this were available at the
> application level, but then each user is expected to ensure that he gives
> the right configuration, within the limits of max capacity.
> And what if a user submits some app *(a query-execution kind of app)*
> without this setting, *or* doesn't know how much it should take? In
> general, users specifying resources for containers is itself a difficult
> task.
> And it might not be right to expect the admin to do it for each
> application in the queue either. Basically, governing will be difficult if
> it's not enforced from the queue/scheduler side.
>
> + Naga
>
> ------------------------------
> *From:* Laxman Ch [laxman.lux@gmail.com]
> *Sent:* Tuesday, September 29, 2015 16:52
>
> *To:* user@hadoop.apache.org
> *Subject:* Re: Concurrency control
>
> IMO, it's better to have an application-level configuration than a
> scheduler/queue-level configuration.
> A queue-level configuration will restrict every single application
> that runs in that queue.
> But we may want to configure these limits for only some set of jobs, and
> the limits can also differ for every application.
>
> On the FairOrdering policy: the order of jobs can't be enforced, as these
> are ad-hoc jobs, scheduled and owned independently by different teams.
>
> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
> garlanaganarasimha@huawei.com> wrote:
>
>> Hi Laxman,
>>
>> What I meant was: suppose we support
>> yarn.scheduler.capacity.<queue-path>.app-limit-factor and configure it to
>> .25; then a single app should not take more than 25% of the resources in
>> the queue. This would be a more generic configuration which can be
>> enforced by the admin, rather than expecting the user to configure it per app.
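For illustration, the proposed property might be configured in
capacity-scheduler.xml along these lines (queue name hypothetical; the
property itself does not exist in any released Hadoop version):

    <!-- Sketch of the *proposed* app-limit-factor, by analogy with
         user-limit-factor; not an existing YARN property. -->
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.app-limit-factor</name>
      <value>0.25</value>
      <!-- a single app may take at most 25% of the queue's resources -->
    </property>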
>>
>> And as for Rohith's suggestion of the FairOrdering policy, I think it
>> should solve the problem, provided the app which was submitted first has
>> not already hogged all the queue's resources.
>>
>> + Naga
>>
>> ------------------------------
>> *From:* Laxman Ch [laxman.lux@gmail.com]
>> *Sent:* Tuesday, September 29, 2015 16:03
>>
>> *To:* user@hadoop.apache.org
>> *Subject:* Re: Concurrency control
>>
>> Thanks Rohith, Naga and Lloyd for the responses.
>>
>> > I think Laxman should also tell us more about which application type he
>> is running.
>>
>> We run MR jobs mostly with the default core/memory allocation (1 vcore,
>> 1.5 GB).
>> Our problem is more about controlling the *resources used
>> simultaneously by all running containers* at any given point of time, per
>> application.
>>
>> Example:
>> 1. App1 and App2 are two MR apps.
>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>> 3. Each App1 task takes 8 hrs for completion
>> 4. Each App2 task takes 5 mins for completion
>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits a long time
>> for App1 tasks to release resources.
>> 7. We can't have preemption enabled as we don't want to lose the work
>> completed so far by App1.
>> 8. We can't have separate queues for App1 and App2, as we have lots of
>> jobs like this and it would explode the number of queues.
>> 9. We use CapacityScheduler.
>>
>> In this scenario, if I can limit App1's concurrent usage to
>> 50 vcores and 75 GB, then App1 may take longer to finish, but there won't
>> be any starvation for App2 (or other jobs running in the same queue).
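As an aside: for MR jobs specifically, Hadoop 2.7+ added per-job caps on
simultaneously running tasks (MAPREDUCE-5583), which approximates this kind
of limit, though it is not available in the 2.6.0 deployment discussed here.
With the 1 vcore / 1.5 GB tasks above, capping at 50 running maps keeps the
job within roughly 50 vcores and 75 GB:

    <!-- Per-job settings, Hadoop 2.7+ (MAPREDUCE-5583); a value of 0 means
         no limit. -->
    <property>
      <name>mapreduce.job.running.map.limit</name>
      <value>50</value>
    </property>
    <property>
      <name>mapreduce.job.running.reduce.limit</name>
      <value>50</value>
    </property>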
>>
>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>
>> @Naga, I couldn't think through the expected behavior of
>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>> I will revert on this.
>>
>> On 29 September 2015 at 14:57, Namikaze Minato <lloydsensei@gmail.com>
>> wrote:
>>
>>> I think Laxman should also tell us more about which application type
>>> he is running. The normal use case of MAPREDUCE should be working as
>>> intended, but if he has, for example, one MAP using 100 vcores, then the
>>> second map will have to wait until the app completes. The same would
>>> happen if the applications running were Spark, as Spark does not free
>>> what is allocated to it.
>>>
>>> Regards,
>>> LLoyd
>>>
>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>> <garlanaganarasimha@huawei.com> wrote:
>>> > Thanks Rohith for your thoughts,
>>> > but I think this configuration might not completely solve the
>>> > scenario mentioned by Laxman: if there is some time gap between the
>>> > first and the second app, then even though we have fairness or priority
>>> > set for apps, starvation will be there.
>>> > IIUC we can think of an approach wherein we have something similar to
>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor", which can
>>> > provide functionality like
>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple
>>> > of the queue capacity which can be configured to allow a single app to
>>> > acquire more resources. Thoughts?
>>> >
>>> > + Naga
>>> >
>>> >
>>> >
>>> > ________________________________
>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>> > Sent: Tuesday, September 29, 2015 14:07
>>> > To: user@hadoop.apache.org
>>> > Subject: RE: Concurrency control
>>> >
>>> > Hi Laxman,
>>> >
>>> >
>>> >
>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides
>>> > configuration for the ordering policy. By configuring
>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve your
>>> > goal, i.e. avoiding starvation of applications for resources.
>>> >
>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>> >
>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>> > FairScheduler FairSharePolicy); generally, processes with lesser current
>>> > usage are scheduled first. If sizeBasedWeight is set to true, then an
>>> > application with high demand may be prioritized ahead of an application
>>> > with less usage. This is to offset the tendency to favor small apps,
>>> > which could result in starvation for large apps if many small ones enter
>>> > and leave the queue continuously (optional, default false).
>>> >
>>> > Community Issue Id :  https://issues.apache.org/jira/browse/YARN-3463
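For reference, in 2.8 the ordering policy is expected to be configurable per
queue in capacity-scheduler.xml along these lines (queue name hypothetical;
the second property enables the sizeBasedWeight behavior described above):

    <property>
      <name>yarn.scheduler.capacity.root.adhoc.ordering-policy</name>
      <value>fair</value>
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.ordering-policy.fair.enable-size-based-weight</name>
      <value>true</value>
    </property>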
>>> >
>>> >
>>> >
>>> > Thanks & Regards
>>> >
>>> > Rohith Sharma K S
>>> >
>>> >
>>> >
>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>> > Sent: 29 September 2015 13:36
>>> > To: user@hadoop.apache.org
>>> > Subject: Re: Concurrency control
>>> >
>>> >
>>> >
>>> > Bouncing this thread again. Any other thoughts please?
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 23:21, Laxman Ch <laxman.lux@gmail.com> wrote:
>>> >
>>> > No, Naga. That won't help.
>>> >
>>> >
>>> >
>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores)
>>> > with the same user in the same queue (capacity = 100 vcores). In this
>>> > scenario, if app1 triggers first, occupies all the slots, and runs long,
>>> > then app2 will starve for a long time.
>>> >
>>> >
>>> >
>>> > Let me reiterate my problem statement: I want "to control the amount of
>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>> > <naganarasimha.gr@gmail.com> wrote:
>>> >
>>> > Hi Laxman,
>>> >
>>> > For the example you have stated, maybe we can do the following things:
>>> >
>>> > 1. Create/modify the queue with capacity and max capacity set such that
>>> > it's equivalent to 100 vcores. As there is then no elasticity, a given
>>> > application will not use resources beyond the configured capacity.
>>> >
>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
>>> > so that each active user is assured the minimum guaranteed resources.
>>> > The default value of 100 implies no user limits are imposed.
>>> >
>>> >
>>> >
>>> > Additionally, we can think of
>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>> > which will enforce strict CPU usage for a given container if required
>>> > (a combined sketch follows below).
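Putting those suggestions together, a minimal sketch, with a hypothetical
queue name and illustrative numbers:

    <!-- capacity-scheduler.xml: cap the queue with no elasticity -->
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.capacity</name>
      <value>10</value> <!-- percent of parent; sized to equal ~100 vcores -->
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.maximum-capacity</name>
      <value>10</value> <!-- equal to capacity, so no elasticity -->
    </property>
    <property>
      <name>yarn.scheduler.capacity.root.adhoc.minimum-user-limit-percent</name>
      <value>50</value> <!-- each of two active users is assured half -->
    </property>

    <!-- yarn-site.xml: strict per-container CPU enforcement via cgroups -->
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
      <value>true</value>
    </property>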
>>> >
>>> >
>>> >
>>> > + Naga
>>> >
>>> >
>>> >
>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <laxman.lux@gmail.com>
>>> wrote:
>>> >
>>> > Yes, I'm already using cgroups. Cgroups help in controlling resources
>>> > at the container level. But my requirement is more about controlling the
>>> > concurrent resource usage of an application at the whole-cluster level.
>>> >
>>> >
>>> >
>>> > And yes, we do configure queues properly. But that won't help.
>>> >
>>> >
>>> >
>>> > For example, I have an application with a requirement of 1000 vcores,
>>> > but I want to prevent it from going beyond 100 vcores at any point in
>>> > time in the cluster/queue. This makes that application run longer, even
>>> > when my cluster is free, but I will be able to meet the guaranteed SLAs
>>> > of other applications.
>>> >
>>> >
>>> >
>>> > Hope this helps to understand my question.
>>> >
>>> >
>>> >
>>> > And thanks, Narasimha, for the quick response.
>>> >
>>> >
>>> >
>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>> > <naganarasimha.gr@gmail.com> wrote:
>>> >
>>> > Hi Laxman,
>>> >
>>> > Yes, if cgroups are enabled and
>>> > "yarn.scheduler.capacity.resource-calculator" is configured to
>>> > DominantResourceCalculator, then CPU and memory can be controlled.
>>> >
>>> > Please further refer to the official documentation:
>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
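A sketch of that calculator setting in capacity-scheduler.xml:

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
      <!-- the default, DefaultResourceCalculator, considers memory only -->
    </property>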
>>> >
>>> >
>>> >
>>> > But maybe if you say more about the problem, we can suggest an ideal
>>> > configuration. It seems like the capacity configuration and splitting of
>>> > the queue are not done right, or you might look at the Fair Scheduler if
>>> > you want more fairness in container allocation across different apps.
>>> >
>>> >
>>> >
>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <laxman.lux@gmail.com>
>>> wrote:
>>> >
>>> > Hi,
>>> >
>>> >
>>> >
>>> > In YARN, do we have any way to control the amount of resources (vcores,
>>> > memory) used by an application SIMULTANEOUSLY?
>>> >
>>> >
>>> >
>>> > - In my cluster, I noticed some large, long-running MR apps occupy all
>>> > the slots of the queue, blocking other apps from getting started.
>>> >
>>> > - I'm using the Capacity Scheduler (with hierarchical queues and
>>> > preemption disabled).
>>> >
>>> > - Using Hadoop version 2.6.0.
>>> >
>>> > - Did some googling around this and went through the configuration docs,
>>> > but I'm not able to find anything that matches my requirement.
>>> >
>>> >
>>> >
>>> > If needed, I can provide more details on the usecase and problem.
>>> >
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>
>
>
> --
> Thanks,
> Laxman
>



-- 
Thanks,
Laxman
