hadoop-user mailing list archives

From Laxman Ch <laxman....@gmail.com>
Subject Re: Concurrency control
Date Fri, 02 Oct 2015 16:56:41 GMT
Thanks, Harsh, this is perfect and exactly what I am looking for. Most of our
applications are MR, so this should be sufficient for us. I will give these
configurations a try and post my findings here. Thanks again.
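
For reference, a minimal sketch of what such a per-job setting could look
like. It assumes the two properties introduced by MAPREDUCE-5583 are
mapreduce.job.running.map.limit and mapreduce.job.running.reduce.limit (worth
checking against mapred-default.xml for the Hadoop release in use; the 2.6.0
cluster in this thread may predate the change), and that each task uses the
default 1 vcore and 1.5 GB mentioned in this thread. The helper names are
hypothetical; the script only builds the -D options and estimates the
resulting footprint, it does not submit anything.

```python
# Sketch only: the property names below are assumed from MAPREDUCE-5583 and
# should be verified against mapred-default.xml for your Hadoop version.
MAP_LIMIT_KEY = "mapreduce.job.running.map.limit"
REDUCE_LIMIT_KEY = "mapreduce.job.running.reduce.limit"


def running_limit_args(max_maps, max_reduces):
    """Build -D generic options that cap simultaneously running tasks."""
    return [
        "-D", f"{MAP_LIMIT_KEY}={max_maps}",
        "-D", f"{REDUCE_LIMIT_KEY}={max_reduces}",
    ]


def concurrent_footprint(max_tasks, vcores_per_task=1, gb_per_task=1.5):
    """Upper bound on (vcores, GB) held by max_tasks running tasks."""
    return max_tasks * vcores_per_task, max_tasks * gb_per_task


if __name__ == "__main__":
    # Options to append to a job submission command (hypothetical limits).
    print(" ".join(running_limit_args(50, 10)))
    # Footprint of 50 concurrent tasks at the default container size.
    print(concurrent_footprint(50))
```

Note that the limits count tasks, not vcores or memory, and apply to maps and
reduces separately, so the combined footprint of a job can be up to the sum of
both limits (plus the application master).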

Thanks Naga, Rohith & Lloyd for your suggestions and discussion.

On 2 October 2015 at 07:37, Harsh J <harsh@cloudera.com> wrote:

> If all your Apps are MR, then what you are looking for is MAPREDUCE-5583
> (it can be set per-job).
>
> On Thu, Oct 1, 2015 at 3:03 PM Laxman Ch <laxman.lux@gmail.com> wrote:
>
>> Hi Naga,
>>
>> Like most app-level configurations, the admin can configure the
>> defaults, which users may want to override at the application level.
>>
>> If this is at queue level, then all applications in a queue will have the
>> same limits. But our applications in a queue may not all have the same SLA,
>> and we may need to restrict them differently. That would again require
>> splitting queues further, which I feel is more overhead.
>>
>>
>> On 30 September 2015 at 09:00, Naganarasimha G R (Naga) <
>> garlanaganarasimha@huawei.com> wrote:
>>
>>> Hi Laxman,
>>>
>>> Ideally, I understand it would be better if it were available at the
>>> application level, but then each user is expected to ensure that he gives
>>> the right configuration, within the limits of max capacity.
>>> And what if a user submits some app *(kind of a query execution app)*
>>> without this setting, *or* he doesn't know how much it should take? In
>>> general, having users specify resources for containers is itself a
>>> difficult task.
>>> And it might not be right to expect that the admin will do it for each
>>> application in the queue either. Basically, governing will be difficult if
>>> it's not enforced from the queue/scheduler side.
>>>
>>> + Naga
>>>
>>> ------------------------------
>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>> *Sent:* Tuesday, September 29, 2015 16:52
>>>
>>> *To:* user@hadoop.apache.org
>>> *Subject:* Re: Concurrency control
>>>
>>> IMO, it's better to have an application-level configuration than a
>>> scheduler/queue-level configuration.
>>> A queue-level configuration will restrict every single
>>> application that runs in that queue.
>>> But we may want to configure these limits for only some set of jobs, and
>>> the limits can also be different for every application.
>>>
>>> As for the FairOrdering policy, the order of jobs can't be enforced, as
>>> these are ad hoc jobs scheduled/owned independently by different teams.
>>>
>>> On 29 September 2015 at 16:43, Naganarasimha G R (Naga) <
>>> garlanaganarasimha@huawei.com> wrote:
>>>
>>>> Hi Laxman,
>>>>
>>>> What I meant was: suppose we support
>>>> yarn.scheduler.capacity.<queue-path>.app-limit-factor and configure it to
>>>> .25; then a single app should not take more than 25% of the resources in
>>>> the queue.
>>>> This would be a more generic configuration that can be enforced by the
>>>> admin, rather than expecting it to be configured per app by the user.
>>>>
>>>> And for Rohith's suggestion of the FairOrdering policy, I think it should
>>>> solve the problem if the app that is submitted first has not already hogged
>>>> all the queue's resources.
>>>>
>>>> + Naga
>>>>
>>>> ------------------------------
>>>> *From:* Laxman Ch [laxman.lux@gmail.com]
>>>> *Sent:* Tuesday, September 29, 2015 16:03
>>>>
>>>> *To:* user@hadoop.apache.org
>>>> *Subject:* Re: Concurrency control
>>>>
>>>> Thanks Rohith, Naga and Lloyd for the responses.
>>>>
>>>> > I think Laxman should also tell us more about which application type he
>>>> > is running.
>>>>
>>>> We run MR jobs, mostly with the default core/memory allocation (1 vcore,
>>>> 1.5GB).
>>>> Our problem is more about controlling the *resources used
>>>> simultaneously by all running containers* at any given point in time,
>>>> per application.
>>>>
>>>> Example:
>>>> 1. App1 and App2 are two MR apps.
>>>> 2. App1 and App2 belong to same queue (capacity: 100 vcores, 150 GB).
>>>> 3. Each App1 task takes 8 hrs to complete.
>>>> 4. Each App2 task takes 5 mins to complete.
>>>> 5. App1 is triggered at time "t1" and uses all the slots of the queue.
>>>> 6. App2 is triggered at time "t2" (where t2 > t1) and waits a long time for
>>>> App1 tasks to release the resources.
>>>> 7. We can't have preemption enabled as we don't want to lose the work
>>>> completed so far by App1.
>>>> 8. We can't have separate queues for App1 and App2 as we have lots of
>>>> jobs like this and it will explode the number of queues.
>>>> 9. We use CapacityScheduler.
>>>>
>>>> In this scenario, if I can limit App1's concurrent usage to
>>>> 50 vcores and 75GB, then App1 may take longer to finish but there won't
>>>> be any starvation for App2 (and other jobs running in the same queue).
>>>>
>>>> @Rohith, the FairOrdering policy may not solve this starvation problem.
>>>>
>>>> @Naga, I couldn't think through the expected behavior of
>>>> "yarn.scheduler.capacity.<queue-path>.app-limit-factor".
>>>> I will get back to you on this.
>>>>
>>>> On 29 September 2015 at 14:57, Namikaze Minato <lloydsensei@gmail.com>
>>>> wrote:
>>>>
>>>>> I think Laxman should also tell us more about which application type
>>>>> he is running. The normal use case of MAPREDUCE should be working as
>>>>> intended, but if he has, for example, one MAP using 100 vcores, then the
>>>>> second map will have to wait until the app completes. The same would
>>>>> happen if the applications running were Spark, as Spark does not free
>>>>> what is allocated to it.
>>>>>
>>>>> Regards,
>>>>> LLoyd
>>>>>
>>>>> On 29 September 2015 at 11:22, Naganarasimha G R (Naga)
>>>>> <garlanaganarasimha@huawei.com> wrote:
>>>>> > Thanks Rohith for your thoughts,
>>>>> >       But I think this configuration might not completely solve the
>>>>> > scenario mentioned by Laxman. If there is some time gap between the first
>>>>> > and the second app, then even though we have fairness or priority set for
>>>>> > apps, starvation will be there.
>>>>> > IIUC we can think of an approach where we have something similar to
>>>>> > "yarn.scheduler.capacity.<queue-path>.user-limit-factor", providing
>>>>> > functionality like
>>>>> > "yarn.scheduler.capacity.<queue-path>.app-limit-factor": the multiple of
>>>>> > the queue capacity which can be configured to allow a single app to
>>>>> > acquire more resources. Thoughts?
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > ________________________________
>>>>> > From: Rohith Sharma K S [rohithsharmaks@huawei.com]
>>>>> > Sent: Tuesday, September 29, 2015 14:07
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: RE: Concurrency control
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In Hadoop 2.8 (not released yet), CapacityScheduler provides a
>>>>> > configuration for the ordering policy. By configuring
>>>>> > FAIR_ORDERING_POLICY in CS, you should probably be able to achieve your
>>>>> > goal, i.e. avoiding starvation of applications for resources.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > org.apache.hadoop.yarn.server.resourcemanager.scheduler.policy.FairOrderingPolicy<S>
>>>>> >
>>>>> > An OrderingPolicy which orders SchedulableEntities for fairness (see
>>>>> > FairScheduler FairSharePolicy), generally, processes with lesser usage are
>>>>> > lesser. If sizedBasedWeight is set to true then an application with high
>>>>> > demand may be prioritized ahead of an application with less usage. This is
>>>>> > to offset the tendency to favor small apps, which could result in starvation
>>>>> > for large apps if many small ones enter and leave the queue continuously
>>>>> > (optional, default false)
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > Community Issue Id: https://issues.apache.org/jira/browse/YARN-3463
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks & Regards
>>>>> >
>>>>> > Rohith Sharma K S
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Laxman Ch [mailto:laxman.lux@gmail.com]
>>>>> > Sent: 29 September 2015 13:36
>>>>> > To: user@hadoop.apache.org
>>>>> > Subject: Re: Concurrency control
>>>>> >
>>>>> >
>>>>> >
>>>>> > Bouncing this thread again. Any other thoughts please?
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 23:21, Laxman Ch <laxman.lux@gmail.com> wrote:
>>>>> >
>>>>> > No Naga. That won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > I am running two applications (app1 - 100 vcores, app2 - 100 vcores) as
>>>>> > the same user, in the same queue (capacity=100vcores). In this scenario,
>>>>> > if app1 is triggered first, occupies all the slots and runs long, then
>>>>> > app2 will starve longer.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Let me reiterate my problem statement. I wanted "to control the amount of
>>>>> > resources (vcores, memory) used by an application SIMULTANEOUSLY".
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 22:28, Naganarasimha Garla
>>>>> > <naganarasimha.gr@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > For the example you have stated, maybe we can do the following things:
>>>>> >
>>>>> > 1. Create/modify the queue with capacity and max capacity set such that it
>>>>> > is equivalent to 100 vcores. As there is then no elasticity, a given
>>>>> > application will not use resources beyond the configured capacity.
>>>>> >
>>>>> > 2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent so
>>>>> > that each active user is assured of the minimum guaranteed resources. The
>>>>> > default value of 100 implies no user limits are imposed.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Additionally we can think of
>>>>> > "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
>>>>> > which will enforce strict CPU usage for a given container if required.
>>>>> >
>>>>> >
>>>>> >
>>>>> > + Naga
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:42 PM, Laxman Ch <laxman.lux@gmail.com> wrote:
>>>>> >
>>>>> > Yes. I'm already using cgroups. Cgroups help in controlling resources
>>>>> > at the container level. But my requirement is more about controlling the
>>>>> > concurrent resource usage of an application at the whole-cluster level.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And yes, we do configure queues properly. But that won't help.
>>>>> >
>>>>> >
>>>>> >
>>>>> > For example, I have an application with a requirement of 1000 vcores. But
>>>>> > I want to keep this application from going beyond 100 vcores at any point
>>>>> > in time in the cluster/queue. This makes that application run longer even
>>>>> > when my cluster is free, but I will be able to meet the guaranteed SLAs of
>>>>> > other applications.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Hope this helps to understand my question.
>>>>> >
>>>>> >
>>>>> >
>>>>> > And thanks Narasimha for quick response.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 17 September 2015 at 16:17, Naganarasimha Garla
>>>>> > <naganarasimha.gr@gmail.com> wrote:
>>>>> >
>>>>> > Hi Laxman,
>>>>> >
>>>>> > Yes, if cgroups are enabled and "yarn.scheduler.capacity.resource-calculator"
>>>>> > is configured to DominantResourceCalculator, then CPU and memory can be
>>>>> > controlled.
>>>>> >
>>>>> > Please refer to the official documentation for further details:
>>>>> > http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
>>>>> >
>>>>> >
>>>>> >
>>>>> > But maybe if you say more about the problem, we can suggest an ideal
>>>>> > configuration. It seems the capacity configuration and splitting of the
>>>>> > queue are not done right; or you might look at the Fair Scheduler if you
>>>>> > want more fairness in container allocation across different apps.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Sep 17, 2015 at 4:10 PM, Laxman Ch <laxman.lux@gmail.com> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> >
>>>>> >
>>>>> > In YARN, do we have any way to control the amount of resources (vcores,
>>>>> > memory) used by an application SIMULTANEOUSLY?
>>>>> >
>>>>> >
>>>>> >
>>>>> > - In my cluster, I noticed that a large, long-running MR app occupied all
>>>>> > the slots of the queue, blocking other apps from getting started.
>>>>> >
>>>>> > - I'm using the Capacity Scheduler (with hierarchical queues and
>>>>> > preemption disabled).
>>>>> >
>>>>> > - Using Hadoop version 2.6.0.
>>>>> >
>>>>> > - I did some googling around this and went through the configuration docs,
>>>>> > but I'm not able to find anything that matches my requirement.
>>>>> >
>>>>> >
>>>>> >
>>>>> > If needed, I can provide more details on the usecase and problem.
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> >
>>>>> > Thanks,
>>>>> > Laxman
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Laxman
>>>>
>>>
>>>
>>>
>>> --
>>> Thanks,
>>> Laxman
>>>
>>
>>
>>
>> --
>> Thanks,
>> Laxman
>>
>
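
As a rough worked version of the App1/App2 example quoted above, the sketch
below computes how many vcores remain for other applications in a queue when
one application's concurrent usage is capped. The function name and the notion
of a per-app cap are illustrative assumptions, not a YARN API; the last call
models the app-limit-factor of 0.25 that was only proposed in this discussion.

```python
def vcores_left_for_others(queue_vcores, app_demand, app_cap=None):
    """Vcores still free in the queue once one app holds all it is allowed.

    queue_vcores: total vcores of the queue (no elasticity assumed).
    app_demand:   vcores the app would use if nothing limited it.
    app_cap:      optional ceiling on the app's concurrent vcores.
    """
    in_use = min(app_demand, queue_vcores)
    if app_cap is not None:
        in_use = min(in_use, app_cap)
    return queue_vcores - in_use


if __name__ == "__main__":
    # No cap: App1 fills the 100-vcore queue, App2 has nothing to start on.
    print(vcores_left_for_others(100, 100))
    # App1 capped at 50 vcores: App2 can start on the other 50 right away.
    print(vcores_left_for_others(100, 100, app_cap=50))
    # Hypothetical app-limit-factor of 0.25 applied to a 1000-vcore demand.
    print(vcores_left_for_others(100, 1000, app_cap=int(0.25 * 100)))
```

The trade-off is the one described in the thread: the capped application runs
longer even when the queue is otherwise idle, in exchange for other
applications not having to wait for its long tasks to finish.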


-- 
Thanks,
Laxman
