aurora-dev mailing list archives

From: Rick Mangi <r...@chartbeat.com>
Subject: Re: schedule task instances spreading them based on a host attribute.
Date: Thu, 30 Mar 2017 19:41:53 GMT
We're using cgroups, if that's what you're asking :)


> On Mar 30, 2017, at 3:21 PM, Zameer Manji <zmanji@apache.org> wrote:
> 
> What kind of isolation features are you using?
> 
> I would like to probe a little deeper here, because this is not an ideal
> rationale for changing the placement algorithm. Ideally, Mesos and Linux
> provide the right isolation technology to make this a non-problem.
> 
> I understand the push for job anti-affinity (i.e. don't put too many Kafka
> workers in general on one host), but I would imagine that would be for
> reliability reasons, not for performance reasons.
> 
> On Thu, Mar 30, 2017 at 12:16 PM, Rick Mangi <rick@chartbeat.com> wrote:
> 
>> Performance and utilization, mostly. The Kafka consumers are CPU-bound (and
>> sometimes network-bound), and the rest of our jobs are mostly memory-bound.
>> We've found that if too many consumers wind up on the same EC2 instance,
>> they don't perform as well. It's hard to prove this, but the gut feeling is
>> pretty strong.
>> 
>> 
>>> On Mar 30, 2017, at 2:35 PM, Zameer Manji <zmanji@apache.org> wrote:
>>> 
>>> Rick,
>>> 
>>> Can you share why it would be nice to spread out these different jobs on
>>> different hosts? Is it for reliability, performance, utilization, etc?
>>> 
>>> On Thu, Mar 30, 2017 at 11:31 AM, Rick Mangi <rick@chartbeat.com> wrote:
>>> 
>>>> Yeah, we have a dozen or so Kafka consumer jobs running in our cluster,
>>>> each with about 40 instances.
>>>> 
>>>> 
>>>>> On Mar 30, 2017, at 2:06 PM, David McLaughlin <david@dmclaughlin.com> wrote:
>>>>> 
>>>>> There is absolutely a need for custom hook points in the scheduler
>>>>> (injecting default constraints into running tasks, for example). I don't
>>>>> think users should be asked to write custom scheduling algorithms to
>>>>> solve the problems in this thread, though. There are also huge downsides
>>>>> to exposing the internals of scheduling as part of a plugin API.
>>>>> 
>>>>> Out of curiosity, do your Kafka consumers span multiple jobs? Otherwise,
>>>>> host constraints solve that problem, right?
>>>>> 
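For reference, a minimal sketch of the per-job host limit constraint being discussed, assuming the standard Aurora Python (Pystachio) DSL; the cluster, role, and task names below are hypothetical and the task definition is elided:

  # Cap instances of this job at one per distinct 'host' attribute value.
  # 'consumer_task' is assumed to be a Task defined elsewhere in the config.
  kafka_consumer = Service(
    cluster = 'us-east-1',        # hypothetical cluster name
    role = 'chartbeat',           # hypothetical role
    environment = 'prod',
    name = 'kafka_consumer',
    task = consumer_task,
    instances = 40,
    constraints = {'host': 'limit:1'},
  )

  jobs = [kafka_consumer]

A limit constraint like this is scoped to a single job, which is why it does not by itself provide anti-affinity across the dozen separate consumer jobs mentioned earlier in the thread.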
>>>>>> On Mar 30, 2017, at 10:34 AM, Rick Mangi <rick@chartbeat.com> wrote:
>>>>>> 
>>>>>> I think the complexity is a great rationale for having a pluggable
>>>>>> scheduling layer. Aurora is very flexible and people use it in many
>>>>>> different ways. Giving users more flexibility in how jobs are scheduled
>>>>>> seems like it would be a good direction for the project.
>>>>>> 
>>>>>> 
>>>>>>> On Mar 30, 2017, at 12:16 PM, David McLaughlin <dmclaughlin@apache.org> wrote:
>>>>>>> 
>>>>>>> I think this is more complicated than multiple scheduling algorithms. The
>>>>>>> problem you'll end up having if you try to solve this in the Scheduling
>>>>>>> loop is when resources are unavailable because there are preemptible tasks
>>>>>>> running in them, rather than hosts being down. Right now the fact that the
>>>>>>> task cannot be scheduled is important because it triggers preemption and
>>>>>>> will make room. An alternative algorithm that tries at all costs to
>>>>>>> schedule the task in the TaskAssigner could decide to place the task in a
>>>>>>> non-ideal slot and leave a preemptible task running instead.
>>>>>>> 
>>>>>>> It's also important to think of the knock-on effects here when we move to
>>>>>>> offer affinity (i.e. the current Dynamic Reservation proposal). If you've
>>>>>>> made this non-ideal compromise to get things scheduled - that decision will
>>>>>>> basically be permanent until the host you're on goes down. At least with
>>>>>>> how things work now, with each scheduling attempt the job has a fresh
>>>>>>> chance of being put in an ideal slot.
>>>>>>> 
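To make the interplay David describes concrete, here is a conceptual sketch in plain Python; this is not Aurora's actual TaskAssigner code, and all names and data structures below are hypothetical:

  def try_assign(task, offers, force_fit=False):
      # An "ideal" offer here is one with enough free CPU and no preemptible
      # load in the way; this is a toy definition for illustration only.
      ideal = [o for o in offers
               if o['free_cpu'] >= task['cpu'] and not o['has_preemptible']]
      if ideal:
          return ('assigned', ideal[0]['host'])
      if not force_fit:
          # Current behaviour: report the failure so the preemptor can evict a
          # lower-priority task; each retry gets a fresh chance at an ideal slot.
          return ('pending, preemption triggered', None)
      # "At all costs" behaviour: squeeze into a non-ideal slot. The preemptible
      # task keeps running, and once placements become sticky (offer affinity /
      # dynamic reservations) the bad placement is effectively permanent.
      usable = [o for o in offers if o['free_cpu'] >= task['cpu']]
      if usable:
          return ('assigned to non-ideal slot', usable[0]['host'])
      return ('pending', None)

  offers = [{'host': 'a', 'free_cpu': 4, 'has_preemptible': True},
            {'host': 'b', 'free_cpu': 1, 'has_preemptible': False}]
  task = {'cpu': 2}
  print(try_assign(task, offers))                  # preemption path
  print(try_assign(task, offers, force_fit=True))  # sticky non-ideal placement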
>>>>>>>> On Thu, Mar 30, 2017 at 8:12 AM, Rick Mangi <rick@chartbeat.com> wrote:
>>>>>>>> 
>>>>>>>> Sorry for the late reply, but I wanted to chime in here as someone who
>>>>>>>> wants to see this feature. We run a medium-sized cluster (around 1000
>>>>>>>> cores) in EC2, and I think we could get better usage of the cluster with
>>>>>>>> more control over the distribution of job instances. For example, it
>>>>>>>> would be nice to limit the number of Kafka consumers running on the same
>>>>>>>> physical box.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> 
>>>>>>>> Rick
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 2017-03-06 14:44 (-0400), Mauricio Garavaglia <m...@gmail.com> wrote:
>>>>>>>>> Hello!
>>>>>>>>> 
>>>>>>>>> I have a job that has multiple instances (>100) that I'd like to
>>>>>>>>> spread across the hosts in a cluster. Using a constraint such as
>>>>>>>>> "limit=host:1" doesn't work quite well, as I have more instances than
>>>>>>>>> nodes.
>>>>>>>>> 
>>>>>>>>> As a workaround I increased the limit value to something like
>>>>>>>>> ceil(instances/nodes). But now the problem happens if a bunch of
>>>>>>>>> nodes go down (think a whole rack dies), because the instances will
>>>>>>>>> not run until they are back, even though we may have spare capacity
>>>>>>>>> on the rest of the hosts that we'd like to use. In that scenario, the
>>>>>>>>> job availability may be affected because it's running with fewer
>>>>>>>>> instances than expected. On a smaller scale, the same approach would
>>>>>>>>> also apply if you want to spread tasks across racks or availability
>>>>>>>>> zones. I'd like to have one instance of a job per rack (failure
>>>>>>>>> domain), but in the case of a rack going down, the instance can be
>>>>>>>>> spawned on a different rack.
>>>>>>>>> 
>>>>>>>>> I thought we could have a scheduling constraint to "spread" instances
>>>>>>>>> across a particular host attribute; instead of vetoing an offer right
>>>>>>>>> away, we check where the other instances of a task are running,
>>>>>>>>> looking at a particular attribute of the host. We try to maximize the
>>>>>>>>> number of different values of that attribute (rack, hostname, etc.)
>>>>>>>>> across the task instance assignment.
>>>>>>>>> 
>>>>>>>>> What do you think? Did something like this come up in the past? Is it
>>>>>>>>> feasible?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Mauricio
>>>>>>>>> 
>>>>>>>> 
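A quick sketch of the arithmetic behind the limit workaround Mauricio describes above, in plain Python; the instance and rack counts are made up for illustration:

  import math

  instances = 120   # hypothetical instance count (more instances than hosts)
  racks = 4         # hypothetical number of distinct 'rack' attribute values

  # Allow up to ceil(instances / racks) instances per rack.
  per_rack_limit = math.ceil(instances / racks)         # 30
  constraints = {'rack': 'limit:%d' % per_rack_limit}   # Aurora-style limit constraint
  print(constraints)                                    # {'rack': 'limit:30'}

  # The failure mode described above: if one rack dies, its 30 instances stay
  # PENDING, because the surviving racks are already at their limit even when
  # they have spare capacity.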
>>>>>> 
>>>> 
>>>> --
>>>> Zameer Manji
>>>> 
>> 
>> --
>> Zameer Manji
>> 

