aurora-issues mailing list archives

From "Bill Farner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AURORA-909) Differentiate between dynamic and static vetoes
Date Thu, 05 Feb 2015 00:10:34 GMT

    [ https://issues.apache.org/jira/browse/AURORA-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306272#comment-14306272 ]

Bill Farner commented on AURORA-909:
------------------------------------

Sorry, I neglected to acknowledge this.  One thing that is disrupted by skipping offers is
that we currently compute a "pending reason", so we try to find the nearest miss based on
some (fairly subjective) criteria.  That said, sub-linear complexity would be really nice.
Would you mind peeling off a separate ticket with this idea?  It should be complementary to
the caching here, but we need to think harder about the downstream effects of skipping.

> Differentiate between dynamic and static vetoes
> -----------------------------------------------
>
>                 Key: AURORA-909
>                 URL: https://issues.apache.org/jira/browse/AURORA-909
>             Project: Aurora
>          Issue Type: Story
>          Components: Scheduler
>            Reporter: Bill Farner
>            Assignee: Maxim Khutornenko
>
> We're making a decent effort at reducing the _cost_ of task scheduling operations, but
> have not yet invested in reducing the working set in a way that causes task scheduling to
> scale better.  Each scheduling attempt for each task is an O(n) operation, where n is the
> number of offers.
> I would like to explore optimizations where we try to reduce the amount of redundant
> work performed in task scheduling.  Say, for example, we're trying to schedule a task that
> needs 2 CPUs, and we only have offers with 1 CPU.  Each scheduling round will re-assess every
> offer, despite the fact that the offers have not changed shape, and will always be a mismatch
> (hereafter termed _static_ mismatches).  Instead, we should try to skip over offers that are
> a static mismatch.  We could do this at the {{TaskGroup}} level, since every element in a
> task group is by definition statically equivalent.  This means that jobs with a large number
> of instances could be scheduled very efficiently, since the first task scheduling round could
> identify static mismatches, reducing the working set in the next round.
> This is to contrast with _dynamic_ mismatches, where a change in the tasks on a machine
> or other settings could make a previously-ineligible offer become a match.  The current sources
> of dynamic mismatches are limit constraints, host maintenance modes, and dedicated attributes.
> I propose we proceed in several steps, re-evaluating after each:
> 1. instrument the scheduler to better estimate the improvements
> 2. avoid future (offer, task group) evaluations when static mismatches are found
> 3. avoid future (offer, task group) evaluations when dynamic mismatches are found
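
A minimal sketch of step 2 above, written in Python for brevity and not taken from the
Aurora code base.  Every name in it (Offer, TaskGroup, Scheduler, static_vetoes,
max_per_host) is hypothetical.  It assumes a CPU shortfall is the only static mismatch
and a per-host limit is the only dynamic one, and it caches static mismatches per task
group so later rounds do not re-evaluate those offers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Offer:
    offer_id: str
    cpus: float
    host: str


@dataclass(frozen=True)
class TaskGroup:
    key: str
    cpus: float
    max_per_host: int  # stand-in for a limit constraint (a dynamic veto source)


@dataclass
class Scheduler:
    offers: List[Offer]
    # (task group key, host) -> number of tasks already placed there
    tasks_on_host: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # task group key -> ids of offers known to be static mismatches
    static_vetoes: Dict[str, Set[str]] = field(default_factory=dict)
    evaluations: int = 0  # counts full (offer, task group) evaluations

    def schedule(self, group: TaskGroup) -> Optional[str]:
        """Try to place one task of `group`; return the offer id used, or None."""
        skipped = self.static_vetoes.setdefault(group.key, set())
        match = None
        for offer in self.offers:
            if offer.offer_id in skipped:
                continue  # known static mismatch: no re-evaluation
            self.evaluations += 1
            if offer.cpus < group.cpus:
                # Static: the offer's shape cannot satisfy this group, so cache it.
                skipped.add(offer.offer_id)
                continue
            placed = self.tasks_on_host.get((group.key, offer.host), 0)
            if placed >= group.max_per_host:
                # Dynamic: may become a match later, so it is not cached.
                continue
            match = offer
            break
        if match is None:
            return None
        # Simplification: a matched offer is consumed whole.
        self.offers.remove(match)
        slot = (group.key, match.host)
        self.tasks_on_host[slot] = self.tasks_on_host.get(slot, 0) + 1
        return match.offer_id
```

With two 1-CPU offers and one 4-CPU offer, the first round for a 2-CPU task group
evaluates all three offers, and later rounds evaluate none of the cached mismatches.
The loop still visits every offer, so this only cuts the per-offer evaluation cost; it
is not sub-linear.  The cache would also have to be dropped for any offer that is
rescinded or changes shape, and skipping offers affects the "pending reason"
computation raised in the comment above.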



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
