aurora-issues mailing list archives

From "Bill Farner (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AURORA-909) Make task scheduling more efficient
Date Fri, 31 Oct 2014 23:31:37 GMT
Bill Farner created AURORA-909:
----------------------------------

             Summary: Make task scheduling more efficient
                 Key: AURORA-909
                 URL: https://issues.apache.org/jira/browse/AURORA-909
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: Bill Farner


We're making a decent effort at reducing the _cost_ of task scheduling operations, but have
not yet invested in reducing the working set in a way that causes task scheduling to scale
better.  Each scheduling attempt for each task is an O(n) operation, where n is the number
of offers.

I would like to explore optimizations where we try to reduce the amount of redundant work
performed in task scheduling.  Say, for example, we're trying to schedule a task that needs
2 CPUs, and we only have offers with 1 CPU.  Each scheduling round will re-assess every offer,
despite the fact that the offers have not changed shape, and will always be a mismatch (hereafter
termed _static_ mismatches).  Instead, we should try to skip over offers that are a static
mismatch.  We could do this at the {{TaskGroup}} level, since every element in a task group
is by definition statically equivalent.  This means that jobs with a large number of instances
could be scheduled very efficiently, since the first task scheduling round could identify
static mismatches, reducing the working set in the next round.
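
As a rough illustration of the idea above, here is a minimal sketch of remembering static mismatches per task group. It is not Aurora's implementation: the {{Offer}} and {{TaskGroup}} classes, the {{static_mismatches}} set, and the {{find_offer}} helper are hypothetical names invented for this example. It also assumes an offer's shape never changes for a given offer ID.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Offer:
    # Hypothetical, simplified offer: only CPU and RAM are modelled.
    offer_id: str
    cpus: float
    ram_mb: int


@dataclass
class TaskGroup:
    # Hypothetical task group: all tasks in it share the same resource needs.
    key: str
    cpus: float
    ram_mb: int
    # IDs of offers already known to be a static mismatch for this group.
    static_mismatches: set = field(default_factory=set)


def statically_fits(group: TaskGroup, offer: Offer) -> bool:
    """Check only properties that cannot change while the offer is held."""
    return offer.cpus >= group.cpus and offer.ram_mb >= group.ram_mb


def find_offer(group: TaskGroup, offers: list):
    """Return (matching offer or None, number of offers actually evaluated)."""
    evaluated = 0
    for offer in offers:
        if offer.offer_id in group.static_mismatches:
            continue  # Known static mismatch: skip without re-assessing.
        evaluated += 1
        if statically_fits(group, offer):
            return offer, evaluated
        group.static_mismatches.add(offer.offer_id)
    return None, evaluated


# A task needing 2 CPUs against offers that only have 1 CPU each.
offers = [Offer("o1", 1.0, 1024), Offer("o2", 1.0, 2048)]
group = TaskGroup("job/prod/web", cpus=2.0, ram_mb=512)
first_round = find_offer(group, offers)   # evaluates both offers
second_round = find_offer(group, offers)  # skips both offers
```

In this sketch the second round evaluates no offers at all. A real scheduler would also need to drop entries from the set when an offer is returned or rescinded, which is left out here.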

This contrasts with _dynamic_ mismatches, where a change in the tasks on a machine or
other settings could make a previously-ineligible offer become a match.  The current sources
of dynamic mismatches are limit constraints, host maintenance modes, and dedicated attributes.
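
To make the distinction concrete, the following sketch classifies a single (offer, task) check as a static mismatch, a dynamic mismatch, or a match. The {{Mismatch}} enum and the {{classify}} function are hypothetical. The per-host limit is a simplified stand-in for a limit constraint and does not model maintenance modes or dedicated attributes.

```python
from enum import Enum


class Mismatch(Enum):
    NONE = "none"        # the offer can run the task
    STATIC = "static"    # cannot change while the offer keeps its shape
    DYNAMIC = "dynamic"  # may change as tasks on the host come and go


def classify(task_cpus: float, host_limit: int,
             offer_cpus: float, active_on_host: int) -> Mismatch:
    """Classify one (offer, task) check under the simplified model above."""
    if offer_cpus < task_cpus:
        # Too few CPUs: re-checking this offer gives the same answer every time.
        return Mismatch.STATIC
    if active_on_host >= host_limit:
        # Limit reached: a task finishing on this host would change the answer.
        return Mismatch.DYNAMIC
    return Mismatch.NONE
```

Under these assumptions only the {{STATIC}} result is safe to remember indefinitely. A {{DYNAMIC}} result would need to be invalidated whenever the set of tasks on the host changes.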

I propose we proceed in several steps, re-evaluating after each:
1. instrument the scheduler to better estimate the improvements
2. avoid future (offer, task group) evaluations when static mismatches are found
3. avoid future (offer, task group) evaluations when dynamic mismatches are found
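
For step 1, one possible way to estimate the redundant work is to count how often an (offer, task group) pair is re-evaluated and mismatches again. This is only a sketch: the {{SchedulingStats}} class and its methods are hypothetical and do not correspond to existing scheduler stats.

```python
class SchedulingStats:
    """Counts evaluations and repeated mismatches of (offer, task group) pairs."""

    def __init__(self):
        self.evaluations = 0
        self.redundant = 0
        self._seen_mismatch = set()

    def record(self, offer_id: str, group_key: str, matched: bool) -> None:
        self.evaluations += 1
        pair = (offer_id, group_key)
        if matched:
            # The pair can match after all, so forget any earlier mismatch.
            self._seen_mismatch.discard(pair)
        elif pair in self._seen_mismatch:
            # Same pair, same negative answer: work that caching could skip.
            self.redundant += 1
        else:
            self._seen_mismatch.add(pair)

    def redundant_fraction(self) -> float:
        return self.redundant / self.evaluations if self.evaluations else 0.0


# Three scheduling rounds over two offers that never match the group.
stats = SchedulingStats()
for _ in range(3):
    for offer_id in ("o1", "o2"):
        stats.record(offer_id, "job/prod/web", matched=False)
```

Here four of the six evaluations repeat an earlier mismatch, which gives an upper bound on what steps 2 and 3 could save for this workload.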



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)