mesos-issues mailing list archives

From "Chris Lambert (JIRA)" <>
Subject [jira] [Updated] (MESOS-354) Oversubscribe resources
Date Mon, 11 May 2015 21:25:00 GMT


Chris Lambert updated MESOS-354:
    Epic Colour: ghx-label-4

> Oversubscribe resources
> -----------------------
>                 Key: MESOS-354
>                 URL:
>             Project: Mesos
>          Issue Type: Epic
>          Components: isolation, master, slave
>            Reporter: brian wickman
>            Priority: Minor
>              Labels: mesosphere, twitter
>         Attachments: mesos_virtual_offers.pdf
> This proposal is predicated upon offer revocation.
> The idea would be to add a new "revoked" status either by (1) piggybacking off an existing
status update (TASK_LOST or TASK_KILLED) or (2) introducing a new status update TASK_REVOKED.
> In order to augment an offer with metadata about revocability, there are options:
>   1) Add a revocable boolean to the Offer and
>     a) offer only one type of Offer per slave at a particular time
>     b) offer both revocable and non-revocable resources at the same time but require
frameworks to understand that Offers can contain overlapping resources
>   2) Add a revocable_resources field on the Offer which is a superset of the regular
resources field.  A Task that consumes more than resources but no more than revocable_resources
in a launchTask becomes a revocable task.  A Task launched with less than resources is non-revocable.
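A minimal sketch of how option 2 could classify a launch request. The names classify_task and fits_within are hypothetical and not part of the Mesos API; resources are modelled as plain name-to-amount dictionaries. The proposal does not say what happens when a request exactly equals resources, so treating that case as non-revocable is an assumption here.

```python
from typing import Dict

Resources = Dict[str, float]


def fits_within(request: Resources, limit: Resources) -> bool:
    """True if every requested amount is at most the limit for that resource."""
    return all(amount <= limit.get(name, 0.0) for name, amount in request.items())


def classify_task(request: Resources,
                  resources: Resources,
                  revocable_resources: Resources) -> str:
    """Classify a launch request against an offer (hypothetical helper).

    Assumption: a request that fits entirely within `resources`
    (including an exact match) is non-revocable.
    """
    if fits_within(request, resources):
        return "non-revocable"
    if fits_within(request, revocable_resources):
        # Exceeds the regular resources in at least one dimension,
        # but still fits inside the revocable superset.
        return "revocable"
    return "rejected"


offer_resources = {"cpus": 4.0, "mem_gb": 8.0}
offer_revocable = {"cpus": 6.0, "mem_gb": 12.0}

small_task = classify_task({"cpus": 2.0, "mem_gb": 4.0}, offer_resources, offer_revocable)
large_task = classify_task({"cpus": 5.0, "mem_gb": 4.0}, offer_resources, offer_revocable)
oversized_task = classify_task({"cpus": 7.0}, offer_resources, offer_revocable)
```

Here a task that stays within the regular resources in every dimension is non-revocable, and one that exceeds them in any dimension while staying within revocable_resources is revocable.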
> Revocable tasks suit batch workloads (e.g. hadoop/pig/mapreduce); non-revocable
tasks suit online, higher-SLA workloads (e.g. services).
> Consider a non-revocable task that asks for 4 cores, 8 GB RAM and 20 GB of disk.  One of these
resources is a rate (4 cpu seconds per second) and two of them are fixed values (8 GB and 20 GB
respectively, though disk resources can be further broken down into spindles - fixed - and
iops - a rate.)  Nominally, these are the maximum resources in the respective dimensions
that this task will use.  In reality, we provision tasks at some factor below peak, and only
hit peak resource consumption in rare circumstances or perhaps at a diurnal peak.
> In the meantime, we stand to gain from offering some constant factor of the difference
(reserved - actual) of non-revocable tasks as revocable resources, depending upon
our tolerance for revocable task churn.  The main challenge is coming up with an accurate
short / medium / long-term prediction of resource consumption based upon current behavior.
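The arithmetic behind that idea can be sketched as follows. The function name revocable_slack and the factor parameter are illustrative assumptions, and the sketch takes current usage as given rather than attempting the prediction problem described above.

```python
from typing import Dict

Resources = Dict[str, float]


def revocable_slack(reserved: Resources, actual: Resources, factor: float) -> Resources:
    """Return factor * (reserved - actual) per resource, never below zero.

    `factor` is a hypothetical tuning knob in [0, 1]: a lower value offers
    less slack and so reduces revocable task churn.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return {
        name: factor * max(0.0, amount - actual.get(name, 0.0))
        for name, amount in reserved.items()
    }


# A non-revocable task reserved 4 cores and 8 GB but currently uses 1 core and 6 GB.
slack = revocable_slack(
    reserved={"cpus": 4.0, "mem_gb": 8.0},
    actual={"cpus": 1.0, "mem_gb": 6.0},
    factor=0.5,
)
```

With a factor of 0.5, half of the unused 3 cores and half of the unused 2 GB would be offered as revocable resources.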
> In many cases it would be OK to be sloppy:
>   * CPU / iops / network IO are rates (compressible) and can often run below their guarantees
for brief periods of time while task revocation takes place
>   * Memory slack can be provided by enabling swap and dynamically setting swap paging
boundaries.  Should swap ever be activated, that would be a signal to revoke.
> The master / allocator would piggyback on the slave heartbeat mechanism to learn of the
amount of revocable resources available at any point in time.
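One way to picture the heartbeat piggyback is sketched below. Heartbeat, Allocator, and their fields are hypothetical stand-ins, not actual Mesos messages or classes; the only behaviour modelled is that the latest report from each slave replaces the previous one.

```python
from dataclasses import dataclass, field
from typing import Dict

Resources = Dict[str, float]


@dataclass
class Heartbeat:
    """Hypothetical slave heartbeat carrying its current revocable resources."""
    slave_id: str
    revocable: Resources


@dataclass
class Allocator:
    """Hypothetical master-side view of revocable resources per slave."""
    revocable_by_slave: Dict[str, Resources] = field(default_factory=dict)

    def on_heartbeat(self, heartbeat: Heartbeat) -> None:
        # The newest report overwrites whatever the slave reported before.
        self.revocable_by_slave[heartbeat.slave_id] = dict(heartbeat.revocable)

    def total_revocable(self) -> Resources:
        """Sum the most recent revocable amounts across all known slaves."""
        totals: Resources = {}
        for resources in self.revocable_by_slave.values():
            for name, amount in resources.items():
                totals[name] = totals.get(name, 0.0) + amount
        return totals


allocator = Allocator()
allocator.on_heartbeat(Heartbeat("slave-1", {"cpus": 1.5}))
allocator.on_heartbeat(Heartbeat("slave-2", {"cpus": 0.5, "mem_gb": 1.0}))
allocator.on_heartbeat(Heartbeat("slave-1", {"cpus": 1.0}))  # updated report
totals = allocator.total_revocable()
```

Because each heartbeat carries the full current amount rather than a delta, a lost heartbeat only delays the master's view instead of corrupting it.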
