From: Zameer Manji
Date: Wed, 9 Nov 2016 15:09:15 -0800
Subject: A sketch for supporting mesos maintenance
To: dev@aurora.apache.org
archived-at: Wed, 09 Nov 2016 23:09:40 -0000

Hey,

This is not a design doc for supporting Mesos Maintenance, but more of a high-level overview of how we *could* support it going forward. I just wanted to get this idea out there now to see where we all stand.

As Ankit mentioned in AURORA-1800, Mesos has had Maintenance primitives since 0.25. You can read about them in the Mesos documentation. The primitives map pretty well to our existing concept of maintenance, but they also allow operators to do work across multiple frameworks. Since the Mesos community is growing and new frameworks are emerging all the time, I think Aurora should support these primitives and drop our custom primitives to be a better player in the ecosystem.

We cannot adopt them just yet, however, because they are only accessible through the Mesos HTTP API, which Aurora does not use today. Further, `aurora_admin` has some SLA-aware maintenance processes which are computed and coordinated from the client.

I think for us to successfully adopt Mesos Maintenance, we need to do at least two things:

1. Adopt the Mesos HTTP API.
2. Move the SLA-aware maintenance logic from the admin tool into the scheduler itself, so the scheduler can coordinate with the Mesos Master in an SLA-aware fashion.

What do folks think?

--
Zameer Manji
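
To make item 2 above a bit more concrete, here is a minimal sketch of the kind of SLA-aware check a scheduler could run before agreeing to drain a host. Everything in it is hypothetical and for illustration only: the `Task` record, the `can_drain` function, and the `min_running_fraction` threshold are made-up names, not Aurora's or Mesos's actual API, and a real implementation would also need to account for time windows and in-flight updates.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical view of one task instance as the scheduler sees it."""
    job: str       # illustrative job key, e.g. "www/prod/frontend"
    host: str      # host the instance is placed on
    running: bool  # whether the instance is currently healthy and running


def can_drain(host, tasks, min_running_fraction=0.95):
    """Return True if draining `host` keeps every job that has instances on
    it at or above `min_running_fraction` of its instances running."""
    total = Counter(t.job for t in tasks)
    running_elsewhere = Counter(
        t.job for t in tasks if t.running and t.host != host
    )
    affected = {t.job for t in tasks if t.host == host}
    return all(
        running_elsewhere[job] / total[job] >= min_running_fraction
        for job in affected
    )


# Four running instances of one job, one per host.
tasks = [Task("web", h, True) for h in ("a", "b", "c", "d")]
print(can_drain("a", tasks, min_running_fraction=0.75))  # True: 3 of 4 remain
print(can_drain("a", tasks, min_running_fraction=0.90))  # False: 75% < 90%
```

The point of the sketch is only that the decision needs a global view of each job's instances, which the scheduler already has, whereas a client-side tool has to reconstruct it on every run.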