Return-Path:
X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io
Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io
Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183])
    by cust-asf2.ponee.io (Postfix) with ESMTP id D3789200BBB
    for ; Thu, 27 Oct 2016 03:23:43 +0200 (CEST)
Received: by cust-asf.ponee.io (Postfix) id D1F42160B02;
    Thu, 27 Oct 2016 01:23:43 +0000 (UTC)
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by cust-asf.ponee.io (Postfix) with SMTP id 22C91160AEE
    for ; Thu, 27 Oct 2016 03:23:42 +0200 (CEST)
Received: (qmail 48812 invoked by uid 500); 27 Oct 2016 01:23:42 -0000
Mailing-List: contact user-help@mesos.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: user@mesos.apache.org
Delivered-To: mailing list user@mesos.apache.org
Received: (qmail 48795 invoked by uid 99); 27 Oct 2016 01:23:42 -0000
Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 27 Oct 2016 01:23:42 +0000
Received: from mail-lf0-f45.google.com (mail-lf0-f45.google.com [209.85.215.45])
    by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id 8136B1A0143
    for ; Thu, 27 Oct 2016 01:23:41 +0000 (UTC)
Received: by mail-lf0-f45.google.com with SMTP id b75so18731065lfg.3
    for ; Wed, 26 Oct 2016 18:23:41 -0700 (PDT)
X-Gm-Message-State: ABUngvd7iTIInQ7GUjqgnF9fJQWMahwjsKgse5IYFXS854a0HCER3uf+cjm4w05XLe5LAYqvx58jPDyA7r/YvfBs
X-Received: by 10.25.155.211 with SMTP id d202mr3698949lfe.129.1477531419466;
    Wed, 26 Oct 2016 18:23:39 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.114.22.195 with HTTP; Wed, 26 Oct 2016 18:23:19 -0700 (PDT)
In-Reply-To: <15892C36-5E0A-4729-A46F-C4742DFF3A8F@apple.com>
References: <15892C36-5E0A-4729-A46F-C4742DFF3A8F@apple.com>
From: Benjamin Mahler
Date: Wed, 26 Oct 2016 18:23:19 -0700
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: Design for Restartable Tasks
To: dev
Cc: user
Content-Type: multipart/alternative; boundary=001a11401f1866f601053fce978d
archived-at: Thu, 27 Oct 2016 01:23:44 -0000

--001a11401f1866f601053fce978d
Content-Type: text/plain; charset=UTF-8

Thanks for publishing this! Saw some tickets being created and was
wondering where this email was.. :)

The higher level thing that strikes me is that I think the notion of a task
restart policy should be managed by the executor (i.e. the executor
restarts the task based on the policy). This is aligned with how the
existing kill and health check policies work. This project seems to be
something more along the lines of a restartable executor, alongside a
change to perform agent recovery across reboot?

Since this project is pretty complicated, it would be prudent to gather
some committers to provide feedback and we can publish our notes to the
lists.

Ben

On Wed, Oct 26, 2016 at 5:13 PM, Megha Sharma <msharma3@apple.com> wrote:

> Hi All,
>
> We have been working on the design to allow tasks which need to be
> restarted on the agent post its restart. Looking forward to your
> comments/feedback.
>
> Design Doc:
> https://docs.google.com/document/d/1YS_EBUNLkzpSru0dwn_hPUIeTATiWckSaosXSIaHUCo/edit#heading=h.tlevdyt3yv0a
>
> JIRA:
> https://issues.apache.org/jira/browse/MESOS-3545
>
> Many Thanks
> Megha Sharma

--001a11401f1866f601053fce978d
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11401f1866f601053fce978d--