Date: Mon, 4 Apr 2016 13:47:58 -0700
Subject: Re: Are we ready to remove the observer?
From: Bill Farner
To: dev@aurora.apache.org

We clearly have different experiences - i've never really benefited from
viewing the process graph, as most jobs have very simple sequences that
could be easily explained by a text file in the sandbox. On the contrary,
i've encountered people confused by the process graph, the observer, and
sandbox browsing...so i must respectfully disagree that it is universally
appreciated.

What i'm trying to achieve is simplicity. The observer is an extra moving
part, and another thing for operators to understand and maintain. It also
couples Aurora to one relatively specific way of running tasks, which
makes it difficult to open new use cases like Docker tasks. Removing the
observer starts to pull on a thread of complexity that i don't think
Aurora benefits much from, for example state checkpointing by the
executor.

My goal is not to apply pressure, but to perform a gut check. If the
answer is "No", that's fine.

On Mon, Apr 4, 2016 at 1:01 PM, Maxim Khutornenko wrote:

> I am with Josh on this one. Thermos Observer UI (and especially its
> process graph) is one of the features universally appreciated by our
> customers. I am all for deprecating the Observer but only in a way that
> retains parity with the existing functionality and hopefully enhances
> it. What are we trying to achieve here that would justify losing some
> of our feature set?
>
> On Mon, Apr 4, 2016 at 12:42 PM, Erb, Stephan wrote:
> > Have you recently looked at the Mesos UI, Joshua? It offers sandbox
> > browsing similar to the chroot link of Thermos. So at least you don't
> > have to SSH into any box. We could link to that Mesos UI instead of
> > the Thermos one, and Mesos could then serve a nice index.html that
> > contains the content that was formerly served by Thermos.
> >
> > When dropping Thermos and relying on Mesos instead, we could profit
> > from recent additions such as authentication.
> >
> > ________________________________________
> > From: Joshua Cohen
> > Sent: Monday, April 4, 2016 18:42
> > To: dev@aurora.apache.org
> > Subject: Re: Are we ready to remove the observer?
> >
> > If you're suggesting just going to the task directory and pulling them
> > out of the executor logs: yes, I could ssh into the host the task is
> > running on and grep the task directory out of the mesos agent logs and
> > then trawl the logs (or cat task.json), but that's much more effort
> > than going to the observer's task UI (i.e. it'd take a minute, rather
> > than a few seconds). I'd also posit that it's much easier for new
> > Aurora operators to come to grips with the process tree via the UI
> > rather than a JSON blob.
> >
> > If you're suggesting something else (i.e. new UI to expose these
> > separate from the Observer), I'm fine with that, that's what I was
> > implying above would be necessary before I think we could retire the
> > Observer.
> >
> > A counter question: do people feel that updating/deploying the
> > Observer is a major obstacle? I know we've got the process well
> > automated, so it's relatively painless. I'd love to replace the
> > Observer with something better, but I don't feel like it's a major
> > drag on our productivity as it exists today to warrant killing it off
> > entirely. My opinion may be colored by the deploy automation we have
> > in place though!
> >
> > On Mon, Apr 4, 2016 at 9:32 AM, Bill Farner wrote:
> >
> >> > 2) Providing an easy view of a process's command-line
> >> > 3) Providing a holistic view of the task config
> >>
> >> Just to check my understanding - these could be trivially handled in
> >> text/log format, right?
> >>
> >> On Mon, Apr 4, 2016 at 9:30 AM, Joshua Cohen wrote:
> >>
> >> > I'm -1 on this until we have an actual replacement for the Observer.
> >> > I think that the observer provides significant value outside of just
> >> > sandbox browsing:
> >> >
> >> > 1) Exporting task-level statistics.
> >> > 2) Providing an easy view of a process's command-line
> >> > 3) Providing a holistic view of the task config
> >> > 4) Real time utilization stats
> >> >
> >> > As a cluster operator, I use all of these features on a daily basis
> >> > (especially when I'm on call) in addition to sandbox browsing, so I
> >> > don't think that these use cases are that rare.
> >> >
> >> > On Fri, Apr 1, 2016 at 6:55 AM, Steve Niemitz wrote:
> >> >
> >> > > The per-process stats have never been very useful to us (since
> >> > > they don't work for docker), however, even being able to see the
> >> > > processes that are running, how many times they've restarted,
> >> > > when they launched, etc. is invaluable.
> >> > >
> >> > > I think there would be big pushback from users if they were to
> >> > > lose the functionality it currently provides (beyond log viewing).
> >> > >
> >> > > On Fri, Apr 1, 2016 at 6:58 AM, Erb, Stephan
> >> > > <Stephan.Erb@blue-yonder.com> wrote:
> >> > >
> >> > > > From an operator and Aurora developer perspective, it would be
> >> > > > really great to get rid of the thermos observer quickly.
> >> > > >
> >> > > > However, from a user perspective the usability gap between
> >> > > > observer and plain Mesos sandbox browsing is quite large right
> >> > > > now. I agree with Benjamin here that it would probably work if
> >> > > > we generate html pages ready for user consumption.
> >> > > >
> >> > > > These are the relevant tickets in our tracker:
> >> > > > * https://issues.apache.org/jira/browse/AURORA-725
> >> > > > * https://issues.apache.org/jira/browse/AURORA-777
> >> > > >
> >> > > > ________________________________________
> >> > > > From: benley@gmail.com
> >> > > > Sent: Friday, April 1, 2016 02:35
> >> > > > To: dev@aurora.apache.org
> >> > > > Subject: Re: Are we ready to remove the observer?
> >> > > >
> >> > > > Is there any chance we can keep the per-process cpu and ram
> >> > > > utilization stats? That's one of the coolest things about
> >> > > > aurora, imo. The executor is already writing those checkpoints
> >> > > > inside the mesos sandbox (I think?), so perhaps it could also
> >> > > > produce the html pages that the observer currently renders?
> >> > > >
> >> > > > On Thu, Mar 31, 2016 at 4:33 PM Zhitao Li wrote:
> >> > > >
> >> > > > > +1.
> >> > > > >
> >> > > > > On Thu, Mar 31, 2016 at 4:11 PM, Bill Farner
> >> > > > > <wfarner@apache.org> wrote:
> >> > > > >
> >> > > > > > Assuming that the vast majority of utility provided by the
> >> > > > > > observer is sandbox/log browsing - can we remove it and
> >> > > > > > link to sandbox browsing that mesos provides?
> >> > > > > >
> >> > > > > > The rest of the information could be (or already is) logged
> >> > > > > > in the sandbox for the rare debugging scenarios that call
> >> > > > > > for it.
> >> > > > >
> >> > > > > --
> >> > > > > Cheers,
> >> > > > >
> >> > > > > Zhitao Li