Date: Wed, 18 Feb 2015 12:56:14 -0800
Subject: Re: [proposal] Deprecate the Thermos CLI
From: Maxim Khutornenko
To: dev@aurora.incubator.apache.org

Running "thermos status --verbosity=3" gives full thermos task history
including the sandbox path and process table contents. This really
saves time when trying to get to the failed task details or see what
else is running on a host.

On Wed, Feb 18, 2015 at 12:04 PM, Bill Farner wrote:
> Can either of you elaborate on the type of debugging you currently
> accomplish with this tool?
>
> On Wednesday, February 18, 2015, Brian Wickman wrote:
>
>> I agree it is a valuable component. However, I think that until it has
>> test coverage we should consider it an unsupported tool. Filed
>> AURORA-1131. This is already on my radar as part of AURORA-1027.
>>
>> On Wed, Feb 18, 2015 at 9:19 AM, Maxim Khutornenko wrote:
>>
>> > > Moving parts should either provide value or be obliterated from
>> > > our source tree.
>> >
>> > I generally agree. In this particular case it's still unclear to me -
>> > in the absence of Thermos CLI and Observer, how do we conduct live
>> > site executor/thermos troubleshooting?
>> >
>> > On Tue, Feb 17, 2015 at 7:45 PM, Bill Farner wrote:
>> >
>> > >
>> > >> I think we would be better served by advertising it as an
>> > >> optional component that provides operators and users with
>> > >> debugging ability.
>> > >
>> > >
>> > > Slightly tangential discussion, but i think we should be very
>> > > skeptical of fringe components. Moving parts should either provide
>> > > value or be obliterated from our source tree.
>> > >
>> > > -=Bill
>> > >
>> > > On Tue, Feb 17, 2015 at 6:51 PM, Zameer Manji wrote:
>> > >
>> > >> One thing I would like to point out is the thermos CLI is not
>> > >> required for Aurora operation. I think we would be better served
>> > >> by advertising it as an optional component that provides
>> > >> operators and users with debugging ability.
>> > >>
>> > >> On Tue, Feb 17, 2015 at 6:38 PM, Joseph Smith wrote:
>> > >>
>> > >> > I believe it absolutely is. Ideally, as we deprecate the
>> > >> > Observer, we can then lean on the Mesos Slave for this
>> > >> > information instead. This will further decrease the number of
>> > >> > moving pieces, simplifying the operation of an Aurora/Mesos
>> > >> > cluster.
>> > >> >
>> > >> > > On Feb 17, 2015, at 6:33 PM, Zameer Manji wrote:
>> > >> > >
>> > >> > > Joe,
>> > >> > >
>> > >> > > If I understand Brian's proposal correctly
>> > >> > > <http://mail-archives.apache.org/mod_mbox/aurora-dev/201501.mbox/%3CCAFTdr0DZvH21tR=NLK0qP-Y9-oL9SyULy6GLah=CApuW0SVvnw@mail.gmail.com%3E>,
>> > >> > > we are going to deprecate the Observer. This combined with
>> > >> > > your proposal will make the executor the only component that
>> > >> > > can read the thermos checkpoints and produce some output that
>> > >> > > is human readable. Is that something we want to do?
>> > >> > >
>> > >> > > On Tue, Feb 17, 2015 at 6:26 PM, Joseph Smith
>> > >> > > <yasumoto7@gmail.com> wrote:
>> > >> > >
>> > >> > >> Hi everyone,
>> > >> > >>
>> > >> > >> After reviewing the functionality offered by the Thermos
>> > >> > >> Commandline tool vs. what’s exported via the Thermos
>> > >> > >> Observer, I was hoping to bring up a question I had:
>> > >> > >>
>> > >> > >> Can we deprecate the Thermos CLI?
>> > >> > >>
>> > >> > >> Removing this would decrease the number of components
>> > >> > >> required for a functional Aurora installation (a huge
>> > >> > >> victory, in my opinion) and also enable the Observer to
>> > >> > >> fully take over the duty of providing visibility into what’s
>> > >> > >> running on a host. In addition, maintenance is performed via
>> > >> > >> the HostMaintenance API
>> > >> > >> <https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/admin/host_maintenance.py#L26>
>> > >> > >> and should not be done using thermos kill, which would cause
>> > >> > >> LOST tasks.
>> > >> > >>
>> > >> > >> That said, removing this tool makes it much more difficult
>> > >> > >> for Thermos to be used as a monit replacement, which is
>> > >> > >> actually rather feasible now. In addition, it also forces
>> > >> > >> people to remember + learn the port the Observer is running
>> > >> > >> on in order to get information about tasks.
>> > >> > >>
>> > >> > >> Any thoughts and opinions would be much appreciated!
>> > >> > >>
>> > >> > >> Thanks!
>> > >> > >> Joe
>> > >> > >>
>> > >> > >> --
>> > >> > >> Zameer Manji
>> > >> > >>
>> > >> >
>> > >> > --
>> > >> > Zameer Manji
>> > >> >
>> > >>
>> >
>
> --
> -=Bill