From: Maxim Khutornenko
To: dev@aurora.incubator.apache.org
Date: Fri, 3 Oct 2014 09:13:31 -0700
Subject: Re: Error handling in the aurora client

+1 on that. Hides away the ugliness yet more accessible than searching
through client logs.

On Fri, Oct 3, 2014 at 5:51 AM, Mark Chu-Carroll wrote:

> I like the proposal from John. Any objections to implementing that?
>
> -Mark
>
> On Fri, Oct 3, 2014 at 2:23 AM, Joshua Cohen wrote:
>
>> Came here to make the same suggestion John makes. What if we present
>> friendly error messages to the user, but write stack traces to a log file
>> that the user can upload in the event of unexpected/unhandled exceptions.
>> IMO the reason for not wanting to rely on users re-running commands with a
>> verbose flag to dump a stacktrace is that some errors will be transient and
>> not easily repeatable, thus the chance to capture the stack will be lost.
>>
>> On Thu, Oct 2, 2014 at 10:30 PM, John Sirois wrote:
>>
>> > Drive-by, but this has been on my mind with pants as well: How about the
>> > current behavior but add a pill, ie:
>> > [ref:232e86a2d] Internal error executing command: 'str' object has no attribute 'err_msg'
>> >
>> > The full backtrace goes off to a file in the user's home dir somewhere and
>> > then you can ask them to run a command passing the pill ref to get the full
>> > error report without worry of re-running some non-idempotent command, etc.
>> >
>> > On Thu, Oct 2, 2014 at 3:11 PM, Maxim Khutornenko wrote:
>> >
>> > > +1 on dumping the stack for unhandled errors as long as they are not
>> > > caused by KeyboardInterrupt. That would definitely help
>> > > troubleshooting transient errors when --reveal-errors is not a good
>> > > option.
>> > >
>> > > On Thu, Oct 2, 2014 at 1:19 PM, David McLaughlin <david@dmclaughlin.com>
>> > > wrote:
>> > >
>> > > > Because we allow things like hooks, I think it's best to err on the side of
>> > > > overly verbose logging by default rather than have to ask client users to
>> > > > rerun their command with an extra option just to get a stack trace.
>> > > >
>> > > > On Thu, Oct 2, 2014 at 1:10 PM, Mark Chu-Carroll <mchucarroll@apache.org>
>> > > > wrote:
>> > > >
>> > > >> Can someone explain to me why providing an option to show the stack trace
>> > > >> is such a problem?
>> > > >>
>> > > >> Making our debugging easier shouldn't be an excuse for sloppy tooling.
>> > > >> Dumping stacks at users because we didn't get our debugging right shouldn't
>> > > >> be acceptable.
>> > > >>
>> > > >> The specific error here, where we've got a user writing python code in a
>> > > >> config file is a special case: we're invoking a python interpretation
>> > > >> process for the user, and if that crashes, they expect what they'd get by
>> > > >> running the python code manually. But in other places, allowing people to
>> > > >> request extra information as an option seems like a reasonable compromise.
>> > > >>
>> > > >> -Mark
>> > > >>
>> > > >> On Thu, Oct 2, 2014 at 3:51 PM, Kevin Sweeney wrote:
>> > > >>
>> > > >> > We can do both! I think we should dump a stack trace to the console
>> > > >> > whenever we have an unhandled error, as we're not going to be able to
>> > > >> > debug it otherwise.
>> > > >> >
>> > > >> > We should also strive not to have *any* unhandled errors, but that does
>> > > >> > not mean putting a catch-all exception handler at root, rather it means
>> > > >> > having *specific* error messages for expected error conditions.
>> > > >> > For example, an IOError in a method to read a config file might translate
>> > > >> > to an error message
>> > > >> > "Unable to read config file '%s': %s." % (e.filename, e.strerror)
>> > > >> > and a specific exit code. So this might manifest as
>> > > >> >
>> > > >> > % aurora job create devcluster/web/test/webserver typo.aurora
>> > > >> > ERROR: Unable to read config file 'typo.aurora': No such file or directory.
>> > > >> > % echo $?
>> > > >> > 3
>> > > >> >
>> > > >> > If the client code (including the support classes) isn't factored to
>> > > >> > allow exception handling like this, it needs to be refactored.
>> > > >> >
>> > > >> > Also given that the context of this is AURORA-779 I think it's totally
>> > > >> > reasonable to throw a stack trace to someone whose .aurora file raised an
>> > > >> > exception (since they are writing python they should get the tools needed
>> > > >> > to debug python).
>> > > >> >
>> > > >> > On Thu, Oct 2, 2014 at 12:27 PM, Mark Chu-Carroll <mchucarroll@apache.org>
>> > > >> > wrote:
>> > > >> >
>> > > >> > > As we promote clientv2 and deprecate v1, we've come across some issues
>> > > >> > > involving error handling in the v2 client.
>> > > >> > >
>> > > >> > > When there's an unexpected error in clientv1, most of the time, it
>> > > >> > > crashes and dumps its stack. Dumping stack is a lousy user experience,
>> > > >> > > but it provides the stack dump, which users can then include in a bug
>> > > >> > > report.
>> > > >> > >
>> > > >> > > The default behavior in clientv2 doesn't dump stack.
>> > > >> > > Instead, it catches the unknown error, and prints out a concise error
>> > > >> > > message, like:
>> > > >> > >
>> > > >> > > Internal error executing command: 'str' object has no attribute 'err_msg'
>> > > >> > >
>> > > >> > > There's no stack dump, so when we get an error report, it's harder for us
>> > > >> > > to track down the cause of the error.
>> > > >> > >
>> > > >> > > Clientv2 does provide a command-line option, "--reveal-errors", which
>> > > >> > > allows errors to be propagated and eventually result in a stack trace.
>> > > >> > >
>> > > >> > > So: should we allow the client to dump stack on error?
>> > > >> > >
>> > > >> > > -Mark
>> > > >> >
>> > > >> > --
>> > > >> > Kevin Sweeney
>> > > >> > @kts
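
A minimal Python sketch of the behavior the thread converges on may help make the proposal concrete: expected failures get a specific message and exit code, while unexpected ones print a short "[ref:...]" line and save the full traceback to a file that can be retrieved later by that ref. This is only an illustration under stated assumptions, not the Aurora client's code. The names `CommandError`, `load_config`, `save_report`, `read_report`, and `run`, the log directory layout, and the exit code 1 are all hypothetical; only the message formats and the exit code 3 come from the messages above.

```python
import os
import sys
import traceback
import uuid

EXIT_OK = 0
EXIT_INTERNAL_ERROR = 1      # hypothetical value, not from the thread
EXIT_CONFIG_UNREADABLE = 3   # matches the "3" in the example quoted above


class CommandError(Exception):
    """Hypothetical: an expected failure with a user-facing message and exit code."""

    def __init__(self, message, exit_code):
        super().__init__(message)
        self.exit_code = exit_code


def load_config(path):
    """Read a config file, translating IOError into a specific CommandError."""
    try:
        with open(path) as f:
            return f.read()
    except IOError as e:
        raise CommandError(
            "Unable to read config file '%s': %s." % (e.filename, e.strerror),
            EXIT_CONFIG_UNREADABLE)


def save_report(log_dir):
    """Write the traceback of the exception being handled; return a short ref."""
    ref = uuid.uuid4().hex[:9]
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, ref + ".log"), "w") as f:
        f.write(traceback.format_exc())
    return ref


def read_report(ref, log_dir):
    """Return the saved traceback for a ref, without re-running the command."""
    with open(os.path.join(log_dir, ref + ".log")) as f:
        return f.read()


def run(command, log_dir, err=sys.stderr):
    """Run a zero-argument callable and map failures to messages and exit codes."""
    try:
        command()
        return EXIT_OK
    except CommandError as e:
        # Expected condition: friendly message, specific exit code, no stack.
        err.write("ERROR: %s\n" % e)
        return e.exit_code
    except Exception as e:
        # Unexpected condition. KeyboardInterrupt does not derive from
        # Exception, so it is not caught here and propagates as usual.
        ref = save_report(log_dir)
        err.write("[ref:%s] Internal error executing command: %s\n" % (ref, e))
        return EXIT_INTERNAL_ERROR
```

Under these assumptions, `run(lambda: load_config("typo.aurora"), log_dir)` would print the "Unable to read config file" message and return 3, while a command that hits an unexpected bug would print a ref that can be handed to `read_report` to recover the stack even when the error is transient.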