aurora-dev mailing list archives

From Maxim Khutornenko <ma...@apache.org>
Subject Re: Error handling in the aurora client
Date Fri, 03 Oct 2014 16:13:31 GMT
+1 on that. Hides away the ugliness yet more accessible than searching
through client logs.

On Fri, Oct 3, 2014 at 5:51 AM, Mark Chu-Carroll <mchucarroll@apache.org> wrote:
> I like the proposal from John. Any objections to implementing that?
>
>       -Mark
>
> On Fri, Oct 3, 2014 at 2:23 AM, Joshua Cohen <jcohen@twopensource.com>
> wrote:
>
>> Came here to make the same suggestion John makes. What if we present
>> friendly error messages to the user, but write stack traces to a log file
>> that the user can upload in the event of unexpected/unhandled exceptions?
>> IMO the reason for not wanting to rely on users re-running commands with a
>> verbose flag to dump a stacktrace is that some errors will be transient and
>> not easily repeatable, thus the chance to capture the stack will be lost.
>>
>> On Thu, Oct 2, 2014 at 10:30 PM, John Sirois <john.sirois@gmail.com>
>> wrote:
>>
>> > Drive-by, but this has been on my mind with pants as well:  How about the
>> > current behavior but add a pill, ie:
>> > [ref:232e86a2d] Internal error executing command: 'str' object has no
>> > attribute 'err_msg'
>> >
>> > The full backtrace goes off to a file in the user's home dir somewhere and
>> > then you can ask them to run a command passing the pill ref to get the full
>> > error report without worry of re-running some non-idempotent command, etc.
>> >
>> > On Thu, Oct 2, 2014 at 3:11 PM, Maxim Khutornenko <maxim@apache.org>
>> > wrote:
>> >
>> > > +1 on dumping the stack for unhandled errors as long as they are not
>> > > caused by KeyboardInterrupt. That would definitely help
>> > > troubleshooting transient errors when --reveal-errors is not a good
>> > > option.
>> > >
>> > > On Thu, Oct 2, 2014 at 1:19 PM, David McLaughlin <david@dmclaughlin.com>
>> > > wrote:
>> > > > Because we allow things like hooks, I think it's best to err on the side of
>> > > > overly verbose logging by default rather than have to ask client users to
>> > > > rerun their command with an extra option just to get a stack trace.
>> > > >
>> > > >
>> > > > On Thu, Oct 2, 2014 at 1:10 PM, Mark Chu-Carroll <mchucarroll@apache.org>
>> > > > wrote:
>> > > >
>> > > >> Can someone explain to me why providing an option to show the stack trace
>> > > >> is such a problem?
>> > > >>
>> > > >> Making our debugging easier shouldn't be an excuse for sloppy tooling.
>> > > >> Dumping stacks at users because we didn't get our debugging right shouldn't
>> > > >> be acceptable.
>> > > >>
>> > > >> The specific error here, where we've got a user writing python code in a
>> > > >> config file is a special case: we're invoking a python interpretation
>> > > >> process for the user, and if that crashes, they expect what they'd get by
>> > > >> running the python code manually. But in other places, allowing people to
>> > > >> request extra information as an option seems like a reasonable compromise.
>> > > >>
>> > > >>     -Mark
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >>
>> > > >> On Thu, Oct 2, 2014 at 3:51 PM, Kevin Sweeney <ksweeney@twitter.com.invalid>
>> > > >> wrote:
>> > > >>
>> > > >> > We can do both! I think we should dump a stack trace to the console
>> > > >> > whenever we have an unhandled error, as we're not going to be able to debug
>> > > >> > it otherwise.
>> > > >> >
>> > > >> > We should also strive not to have *any* unhandled errors, but that does not
>> > > >> > mean putting a catch-all exception handler at root, rather it means having
>> > > >> > *specific* error messages for expected error conditions. For example, an
>> > > >> > IOError in a method to read a config file might translate to an error
>> > > >> > message "Unable to read config file: '%s': %s." % (e.filename, e.strerror)
>> > > >> > and a specific exit code. So this might manifest as
>> > > >> >
>> > > >> > % aurora job create devcluster/web/test/webserver typo.aurora
>> > > >> > ERROR: Unable to read config file 'typo.aurora': No such file or directory.
>> > > >> > % echo $?
>> > > >> > 3
>> > > >> >
>> > > >> > If the client code (including the support classes) isn't factored to allow
>> > > >> > exception handling like this, it needs to be refactored.
>> > > >> >
>> > > >> > Also given that the context of this is AURORA-779 I think it's totally
>> > > >> > reasonable to throw a stack trace to someone whose .aurora file raised an
>> > > >> > exception (since they are writing python they should get the tools needed
>> > > >> > to debug python).
>> > > >> >
>> > > >> > On Thu, Oct 2, 2014 at 12:27 PM, Mark Chu-Carroll <mchucarroll@apache.org>
>> > > >> > wrote:
>> > > >> >
>> > > >> > > As we promote clientv2 and deprecate v1, we've come across some issues
>> > > >> > > involving error handling in the v2 client.
>> > > >> > >
>> > > >> > > When there's an unexpected error in clientv1, most of the time, it crashes
>> > > >> > > and dumps its stack. Dumping stack is a lousy user experience, but it
>> > > >> > > provides the stack dump, which users can then include in a bug report.
>> > > >> > >
>> > > >> > > The default behavior in clientv2 doesn't dump stack. Instead, it catches
>> > > >> > > the unknown error, and prints out a concise error message, like:
>> > > >> > >
>> > > >> > > Internal error executing command: 'str' object has no attribute 'err_msg'
>> > > >> > >
>> > > >> > >
>> > > >> > > There's no stack dump, so when we get an error report, it's harder for us
>> > > >> > > to track down the cause of the error.
>> > > >> > >
>> > > >> > > Clientv2 does provide a command-line option, "--reveal-errors", which
>> > > >> > > allows errors to be propagated and eventually result in a stack trace.
>> > > >> > >
>> > > >> > > So: should we allow the client to dump stack on error?
>> > > >> > >
>> > > >> > >     -Mark
>> > > >> > >
>> > > >> >
>> > > >> >
>> > > >> >
>> > > >> > --
>> > > >> > Kevin Sweeney
>> > > >> > @kts
>> > > >> >
>> > > >>
>> > >
>> >
>>
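For anyone skimming the thread: Kevin's point about *specific* messages plus a specific exit code could look roughly like the sketch below. This is not the actual client code. `read_config`, `create_job`, and the constant names are hypothetical, and the value 3 is only borrowed from the `echo $?` example quoted above.

```python
import sys

EXIT_OK = 0
# Hypothetical constant; 3 simply mirrors the exit code in the quoted example.
EXIT_INVALID_CONFIGURATION = 3


def read_config(path):
    """Read a config file, letting IOError propagate to the caller."""
    with open(path) as f:
        return f.read()


def create_job(config_path, err=sys.stderr):
    """Handle the *expected* failure (unreadable config) with a specific message.

    Anything else is deliberately not caught here, so unexpected errors
    still surface instead of being swallowed by a catch-all handler.
    """
    try:
        read_config(config_path)
    except IOError as e:
        err.write("ERROR: Unable to read config file '%s': %s.\n"
                  % (e.filename, e.strerror))
        return EXIT_INVALID_CONFIGURATION
    return EXIT_OK
```

A command-line entry point would then pass the result to `sys.exit()`, which is what makes the code visible to `echo $?`.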

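And a minimal sketch of the "pill" idea John describes, assuming Python 3. All names here (`report_unhandled`, `run_command`, `show_report`, the `log_dir` argument) are hypothetical, and the way the ref is derived is just one possible choice; the thread does not specify one.

```python
import hashlib
import os
import sys
import time
import traceback


def report_unhandled(exc, log_dir):
    """Write the full traceback to a file and return a short reference id."""
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    # Assumed scheme: hash the trace text plus a timestamp, keep 9 hex chars.
    digest = hashlib.sha1((trace + repr(time.time())).encode("utf-8"))
    ref = digest.hexdigest()[:9]
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, ref + ".log"), "w") as f:
        f.write(trace)
    return ref


def run_command(command, log_dir, err=sys.stderr):
    """Run command(); on an unexpected error print one line carrying a ref."""
    try:
        return command()
    except KeyboardInterrupt:
        # Per the thread, an interrupt is not an internal error: no dump.
        # (`except Exception` below would not catch it anyway; this is explicit.)
        raise
    except Exception as e:
        ref = report_unhandled(e, log_dir)
        err.write("[ref:%s] Internal error executing command: %s\n" % (ref, e))
        return 1


def show_report(ref, log_dir):
    """Return the stored traceback for a ref, e.g. to attach to a bug report."""
    with open(os.path.join(log_dir, ref + ".log")) as f:
        return f.read()
```

The user sees only the one-line message, while the traceback is kept on disk and can be fetched later by ref without re-running a possibly non-idempotent command.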