cassandra-commits mailing list archives

From "Kjetil Valstadsve (JIRA)" <>
Subject [jira] Updated: (CASSANDRA-2202) Better error reporting and meaningful toStrings
Date Sun, 20 Feb 2011 15:42:38 GMT


Kjetil Valstadsve updated CASSANDRA-2202:

    Attachment: errorreporting.txt

Various toStrings and exception rethrows.

> Better error reporting and meaningful toStrings
> -----------------------------------------------
>                 Key: CASSANDRA-2202
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.2
>            Reporter: Kjetil Valstadsve
>            Priority: Minor
>              Labels: exception, logging, stacktrace
>         Attachments: errorreporting.txt
> Having been through some trouble with the upgrade to 0.7.2 via 0.7.1, I've learned the
hard way that Cassandra's error reporting can be improved.  In particular, I had issues with
deserialization of bloom filters, and with some corrupt SSTables.
> Chatting with driftx on the IRC channel, I complained about lack of error reporting,
and showed him a stacktrace that indicated a corrupted SSTable. I then added various toString
implementations and some new try/catch/rethrow points in critical control flow, which revealed
the name of the SSTable.  (It turned out to be a hints table, allowing driftx to advise me
on the next step - namely delete the local tables and rely on read repair.) 
> I'm submitting a patch reflecting this work.  It addresses these issues, and is perhaps
best taken as a suggested guideline, maybe as an eye-opener. 
> I generally advocate two practices/good habits wrt. error reporting:
> 1) Provide a good toString, reflecting the important state of an instance.  (In this
case, file names carry valuable information.)
> 2) Trapping and reporting errors is more useful at some points of the control flow than
in others. For instance, at a point in the code where we have gathered all relevant parameters
and are about to embark on major I/O (like reading a table from disk).  At these points, a
try/catch/throw can be used to capture this information.  It's ok to catch Exception here,
if you ask me, as long as you wrap it as the cause of a new exception.  This new exception
should describe the most important parameters involved in the failed call.  To do this, it
should rely on the toStrings from item 1.
> 2a) The same goes for log messages.  Notice that the utility of a toString multiplies,
as it will emit useful information wherever it's used for logging and/or exception messages.
> By following these points, you will get stack traces partitioned up with a chain of "Caused
by" exceptions, each providing some information from its relevant layer in the call stack.
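
The two habits above can be sketched in Java roughly as follows. This is a minimal illustration, not code from the attached patch: `TableFile`, `readRows`, `load`, `describe`, and the file path are all hypothetical names made up for the example, and `readRows` simply simulates a low-level read that fails without any context.

```java
import java.io.EOFException;
import java.io.IOException;

public class ErrorReportingSketch {

    /** Hypothetical stand-in for an on-disk table; not an actual Cassandra class. */
    static final class TableFile {
        private final String path;
        private final long expectedLength;

        TableFile(String path, long expectedLength) {
            this.path = path;
            this.expectedLength = expectedLength;
        }

        // Item 1: a toString that reflects the important state (here, the file name).
        @Override
        public String toString() {
            return "TableFile[path=" + path + ", expectedLength=" + expectedLength + "]";
        }
    }

    /** Simulates a low-level read that fails with a bare, nameless EOFException. */
    static void readRows(TableFile file) throws IOException {
        throw new EOFException();
    }

    /**
     * Item 2: catch at the point where all relevant parameters are known and
     * rethrow with the original exception as the cause, so nothing is lost.
     */
    static void load(TableFile file) {
        try {
            readRows(file);
        } catch (Exception e) {
            throw new RuntimeException("Failed to load " + file, e);
        }
    }

    /** Renders the exception and its causes, one "Caused by" line per layer. */
    static String describe(Throwable t) {
        StringBuilder sb = new StringBuilder(String.valueOf(t));
        for (Throwable c = t.getCause(); c != null; c = c.getCause()) {
            sb.append("\nCaused by: ").append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        try {
            load(new TableFile("/hypothetical/data/Example-1-Data.db", 4096));
        } catch (RuntimeException e) {
            System.out.println(describe(e));
        }
    }
}
```

Run as written, the outer line names the file via the toString, and the "Caused by" line still carries the original EOFException, so the low-level failure and the high-level context appear together.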
> I have applied this approach in the attached patch.  As a result, we were able to get
the file name from a stacktrace that originally just said "EOFException", and the problem
could be easily fixed. 
> Caveats:  Yes, you can get insanely long exception messages, and/or log messages.  To
keep the post-mortem sessions manageable, constant tuning and re-tuning of toStrings and log/exception
messages is needed. However, even a terrifyingly verbose exception won't be _less_ meaningful
to the user than a nameless EOFException, and it will convey more information to the developers.
 (Even if that information boils down to "this toString is way more verbose than it needs
to be".)
> In short: This patch helps with detecting corrupt SSTables and bogus bloom filters, and
I hope it inspires someone to give out more informative error messages as well. 

This message is automatically generated by JIRA.