hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Solicitation for logging/debugging requirements
Date Mon, 29 Mar 2010 21:50:05 GMT
Take a look at the logging page in the docs:
http://hadoop.apache.org/zookeeper/docs/current/zookeeperInternals.html#sc_logging

Some good guidelines in there. Basically, we log things at INFO level
that are interesting/informational but not logged so frequently that
they fill the log. WARN is for things that are bad but that we can
handle (like network connectivity failure). ERROR is generally for
things we don't expect and are unlikely to be able to handle. FATAL
means really bad: we shut down the server. Many end users log only at
WARN level or higher in production, so typically we err on the side of
WARN for issues (so that we have a shot at debugging after the fact).
Over time, as we gain confidence in production environments, we've been
pushing more things that were WARN down to INFO.
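
As a rough illustration of the guidelines above, here is a minimal Java
sketch. The class, the event names, and the mapping are hypothetical and
are not ZooKeeper code; it only shows how the four levels relate to a
production threshold of WARN.

```java
// Illustrative only: every name here is hypothetical, not part of ZooKeeper.
final class LogLevelSketch {

    // The levels named in the message, ordered from least to most severe.
    enum Level { INFO, WARN, ERROR, FATAL }

    // Made-up event categories, one per guideline.
    enum Event {
        SESSION_ESTABLISHED,   // interesting, but infrequent
        CONNECTION_LOST,       // bad, but something we can handle
        UNEXPECTED_EXCEPTION,  // not expected, unlikely to be handled
        UNRECOVERABLE_STATE    // really bad, the server shuts down
    }

    // Map each event category to the level the guidelines suggest.
    static Level levelFor(Event event) {
        switch (event) {
            case SESSION_ESTABLISHED: return Level.INFO;
            case CONNECTION_LOST: return Level.WARN;
            case UNEXPECTED_EXCEPTION: return Level.ERROR;
            default: return Level.FATAL;
        }
    }

    // A message is kept when its level is at or above the threshold.
    static boolean isEmitted(Level level, Level threshold) {
        return level.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Level production = Level.WARN; // assumed production setting
        for (Event event : Event.values()) {
            Level level = levelFor(event);
            String fate = isEmitted(level, production) ? "kept" : "dropped";
            System.out.println(event + " -> " + level + " (" + fate + ")");
        }
    }
}
```

With the threshold at WARN, only the INFO-level event is dropped, which
is why anything needed for after-the-fact debugging tends to go to WARN.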

I fixed a number of JIRAs for 3.3 related to logging. In particular, I
cleaned up the client session logging significantly. The most fertile
area right now for cleaning up logging is the quorum code. That code in
particular has issues with providing sufficient information to debug
error conditions. You can easily see this by starting an ensemble of
more than one machine and killing one or more of the servers. There
are many places where the logging is insufficient (e.g. "got vote", which
doesn't say what the vote was or what the effect of such a vote is,
etc.). Having improved logging in this area would really help.
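
To make the "got vote" point concrete, here is a hedged Java sketch of
what a more informative message might contain. The method, its
parameters, and the message wording are assumptions for illustration
only; they are not the actual quorum code or its fields.

```java
// Illustrative only: a hypothetical helper, not the actual quorum code.
final class VoteLogSketch {

    // Build a message that says what the vote was and what effect it had,
    // rather than just "got vote". All parameters are assumed names.
    static String describeVote(long fromServer, long proposedLeader,
                               long proposedZxid, long currentChoice,
                               boolean adopted) {
        String effect = adopted
                ? "adopting it (previous choice was " + currentChoice + ")"
                : "keeping current choice " + currentChoice;
        return String.format(
                "Received vote from server %d: proposed leader=%d, zxid=0x%x; %s",
                fromServer, proposedLeader, proposedZxid, effect);
    }

    public static void main(String[] args) {
        // Sample values are invented for the demonstration.
        System.out.println(describeVote(2, 3, 0x100000005L, 1, true));
        System.out.println(describeVote(2, 1, 0x100000002L, 3, false));
    }
}
```

The idea is simply that each message carries the sender, the content of
the vote, and the resulting decision, so the log can be read without
the source code open.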

For further insight, try searching the open/closed issues in JIRA
https://issues.apache.org/jira/browse/ZOOKEEPER
for "log4j", "logging", or "log".

Patrick

Benjamin Reed wrote:
> awesome! that would be great ivan. i'm sure pat has some more concrete 
> suggestions, but one simple thing to do is to run the unit tests and 
> look at the log messages that get output. there are a couple of 
> categories of things that need to be fixed (this is in no way exhaustive):
> 
> 1) messages that have useful information, but only if you look in the 
> code to figure out what it means. there are some leader election 
> messages that fall into this category. it would be nice to clarify them.
> 2) there are error messages that really aren't errors. when shutting 
> down there are a bunch of errors that are expected, but still logged, 
> for example.
> 3) misclassified error levels
> 
> welcome aboard!
> 
> ben
> 
> On 03/29/2010 10:07 AM, Ivan Kelly wrote:
>> Hi,
>>
>> I'm going to be using Zookeeper quite extensively for a project in a
>> few weeks, but development hasn't kicked off yet. This means I have
>> some time on my hands and I'd like to get familiar with zookeeper
>> beforehand by perhaps writing some tools to make debugging problems
>> with it easier so as to save myself some time in the future. Problem
>> is I haven't had to debug many zookeeper problems yet, so I don't know
>> where the pain points are.
>>
>> So, without further ado,
>>     - Are there any places that logging is deficient that sorely needs
>> improvement?
>>     - Could current logs be improved any amount or presented in a more
>> readable fashion?
>>     - Would some form of log visualisation be useful (for example in
>> something approximating a sequence diagram)?
>>
>> Feel free to suggest anything which the list above doesn't allude to
>> which you think would be helpful.
>>
>> Cheers,
>> Ivan
