accumulo-dev mailing list archives

From William Slacum <wilhelm.von.cl...@accumulo.net>
Subject Re: increasing balancing problems to WARN
Date Sat, 19 Apr 2014 04:03:27 GMT
We could consider the use of markers to throw in more metadata about the
relevance of a particular log message.
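
As a rough illustration of the idea, here is a minimal plain-Java sketch. It
does not use the slf4j Marker API; it only models the concept of attaching a
relevance tag to a message independently of its level. Every name in it
(Relevance, TaggedEvent, MonitorFilter) and both sample messages are
hypothetical and not taken from Accumulo.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical relevance tags; a real setup would likely use the
// logging framework's own marker type instead of this enum.
enum Relevance { OPERATOR, DEVELOPER }

// Hypothetical log event: a level name, a message, and relevance tags.
final class TaggedEvent {
    final String level;
    final String message;
    final Set<Relevance> tags;

    TaggedEvent(String level, String message, Relevance first, Relevance... rest) {
        this.level = level;
        this.message = message;
        this.tags = EnumSet.of(first, rest);
    }
}

final class MonitorFilter {
    // Keep only events tagged for operators, regardless of their level.
    static List<TaggedEvent> forOperators(List<TaggedEvent> events) {
        List<TaggedEvent> shown = new ArrayList<>();
        for (TaggedEvent e : events) {
            if (e.tags.contains(Relevance.OPERATOR)) {
                shown.add(e);
            }
        }
        return shown;
    }
}
```

With something along these lines, a balancing message could stay at DEBUG
while still being tagged as operator-relevant, and the display side would
decide what to surface.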


On Fri, Apr 18, 2014 at 10:46 PM, Sean Busbey <busbey@cloudera.com> wrote:

> I also try to limit what goes at higher warning levels.  One of my goals
> over the next few months is to improve our current logging. It sounds like
> this is a good time to make sure we're on the same page.
>
> We're going to have to train users on something (especially since our current
> logging is very noisy). The short version I like is "Info and more severe
> are for operators; info and less severe are for developers."
>
> Here's what I usually use as a guideline (constrained to slf4j levels):
>
>
> = ERROR
>
> Something is wrong and an operator needs to do something, preferably very
> soon. In other words, if I was on call I'd expect to get paged.
>
> = WARN
>
> Something is amiss, but not of immediate concern. An operator who is on
> call but not busy at the moment might want to investigate some kind of
> underlying issue, but the system will continue to function within some
> reasonable bound.
>
> = INFO
>
> Summary information about normal operations that is safe to ignore. GC
> information, throughput stats, that kind of thing.
>
> = DEBUG
>
> Low level information that is not normally useful, but will help determine
> the cause of a system malfunction. Usually something a developer or tier 3
> supporter would want when something was going wrong (e.g. stack traces).
>
> = TRACE
>
> Detailed low level information at a volume that probably can't be gathered
> in production.
>
>
> Eric, do those all sound reasonable? I want to make sure we have a common
> basis before I get into the specifics of this case.
>
> -Sean
>
> On Fri, Apr 18, 2014 at 8:21 PM, Eric Newton <eric.newton@gmail.com> wrote:
>
> > -1
> >
> > I would hesitate to put *any* message at WARN. It is normal for balancing
> > to take a little while, especially for some of my users who have their own
> > balancing algorithm.
> >
> > Users feel the need to fix the problem; after all, it's there in big scary
> > yellow on the monitor page.  I don't like training users to ignore scary
> > yellow.  Is it a problem, or not?
> >
> > Alternatively, put the balance info into the master status, and display it.
> > Like GC collection time... hey, I've been migrating these tablets for a
> > long time... turn yellow/red.
> >
> > -Eric
> >
> >
> >
> >
> > On Fri, Apr 18, 2014 at 4:03 PM, Sean Busbey <busbey@cloudera.com> wrote:
> >
> > > At the moment all of our logs about problems balancing are at DEBUG.
> > >
> > > Given the impact to a cluster when this happens (skewing load onto few
> > > servers, in some cases severely), I'd like to raise it to WARN so that it
> > > surfaces for operators in the Monitor and in the non-debug log.
> > >
> > > Thought I'd do a quick lazy consensus check before filing a jira and taking
> > > care of it.
> > >
> > > --
> > > Sean
> > >
> >
>
>
>
> --
> Sean
>

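Eric's alternative in the quoted thread (track how long tablets have been
migrating in the master status and turn the display yellow or red, much like
GC collection time) could be sketched roughly as below. This is only an
illustration in plain Java: BalanceStatus, BalanceHealth, and the two
thresholds are hypothetical and do not reflect Accumulo's actual master or
monitor code.

```java
import java.time.Duration;

// Display states for a monitor page; the names are illustrative only.
enum BalanceStatus { OK, YELLOW, RED }

final class BalanceHealth {
    private final Duration yellowAfter;
    private final Duration redAfter;

    // Both thresholds are assumptions; sensible values would depend on the
    // cluster and on the balancing algorithm in use.
    BalanceHealth(Duration yellowAfter, Duration redAfter) {
        if (yellowAfter.compareTo(redAfter) > 0) {
            throw new IllegalArgumentException("yellowAfter must not exceed redAfter");
        }
        this.yellowAfter = yellowAfter;
        this.redAfter = redAfter;
    }

    // Classify how long tablets have been migrating without settling.
    BalanceStatus classify(Duration migratingFor) {
        if (migratingFor.compareTo(redAfter) >= 0) {
            return BalanceStatus.RED;
        }
        if (migratingFor.compareTo(yellowAfter) >= 0) {
            return BalanceStatus.YELLOW;
        }
        return BalanceStatus.OK;
    }
}
```

The point of this shape is that a short, normal balancing pass never shows
up as a warning; only a migration that has dragged on past a configured
bound changes the displayed status.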