accumulo-dev mailing list archives

From Mike Drob <mad...@cloudera.com>
Subject Re: increasing balancing problems to WARN
Date Mon, 21 Apr 2014 13:42:45 GMT
Can you elaborate a bit more on this, Bill?


On Sat, Apr 19, 2014 at 12:03 AM, William Slacum <wilhelm.von.cloud@accumulo.net> wrote:

> We could consider the use of markers to throw in more metadata about the
> relevance of a particular log message.
>
>
> On Fri, Apr 18, 2014 at 10:46 PM, Sean Busbey <busbey@cloudera.com> wrote:
>
> > I also try to limit what goes at higher warning levels.  One of my goals
> > over the next few months is to improve our current logging. It sounds like
> > this is a good time to make sure we're on the same page.
> >
> > We're going to have to train users on something (esp since our current
> > logging is very noisy). The short version I like is "Info and more severe
> > are for operators; info and less severe are for developers."
> >
> > Here's what I usually use as a guideline (constrained to slf4j levels):
> >
> >
> > = ERROR
> >
> > Something is wrong and an operator needs to do something, preferably very
> > soon. In other words, if I was on call I'd expect to get paged.
> >
> > = WARN
> >
> > Something is amiss, but not of immediate concern. An operator who is on
> > call but not busy at the moment might want to investigate some kind of
> > underlying issue, but the system will continue to function within some
> > reasonable bound.
> >
> > = INFO
> >
> > Summary information about normal operations that is safe to ignore. GC
> > information, throughput stats, that kind of thing.
> >
> > = DEBUG
> >
> > Low level information that is not normally useful, but will help determine
> > the cause of a system malfunction. Usually something a developer or tier 3
> > supporter would want when something was going wrong (e.g. stack traces).
> >
> > = TRACE
> >
> > Detailed low level information at a volume that probably can't be gathered
> > in production.
> >
> >
> > Eric, do those all sound reasonable? I want to make sure we have a common
> > basis before I get into the specifics of this case.
> >
> > -Sean
> >
> > On Fri, Apr 18, 2014 at 8:21 PM, Eric Newton <eric.newton@gmail.com> wrote:
> >
> > > -1
> > >
> > > I would hesitate to put *any* message at WARN. It is normal for balancing
> > > to take a little while, especially for some of my users who have their own
> > > balancing algorithm.
> > >
> > > Users feel the need to fix the problem; after all, it's there in big scary
> > > yellow on the monitor page.   I don't like training users to ignore scary
> > > yellow.  Is it a problem, or not?
> > >
> > > Alternatively, put the balance info into the master status, and display it.
> > > Like GC collection time... hey, I've been migrating these tablets for a
> > > long time... turn yellow/red.
> > >
> > > -Eric
> > >
> > >
> > >
> > >
> > > On Fri, Apr 18, 2014 at 4:03 PM, Sean Busbey <busbey@cloudera.com> wrote:
> > >
> > > > At the moment all of our logs about problems balancing are at DEBUG.
> > > >
> > > > Given the impact to a cluster when this happens (skewing load onto a few
> > > > servers, in some cases severely), I'd like to raise it to WARN so that it
> > > > surfaces for operators in the Monitor and in the non-debug log.
> > > >
> > > > Thought I'd do a quick lazy consensus check before filing a jira and taking
> > > > care of it.
> > > >
> > > > --
> > > > Sean
> > > >
> > >
> >
> >
> >
> > --
> > Sean
> >
>

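Eric's alternative in the quoted thread (put the balance info into the master status and turn it yellow/red once tablets have been migrating for a long time) could be sketched roughly as below. This is only an illustration of the idea, not Accumulo code: the class name BalanceStatus, the method classify, and both thresholds are hypothetical and chosen purely for the example.

```java
import java.time.Duration;

/**
 * Hypothetical sketch: derive a monitor color from how long the oldest
 * pending tablet migration has been outstanding. Not part of Accumulo.
 */
public class BalanceStatus {

    public enum Color { GREEN, YELLOW, RED }

    // Assumed thresholds, for illustration only; a real deployment would
    // want these configurable, since custom balancers may legitimately be slow.
    static final Duration YELLOW_AFTER = Duration.ofMinutes(5);
    static final Duration RED_AFTER = Duration.ofMinutes(30);

    /**
     * @param pendingMigrations  number of tablets still waiting to migrate
     * @param oldestMigrationAge how long the oldest pending migration has waited
     */
    public static Color classify(int pendingMigrations, Duration oldestMigrationAge) {
        if (pendingMigrations == 0) {
            return Color.GREEN; // nothing outstanding, so age is irrelevant
        }
        if (oldestMigrationAge.compareTo(RED_AFTER) >= 0) {
            return Color.RED;
        }
        if (oldestMigrationAge.compareTo(YELLOW_AFTER) >= 0) {
            return Color.YELLOW;
        }
        return Color.GREEN; // balancing is in progress but still within normal bounds
    }

    public static void main(String[] args) {
        System.out.println(classify(12, Duration.ofMinutes(2)));
        System.out.println(classify(12, Duration.ofMinutes(10)));
        System.out.println(classify(12, Duration.ofHours(1)));
    }
}
```

The point of the sketch is that a short-lived imbalance stays green, so operators are not trained to ignore yellow, while a long-running one still surfaces without relying on log level.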