accumulo-dev mailing list archives

From Eric Newton <eric.new...@gmail.com>
Subject Re: Review Request 27654: Add introspection of long running assignments
Date Fri, 07 Nov 2014 00:29:29 GMT
It would be nice to model "Danger!" messages with "All Clear!" directly.

I'll make a ticket.
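
A minimal sketch of what pairing a "Danger!" message with an "All Clear!" could look like, combined with the backoff idea from the thread below. This is not Accumulo code: the class name `StuckAssignmentReporter`, its methods, and the message text are all hypothetical, and the clock is passed in as a plain millisecond value so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of Accumulo: pairs "stuck" warnings with an "all clear". */
public class StuckAssignmentReporter {
  /** Per-tablet record of when the next warning is allowed and the current backoff. */
  private static final class State {
    final long nextWarnAt;
    final long backoff;

    State(long nextWarnAt, long backoff) {
      this.nextWarnAt = nextWarnAt;
      this.backoff = backoff;
    }
  }

  private final long warnAfterMillis;
  private final long maxBackoffMillis;
  private final Map<String, State> warned = new HashMap<>();

  public StuckAssignmentReporter(long warnAfterMillis, long maxBackoffMillis) {
    this.warnAfterMillis = warnAfterMillis;
    this.maxBackoffMillis = maxBackoffMillis;
  }

  /** Called by the periodic check for each assignment that is still running. */
  public synchronized Optional<String> check(String tablet, long startedAt, long now) {
    long elapsed = now - startedAt;
    if (elapsed < warnAfterMillis) {
      return Optional.empty(); // not considered stuck yet
    }
    State previous = warned.get(tablet);
    if (previous != null && now < previous.nextWarnAt) {
      return Optional.empty(); // already warned recently, stay quiet
    }
    // The first gap equals the warning threshold; each later gap doubles, up to the cap.
    long backoff = previous == null
        ? Math.min(warnAfterMillis, maxBackoffMillis)
        : Math.min(previous.backoff * 2, maxBackoffMillis);
    warned.put(tablet, new State(now + backoff, backoff));
    return Optional.of("Assignment for " + tablet + " has been running for " + elapsed + " ms");
  }

  /** Called when an assignment finishes; reports only if a warning was emitted earlier. */
  public synchronized Optional<String> completed(String tablet) {
    if (warned.remove(tablet) == null) {
      return Optional.empty();
    }
    return Optional.of("Assignment for " + tablet + " is no longer stuck");
  }
}
```

The caller would log whatever `check` returns at warning level and whatever `completed` returns at info level, so every warning in the log eventually has a matching all-clear.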

On Thu, Nov 6, 2014 at 3:47 PM, Josh Elser <josh.elser@gmail.com> wrote:

>
>
> > On Nov. 6, 2014, 5:47 p.m., kturner wrote:
> > > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java, line 250
> > > <https://reviews.apache.org/r/27654/diff/3/?file=751140#file751140line250>
> > >
> > >     The compaction code remembers when it logged an exception and
> > >     does not do it again. It also logs a message if the compaction
> > >     becomes unstuck. An advantage I thought of with repeatedly
> > >     logging is that you could see the stack trace changing (or not).
> > >
> > >     The stack trace is a possible trace. By the time logging happens,
> > >     the assignment could have completed and the thread could have
> > >     moved on to other things.
> >
> > Josh Elser wrote:
> >     Yeah, since these are running fairly regularly (on the order of
> >     seconds), a stuck assignment could get really spammy. As you point
> >     out, there could be value in printing the stack more than once.
> >     Maybe I could add some backoff so that it only warns every so often?
> >
> >     bq. By the time logging happens, the assignment could have
> >     completed and the thread could have moved on to other things.
> >
> >     Do you think the message should be updated to be more clear about
> >     this? A "Maybe you should look into this" type of message?
> >
> > kturner wrote:
> >     > a stuck assignment could get really spammy
> >
> >     I think that spam is probably OK as long as the default is high
> >     enough that when it does happen, it's something to be concerned
> >     about. Could make the timer check a little less frequently.
> >
> >     > Do you think the message should be updated to be more clear
> >     > about this?
> >
> >     I think the compaction code just says it's a possible stack trace.
> >     I suppose a good solution would be to have error codes; then the
> >     user can look up the error code and get the nitty-gritty details.
> >     Can't really put too much info in a log message.
> >
> > Josh Elser wrote:
> >     bq. Could make the timer check a little less frequently.
> >
> >     As long as we have a long threshold for warning about a stuck
> >     assignment, we can easily make a longer period on the timer. The
> >     timer period dictates the minimum stuck assignment time -- I can
> >     update the description with a clarification.
> >
> > kturner wrote:
> >     I was thinking that once an assignment is considered stuck, each
> >     time the timer kicks off a check (I think it's either 5 secs or
> >     1 sec, not sure) it will produce spam. Was thinking the period
> >     could be increased to produce less spam. The period of the timer
> >     could be a function of tserver.assignment.duration.warning, like
> >     1/4 or 1/2.
>
> bq. The period of the timer could be a function of
> tserver.assignment.duration.warning, like 1/4 or 1/2.
>
> That would work, unless the user changed the value of the duration
> warning. It would still fire at the old period (unless I'm much trickier
> about scheduling the task to run).
>
> Regardless, I need to think some more about preventing spam.
>
>
> - Josh
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27654/#review60185
> -----------------------------------------------------------
>
>
> On Nov. 6, 2014, 12:58 a.m., Josh Elser wrote:
> >
> > -----------------------------------------------------------
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/27654/
> > -----------------------------------------------------------
> >
> > (Updated Nov. 6, 2014, 12:58 a.m.)
> >
> >
> > Review request for accumulo.
> >
> >
> > Bugs: ACCUMULO-3304
> >     https://issues.apache.org/jira/browse/ACCUMULO-3304
> >
> >
> > Repository: accumulo
> >
> >
> > Description
> > -------
> >
> > Watches assignments and reports when an assignment is running for longer
> > than a configured time.
> >
> >
> > Diffs
> > -----
> >
> >   core/src/main/java/org/apache/accumulo/core/conf/Property.java 56f3d9c
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/ActiveAssignmentRunnable.java PRE-CREATION
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/RunnableStartedAt.java PRE-CREATION
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 94be0bb
> >   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServerResourceManager.java 935ffeb
> >
> > Diff: https://reviews.apache.org/r/27654/diff/
> >
> >
> > Testing
> > -------
> >
> > Very minimal.
> >
> >
> > Thanks,
> >
> > Josh Elser
> >
> >
>
>
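
One way to be "trickier about scheduling", as discussed in the thread, is to have the check reschedule itself after every run and re-read the threshold each time, so a changed tserver.assignment.duration.warning takes effect on the next cycle. The sketch below is an assumption, not the patch under review: `SelfReschedulingCheck` and `periodFor` are hypothetical names, the `LongSupplier` stands in for however the configuration would actually be read, and the 1/2 factor is just one of the fractions suggested above.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a periodic check whose period follows a live config value. */
public class SelfReschedulingCheck implements Runnable {
  private final ScheduledExecutorService executor;
  private final LongSupplier warnThresholdMillis; // stand-in for reading the property
  private final Runnable check;

  public SelfReschedulingCheck(ScheduledExecutorService executor,
      LongSupplier warnThresholdMillis, Runnable check) {
    this.executor = executor;
    this.warnThresholdMillis = warnThresholdMillis;
    this.check = check;
  }

  /** Period is half the current warning threshold, never less than 1 ms. */
  public static long periodFor(long thresholdMillis) {
    return Math.max(1L, thresholdMillis / 2);
  }

  /** Schedules the first run; later runs schedule themselves. */
  public void start() {
    scheduleNext();
  }

  private void scheduleNext() {
    try {
      // The threshold is re-read here, so a config change applies to the next run.
      executor.schedule(this, periodFor(warnThresholdMillis.getAsLong()), TimeUnit.MILLISECONDS);
    } catch (RejectedExecutionException e) {
      // The executor was shut down; stop rescheduling.
    }
  }

  @Override
  public void run() {
    try {
      check.run();
    } finally {
      scheduleNext();
    }
  }
}
```

Compared with `scheduleAtFixedRate`, this trades a fixed period for one that is recomputed on every run, at the cost of slight drift because each delay starts when the previous check finishes.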
