zookeeper-dev mailing list archives

From Andor Molnar <an...@cloudera.com>
Subject Re: [SUGGESTION] JvmPauseMonitor in ZooKeeper
Date Wed, 09 May 2018 16:36:23 GMT
+1 cool!


On Wed, May 9, 2018 at 7:59 AM, Norbert Kalmar <nkalmar@cloudera.com> wrote:

> Okay, thanks Ed, I created the Jira, will look into it soon :)
> https://issues.apache.org/jira/browse/ZOOKEEPER-3037
>
> Regards,
> Norbert
>
> On Wed, May 9, 2018 at 4:44 PM Edward Ribeiro <edward.ribeiro@gmail.com>
> wrote:
>
> > +1. Sounds like a really nice feature to have. Let's open a ticket and
> > a PR. :)
> >
> > Ed
> >
> > On Wed, May 9, 2018 at 11:15, Norbert Kalmar <nkalmar@cloudera.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I just got a tip that we could improve the logging in ZooKeeper. After
> > > a ZK crash or a client timeout, it is sometimes hard to determine from
> > > the logs what happened. Knowing whether ZK was responsive at the time
> > > would help a lot. For example, ZK might spend a lot of time waiting on
> > > GC (there is still a misconception that ZK is a storage system).
> > >
> > > To help detect this, Hadoop already has a great tool called JVM Pause
> > > Monitor. (As the name suggests, it can also be used for monitoring, but
> > > it helps with post-mortem analysis in a lot of cases too.) Basically, it
> > > has a daemon thread that sleeps for one second, and if the sleep
> > > overruns the 1s by more than a threshold (1s: INFO, 10s: WARN by
> > > default; this can be configurable in our case, see below), it alerts
> > > by writing a log entry. It can also monitor the time GC took.
> > >
> > > Now, this class is in hadoop-common. I wouldn't want to depend on
> > > hadoop-common because of this one feature (it is actually a single
> > > class). Since this is a straightforward implementation, and the few
> > > commits it has had in the past five years are nothing really serious,
> > > I think we could just copy this class into ZooKeeper and introduce it
> > > as a configurable feature that is off by default.
> > >
> > > The class:
> > >
> > >
> > > https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/JvmPauseMonitor.java
> > >
> > > What do you think?
> > >
> > > Regards,
> > > Norbert
> > >
> >
>
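
For readers following the thread, the mechanism Norbert describes (a daemon thread that sleeps for a fixed interval and logs when the sleep overruns by more than a threshold) could be sketched roughly as below. This is a minimal illustration only, not the Hadoop class and not what ZOOKEEPER-3037 ended up implementing. The class name `PauseMonitorSketch`, the `Level` enum, the `classify` helper and the `sink` callback are all hypothetical names invented for this example; the 1s sleep and the 1s/10s thresholds are the defaults quoted in the message. GC time tracking is left out.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Hypothetical sketch of a sleep-overrun pause detector; not the Hadoop or ZooKeeper code. */
final class PauseMonitorSketch implements Runnable {

    /** Severity of a detected pause (names are assumptions for this example). */
    enum Level { NONE, INFO, WARN }

    private final long sleepMs;
    private final long infoThresholdMs;
    private final long warnThresholdMs;
    private final BiConsumer<Level, Long> sink; // receives (level, extra sleep in ms)

    PauseMonitorSketch(long sleepMs, long infoThresholdMs, long warnThresholdMs,
                       BiConsumer<Level, Long> sink) {
        this.sleepMs = sleepMs;
        this.infoThresholdMs = infoThresholdMs;
        this.warnThresholdMs = warnThresholdMs;
        this.sink = sink;
    }

    /** Maps the time slept beyond the requested interval to a log level. */
    static Level classify(long extraMs, long infoThresholdMs, long warnThresholdMs) {
        if (extraMs > warnThresholdMs) {
            return Level.WARN;
        }
        if (extraMs > infoThresholdMs) {
            return Level.INFO;
        }
        return Level.NONE;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            long start = System.nanoTime();
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop
                return;
            }
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            // If the JVM (or the host) was stalled, the sleep takes longer than requested.
            long extraMs = elapsedMs - sleepMs;
            Level level = classify(extraMs, infoThresholdMs, warnThresholdMs);
            if (level != Level.NONE) {
                sink.accept(level, extraMs);
            }
        }
    }

    /** Starts the monitor on a daemon thread so it never blocks JVM shutdown. */
    static Thread startDaemon(PauseMonitorSketch monitor) {
        Thread t = new Thread(monitor, "pause-monitor-sketch");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

With the defaults from the message, this would be started as `PauseMonitorSketch.startDaemon(new PauseMonitorSketch(1000, 1000, 10000, (level, ms) -> System.err.println(level + ": pause of approximately " + ms + "ms")))`. Passing the thresholds through the constructor mirrors the suggestion that they be configurable.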
