zookeeper-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Hunt <ph...@apache.org>
Subject Re: [SUGGESTION] JvmPauseMonitor in ZooKeeper
Date Wed, 09 May 2018 17:56:21 GMT
Do you know why they did this rather than just enabling GC logging by
default? Why re-invent the wheel?

I seem to remember seeing a push do enable GC logging by default a few
years ago. In particular around the time when the JVM added GC log rolling
as a feature. Here's an example:
https://batmat.net/2016/10/17/always-enable-gc-logs-and-how-to-enable-logs-rotation-with-hotspot/
My understanding is that the overhead is so low that it's feasible to do
this.

Good improvement though regardless which way we go.

Regards,

Patrick

On Wed, May 9, 2018 at 9:36 AM, Andor Molnar <andor@cloudera.com> wrote:

> +1 cool!
>
>
> On Wed, May 9, 2018 at 7:59 AM, Norbert Kalmar <nkalmar@cloudera.com>
> wrote:
>
> > Okay, thanks Ed, I created the Jira, will look into it soon :)
> > https://issues.apache.org/jira/browse/ZOOKEEPER-3037
> >
> > Regards,
> > Norbert
> >
> > On Wed, May 9, 2018 at 4:44 PM Edward Ribeiro <edward.ribeiro@gmail.com>
> > wrote:
> >
> > > +1. Sounds really nice to have feature. Let's open a ticket and open a
> > PR.
> > > :)
> > >
> > > Ed
> > >
> > > Em qua, 9 de mai de 2018 11:15, Norbert Kalmar <nkalmar@cloudera.com>
> > > escreveu:
> > >
> > > > Hi,
> > > >
> > > > I just got a tip that we could improve on the logging in ZooKeeper.
> > > After a
> > > > ZK crash, or client timeout sometimes it's hard to determine from the
> > > logs
> > > > what happened. Knowing if ZK was responsive at the time would help a
> > lot.
> > > > For example, ZK might spend a lot of time waiting on GC (there is
> still
> > > > some misconception that ZK is a storage).
> > > >
> > > > To help detect this, HADOOP already has a great tool called JVM Pause
> > > > Monitor. (As the name suggest, it can be also used for monitoring,
> but
> > it
> > > > also helps post-mortem in a lot of cases). Basically it has a daemon
> > that
> > > > sleeps for one second, and if the sleep time exceeds the 1s by more
> > than
> > > > the threshold (1s: INFO, 10s: WARN by default - this can be
> > configurable
> > > in
> > > > our case, see below), it will alert/make a log entry. It can also
> > monitor
> > > > the time GC took.
> > > >
> > > > Now, this class is in the HADOOP-common. I wouldn't want to depend on
> > > > Hadoop-common because of this one feature/class (it is actually a
> > single
> > > > class). Since this is a straightforward implementation, and in the
> past
> > > > five years the few commits it had is nothing really serious, I think
> we
> > > > could just copy this class in ZooKeeper, and introduce it as a
> > > configurable
> > > > feature, by default it can be off.
> > > >
> > > > The class:
> > > >
> > > >
> > > https://github.com/apache/hadoop/blob/trunk/hadoop-
> > common-project/hadoop-common/src/main/java/org/apache/
> > hadoop/util/JvmPauseMonitor.java
> > > >
> > > > What do You think?
> > > >
> > > > Regards,
> > > > Norbert
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message