ignite-dev mailing list archives

From Dmitry Pavlov <dpavlov....@gmail.com>
Subject Re: Facility to detect long STW pauses and other system response degradations
Date Mon, 20 Nov 2017 14:30:07 GMT
Yes, we need some timestamp from Java code. But I think a JVM thread could
update the timestamp with delays not related to GC, which would have the
same effect as with IgniteUtils#currentTimeMillis().

Does this new test compare the difference between Java timestamps with the
GC logs?
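
For reference, the timestamp-difference approach debated in this thread can
be sketched as follows. This is only a minimal illustration under stated
assumptions: the class PauseDetector, its methods, and the threshold
parameter are hypothetical and are not taken from Ignite or from IGNITE-6171.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; names are illustrative, not Ignite API. */
public class PauseDetector implements Runnable {
    private final long intervalMs;   // N: how often the thread wakes up
    private final long thresholdMs;  // smallest overshoot counted as a pause
    private final AtomicLong maxPauseMs = new AtomicLong();
    private final AtomicLong totalPauseMs = new AtomicLong();

    public PauseDetector(long intervalMs, long thresholdMs) {
        this.intervalMs = intervalMs;
        this.thresholdMs = thresholdMs;
    }

    /**
     * Records the time elapsed between two consecutive wake-ups.
     * Returns the detected pause in ms, or 0 if it is below the threshold.
     */
    public long record(long elapsedMs) {
        long pause = elapsedMs - intervalMs;
        if (pause < thresholdMs)
            return 0;
        totalPauseMs.addAndGet(pause);
        maxPauseMs.accumulateAndGet(pause, Math::max);
        return pause;
    }

    @Override public void run() {
        // nanoTime is monotonic, so wall-clock adjustments do not look like pauses.
        long last = System.nanoTime();
        while (!Thread.currentThread().isInterrupted()) {
            try {
                Thread.sleep(intervalMs);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            long now = System.nanoTime();
            record(TimeUnit.NANOSECONDS.toMillis(now - last));
            last = now;
        }
    }

    /** Longest pause seen so far, in ms. Could be exposed through JMX. */
    public long maxPauseMs() { return maxPauseMs.get(); }

    /** Sum of all pauses seen so far, in ms. Could be exposed through JMX. */
    public long totalPauseMs() { return totalPauseMs.get(); }
}
```

Such a detector would typically run in a daemon thread. Note that it reports
any delay in waking the thread (scheduler latency, swapping, a long
TIMED_WAITING), not only GC pauses, so its numbers are an upper bound on STW
time rather than a GC measurement.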

On Mon, 20 Nov 2017 at 16:39, Anton Vinogradov <avinogradov@gridgain.com> wrote:

> Dmitriy,
>
> > A sleeping Java thread IMO is not an option, because the thread can be
> > in TIMED_WAITING longer than the timeout.
>
> That's the only idea we have, and, according to tests, it works!
>
> > Did I understand correctly that a native thread is proposed? And is our
> > goal now to select the best framework for this?
>
> That's one of the possible cases.
> We can replace the native thread with another JVM; this should solve
> compatibility issues.
>
>
> On Mon, Nov 20, 2017 at 4:24 PM, Dmitry Pavlov <dpavlov.spb@gmail.com>
> wrote:
>
> > A sleeping Java thread IMO is not an option, because the thread can be
> > in TIMED_WAITING longer than the timeout.
> >
> > Did I understand correctly that a native thread is proposed? And is our
> > goal now to select the best framework for this?
> >
> > Can we limit this feature to several popular OSes (Windows, Linux) and
> > not implement it for all operating systems?
> >
> >
> > On Mon, 20 Nov 2017 at 14:55, Anton Vinogradov
> > <avinogradov@gridgain.com> wrote:
> >
> > > Igniters,
> > >
> > > Since no one rejected the proposal, let's start with part one.
> > >
> > > > I propose to add a special thread that will record current time
> > > > every N milliseconds and check the difference with the latest
> > > > recorded value.
> > > > The maximum and total pause values for a certain period can be
> > > > published in the special metrics available through JMX.
> > >
> > > On Fri, Nov 17, 2017 at 4:08 PM, Dmitriy_Sorokin
> > > <sbt.sorokin.dvl@gmail.com> wrote:
> > >
> > > > Hi, Igniters!
> > > >
> > > > This discussion thread is related to
> > > > https://issues.apache.org/jira/browse/IGNITE-6171.
> > > >
> > > > Currently there are no JVM performance monitoring tools in Apache
> > > > Ignite, for example to measure the impact of GC (e.g. STW pauses)
> > > > on the operation of the node. I think we should add this
> > > > functionality.
> > > >
> > > > 1) It is useful to know that the STW duration has increased, or
> > > > that some other situation is leading to similar consequences.
> > > > This will allow system administrators to solve issues before they
> > > > become problems.
> > > >
> > > > I propose to add a special thread that will record current time
> > > > every N milliseconds and check the difference with the latest
> > > > recorded value.
> > > > The maximum and total pause values for a certain period can be
> > > > published in the special metrics available through JMX.
> > > >
> > > > 2) If the pause reaches a critical value, we need to stop the node
> > > > without waiting for the end of the pause.
> > > >
> > > > The thread (from the first part of the proposed solution) is able to
> > > > estimate the pause duration, but only after its completion.
> > > > So, we need an external thread (in another JVM or native) that is
> > > > able to recognize that the pause duration has passed the critical
> > > > mark.
> > > >
> > > > We can estimate the duration of an STW (or similar) pause by
> > > >  a) reading the value updated by the first thread, somehow (e.g. via
> > > >     JMX, shmem or a shared file)
> > > >  or
> > > >  b) using JVM diagnostic tools. Does anybody know of cross-platform
> > > >     solutions?
> > > >
> > > > Feel free to suggest ideas or tips, especially about the second
> > > > part of the proposal.
> > > >
> > > > Thoughts?
> > > >
> > > >
> > > >
> > > > --
> > > > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > >
> > >
> >
>
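
Part 2 of the quoted proposal, option a) with a shared file, might look
roughly like the sketch below. Everything here is an assumption made for
illustration: the class HeartbeatFile, its methods, and the file format (a
single decimal timestamp) are hypothetical and do not describe how Ignite
implements this.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; names and file format are assumptions. */
public class HeartbeatFile {
    /**
     * Called by the monitored JVM every N ms. Wall-clock time is used
     * because System.nanoTime() values are not comparable across processes.
     */
    public static void beat(Path file, long nowMs) throws IOException {
        Files.write(file, Long.toString(nowMs).getBytes(StandardCharsets.UTF_8));
    }

    /**
     * Called by the external watchdog (another JVM): ms since the last
     * heartbeat. A torn or empty read is reported as 0 so that the next
     * poll decides, because Files.write is not atomic.
     */
    public static long silenceMs(Path file, long nowMs) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        try {
            return nowMs - Long.parseLong(text);
        }
        catch (NumberFormatException e) {
            return 0;
        }
    }

    /** True if the monitored JVM has been silent longer than the critical mark. */
    public static boolean isCritical(Path file, long nowMs, long criticalMs) throws IOException {
        return silenceMs(file, nowMs) > criticalMs;
    }
}
```

The watchdog would poll isCritical(...) with System.currentTimeMillis() and
stop the node once it returns true. Unlike the in-process thread, it keeps
running while the monitored JVM is paused, so it can react before the pause
ends.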
