maven-users mailing list archives

From Mikael Åsberg <m.asberg.wa...@gmail.com>
Subject Re: Failsafe: Killing self fork JVM. PING timeout elapsed.
Date Wed, 20 Mar 2019 08:39:57 GMT
These issues regarding communication with forked JVMs, won't they be
resolved once surefire moves to interprocess communication using
tcp/ip sockets? This happens to be the target feature to be included
in the next surefire 3.0.0 milestone:
https://issues.apache.org/jira/projects/SUREFIRE/versions/12344668

There are soooo many issues relating to surefire reading the stdout of
forked processes (which, as I understand it, is what it currently
does). Many of us are really looking forward to the next milestone.
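
For anyone following along, here is a minimal sketch of the timeout
arithmetic Tibor describes further down in this thread (a NOOP every
10 seconds, checked every third NOOP, so a 30 second window). It is only
a model under those quoted figures, not Surefire's actual code; the class
and method names are made up for illustration.

```java
import java.util.Arrays;

/**
 * Hypothetical model of the PING/NOOP check discussed in this thread.
 * This is NOT Surefire's implementation; all names are illustrative.
 */
final class PingWatchdogSketch {
    // Figures quoted in the thread: a NOOP every 10 s, checked every third NOOP.
    static final long NOOP_INTERVAL_SEC = 10;
    static final long CHECK_INTERVAL_SEC = 3 * NOOP_INTERVAL_SEC;

    /**
     * Returns the time (in seconds) of the first check at which no NOOP
     * arrived during the preceding 30 s window, or -1 if every check up to
     * horizonSec (inclusive) saw at least one NOOP.
     */
    static long firstTimeoutSec(long[] noopArrivalsSec, long horizonSec) {
        for (long check = CHECK_INTERVAL_SEC; check <= horizonSec; check += CHECK_INTERVAL_SEC) {
            final long windowStart = check - CHECK_INTERVAL_SEC;
            final long windowEnd = check;
            // A window counts as healthy if any NOOP landed in (windowStart, windowEnd].
            boolean seen = Arrays.stream(noopArrivalsSec)
                    .anyMatch(t -> t > windowStart && t <= windowEnd);
            if (!seen) {
                return check; // the fork would assume the parent is gone
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long[] healthy = {10, 20, 30, 40, 50, 60, 70, 80, 90};
        long[] stalled = {10, 20, 70, 80, 90}; // nothing arrives between 20 s and 70 s
        System.out.println(firstTimeoutSec(healthy, 90)); // prints -1
        System.out.println(firstTimeoutSec(stalled, 90)); // prints 60
    }
}
```

In the second case a gap of a little over 30 seconds without a NOOP is
enough for a check to fail, which is how I read the "PING timeout
elapsed" message.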

On Tue, Mar 19, 2019 at 8:59 PM Jason Young <jason.young@procentive.com> wrote:
>
> Getting back to my original questions, I know that "ping" means to see if a
> process is there, and "NOOP" implies it's not a command to do anything. But
> what do the terms "ping" and "NOOP" mean in this context, i.e. how do the
> processes communicate? I assume they don't sonar. Do other processes also
> ping NOOPs? Can I PING Chrome with a NOOP from bash? Is it with TCP?
>
> I'm confused about what I should do regarding GC pauses. Previously I had
> code that would write the amount of remaining heap space (or something like
> that) to stdout after every test to troubleshoot OOMEs. Can writing to
> stdout cause the communication failure somehow?
>
> On Wed, Mar 13, 2019 at 5:57 PM Jason Young <jason.young@procentive.com>
> wrote:
>
> > I upgraded failsafe and surefire to 3.0.0-M3 as advised; we encountered
> > the same exception. (Still using -Xmx5g, will switch to OpenJ9 soon in case
> > that helps.)
> >
> > BTW I also asked on StackOverflow previously, for anyone interested:
> > https://stackoverflow.com/questions/54755846/killing-self-fork-jvm-ping-timeout-elapsed
> >
> > On Tue, Feb 26, 2019 at 6:40 PM Jason Young <jason.young@procentive.com>
> > wrote:
> >
> >> Thanks again for the information.
> >>
> >> We had increased the RAM to 3g some time ago to prevent OOMEs. More
> >> recently, I increased the RAM again to 5g for extra headroom since we had
> >> more memory available; the problem hasn't happened since, but it hasn't
> >> been very long.
> >>
> >> We use a more customized image based on Alpine 3.8.2. The JDK and Maven
> >> are obtained via apk.
> >>
> >> I will try upgrading failsafe (and surefire while I'm at it) sooner, and
> >> probably do some experimentation with JVMs another time (not pressing for
> >> me ATM).
> >>
> >> On Tue, Feb 26, 2019 at 12:20 PM Tibor Digana <tibordigana@apache.org>
> >> wrote:
> >>
> >>> >> I'll try to enable some logging about GC pauses to see what's up
> >>>
> >>> Please do not keep such a setting after tuning the GC, because it may
> >>> sometimes break the interprocess communication between the Maven process
> >>> and the surefire process.
> >>> It's worth writing GC information to a file rather than to the console logs.
> >>> This can be configured, I guess.
> >>>
> >>> >> Do you think the value is simply too low?
> >>>
> >>> GCing many objects may take some time, and I remember we had a user who
> >>> had this problem a year or two ago.
> >>> As a fix, we check every third NOOP (which is 3 x 10 sec) instead of every
> >>> NOOP. So 30 seconds looked satisfactory.
> >>> I think you use an old version, 2.20 or something like that. Fixes for
> >>> Docker have been made since then, so please use the latest version, 3.0.0-M3.
> >>> See this page
> >>> https://maven.apache.org/surefire/maven-surefire-plugin/docker.html, we
> >>> used maven:3.5.3-jdk-8-alpine in this test. Which base image did you use?
> >>>
> >>> Cheers
> >>> Tibor
> >>>
> >>> On Tue, Feb 26, 2019 at 5:24 PM Jason Young <jason.young@procentive.com>
> >>> wrote:
> >>>
> >>> > Thanks for the information. It's good to see someone understands a
> >>> > little about this.
> >>> >
> >>> > Incidentally, we have been looking at other GCs and VMs for the
> >>> > application in production environments, so I'll look into how these
> >>> > affect tests as well. I'll try to enable some logging about GC pauses
> >>> > to see what's up.
> >>> >
> >>> > How would `-Xmx3g` cause long GC cycles? Do you think the value is
> >>> > simply too low?
> >>> >
> >>> > FWIW we're running the Maven build in an Alpine-based Docker container.
> >>> >
> >>> > On Sat, Feb 23, 2019 at 6:36 AM Tibor Digana <tibordigana@apache.org>
> >>> > wrote:
> >>> >
> >>> > > Hi Jason,
> >>> > >
> >>> > > We spoke about this issue on our chat in ASF Slack:
> >>> > > "I think his tests were paused for long GC periods and exceeded 3x the
> >>> > > PING period = 30 seconds. After this period the forked JVM assumed the
> >>> > > Maven process had been killed by JenkinsCI, and therefore all surefire
> >>> > > processes are killed as well and all the file handles and memory are
> >>> > > freed."
> >>> > >
> >>> > > "But I have to say that `-Xmx3g` may cause long GC cycles, see
> >>> > > https://maven.apache.org/surefire/maven-surefire-plugin/examples/shutdown.html"
> >>> > >
> >>> > > You are using java-1.8-openjdk. I guess you should use Shenandoah GC,
> >>> > > which is an experimental algorithm in JVM 1.8. This would significantly
> >>> > > shorten the GC cycles.
> >>> > >
> >>> > > We should of course provide a new configuration parameter to give you a
> >>> > > chance to prolong the PING.
> >>> > >
> >>> > > Cheers
> >>> > > Tibor
> >>> > >
> >>> >
> >>> >
> >>> > --
> >>> >
> >>> > Jason Young
> >>> >
> >>>
> >>
> >>
>
> --
> Jason Young
> Software Engineer | PROCENTIVE
