www-builds mailing list archives

From Allen Wittenauer ...@effectivemachines.com.INVALID>
Subject Please pick up after yourself
Date Fri, 21 Dec 2018 21:53:04 GMT

This is now the fourth time this week that my build job has landed on a node with broken JVM
tasks left hanging around by surefire tests gone awry.  (Culprits: Accumulo, Reef, and Sling.)
Because of the way Linux enforces process limits on systemd-based boxes, my tasks are getting
killed even though there is plenty of CPU and memory: these surefire tests have spawned so many
threads that everything else on the node fails.
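
To see why threads matter here: on Linux every thread is a task and counts against the
per-user process limit. The following is a rough sketch, not a tool anyone on this list
actually runs, that adds up the `Threads:` field from `/proc/<pid>/status` for one user.
The function names `thread_count` and `tasks_for_uid` are made up for illustration.

```python
import os


def thread_count(status_text):
    """Return the value of the 'Threads:' line in a /proc/<pid>/status dump."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return 0  # no Threads: line found


def tasks_for_uid(uid, proc_root="/proc"):
    """Sum the thread counts of every process directory owned by uid."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        path = os.path.join(proc_root, entry)
        try:
            if os.stat(path).st_uid != uid:
                continue
            with open(os.path.join(path, "status")) as fh:
                total += thread_count(fh.read())
        except OSError:
            continue  # process exited or is unreadable; ignore it
    return total
```

Running `tasks_for_uid(os.getuid())` as the build user before and after a test run would give
a rough idea of how many tasks a job leaves behind.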

Folks:  if you aren’t running in a Docker container (which makes it extremely easy to clean
up as well as to enforce a sub-5k process limit), please add a Post Action to your Jenkins
job that blows away any of your tasks still hanging around.
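
A minimal sketch of what such a cleanup step might look like, written as a Python script a
post-build step could call. Everything here is an assumption made for illustration: the
`surefire` marker in the command line, the `agent.jar`/`slave.jar` keep-list, and the
function names `select_leftovers` and `kill_leftovers`. Adjust the marker to whatever your
job's forked JVMs actually look like.

```python
import os
import signal
import subprocess


def select_leftovers(ps_lines, marker="surefire", keep=("agent.jar", "slave.jar")):
    """Pick PIDs from 'PID COMMAND' lines that mention marker and nothing in keep."""
    pids = []
    for line in ps_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # not a 'PID COMMAND' line
        pid, cmd = int(parts[0]), parts[1]
        if marker in cmd and not any(k in cmd for k in keep):
            pids.append(pid)
    return pids


def kill_leftovers(user):
    """Send SIGKILL to the given user's processes that match select_leftovers."""
    # 'ps -u USER -o pid= -o args=' prints one 'PID COMMAND' line per process, no header.
    out = subprocess.run(
        ["ps", "-u", user, "-o", "pid=", "-o", "args="],
        capture_output=True, text=True, check=False,
    ).stdout
    for pid in select_leftovers(out.splitlines()):
        if pid == os.getpid():
            continue  # never kill the cleanup script itself
        try:
            os.kill(pid, signal.SIGKILL)
        except (ProcessLookupError, PermissionError):
            pass  # already gone, or not ours to kill
```

Calling `kill_leftovers(getpass.getuser())` as the last step of the job would only touch
processes owned by the build user, so it cannot clean up after other accounts.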

At this point, I feel like I have no choice but to start nuking any long-running Java
processes (except agent/slave.jar and the Datadog stuff that infra runs) before startup just
so I can get a build. :(
