tomcat-users mailing list archives

From Craig Lynch <rcraigly...@gmail.com>
Subject Tomcat endpoints are becoming extremely unresponsive
Date Mon, 20 Jun 2016 19:58:31 GMT
We run embedded Tomcat 8 and are consistently seeing extreme slowness
across all Tomcat endpoints at very regular intervals of three hours. Once
a site gets into the slow state, it never recovers and stays unresponsive
(requests take tens of minutes to hours) until the service is manually
restarted.



There are no resource issues that I've been able to detect (heap seems
fine, no apparent memory leaks, cpu is fine, network/db connections aren't
exhausted, etc). Tomcat does seem to receive the requests, but for some
reason does not seem to be processing them.



There is an exception that occurs right around the time the service goes
into a bad state, which is the reason I believe this to be a Tomcat issue.
The stack trace is as follows:



Exception in thread "mc-26" java.lang.IllegalMonitorStateException
    at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(Unknown Source)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(Unknown Source)
    at java.util.concurrent.locks.ReentrantLock.unlock(Unknown Source)
    at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
    at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
    at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
    at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Unknown Source)



Having done some additional testing, I can elaborate on the state of things
when this exception occurs:

* The ReentrantLock that is throwing the exception is meant to throw the
IllegalMonitorStateException when a thread other than the thread that
currently holds the lock tries to release it. Interestingly, when
tryRelease() is called, the owning thread is actually null, which means
that the lock isn't currently taken or owned by anyone.



* The lock's state is 0, which is consistent with the lock's owning thread
being null.



* The tryRelease function takes an int argument "releases", which is 1
(you'd find that in the stacktrace anyway, since it's passed in as a
constant further up, but mentioning it might save some time).
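
For reference, the behaviour described in the points above can be
reproduced outside Tomcat. The following is only a minimal sketch (the
class and method names are made up for illustration, and it does not
explain how the queue's lock ends up in this state): calling unlock() on a
ReentrantLock that the calling thread does not hold throws
IllegalMonitorStateException, both when nobody holds the lock (state 0,
owner null) and when another thread holds it.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; not part of Tomcat or of our service.
public class UnownedUnlockDemo {

    // unlock() on a lock nobody holds: state is 0 and the owner is null.
    static boolean unlockWithoutOwningThrows() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        }
    }

    // unlock() from a thread other than the one currently holding the lock.
    static boolean unlockFromOtherThreadThrows() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] threw = {false};
        lock.lock(); // the calling thread is now the owner
        try {
            Thread other = new Thread(() -> {
                try {
                    lock.unlock();
                } catch (IllegalMonitorStateException e) {
                    threw[0] = true;
                }
            });
            other.start();
            try {
                other.join(); // join() makes the write to threw[0] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } finally {
            lock.unlock(); // legal: released by the owning thread
        }
        return threw[0];
    }

    public static void main(String[] args) {
        System.out.println("unowned unlock throws: " + unlockWithoutOwningThrows());
        System.out.println("foreign unlock throws: " + unlockFromOtherThreadThrows());
    }
}
```

In the trace above the unlock() call comes from inside
LinkedBlockingQueue.take(), which acquires the lock before releasing it, so
hitting this path with a null owner is what makes the situation surprising.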



I realize that the LinkedBlockingQueue and ReentrantLock are java core
concurrency libraries, but it seems like Tomcat is getting into a bad state
once the error occurs, and is unable to exit the bad state. The executing
thread that gets this exception dies, and almost all other threads end up
spending almost all of their time parked (in Unsafe.park). The
IllegalMonitorStateException also generally occurs in several threads after
it's shown up for the first time.
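
A rough way to watch for this from inside the JVM, without taking a full
thread dump each time, is to count live threads by state. This is only a
sketch using the standard Thread API (the class name is hypothetical);
parked threads show up as WAITING or TIMED_WAITING, so a pool that is
almost entirely in those states while requests are pending would match what
I'm describing.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper; not part of Tomcat.
public class ThreadStateSummary {

    // Counts all live threads in this JVM, grouped by Thread.State.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

The counts cover every thread in the JVM, not just the Tomcat executor
threads, so filtering on the thread name prefix would be needed to isolate
the pool.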



I'm not sure how to describe how to reproduce this issue, other than saying
that everyone at my company with our service installed experiences it very
reliably every three hours. We've been thus far unable to determine what
causes it, however. My personal theory is that somehow the TaskQueue is
getting into a state where it can only rarely give tasks to the executor
threads, but I don't know what causes things to enter that state. From what
we can tell, it does seem to be related to our Jersey endpoints.



If you need any additional information, just let me know and I'll be happy
to provide anything that might be useful. I've been getting most of my
information from thread/heap dumps as well as modifying local versions of
Tomcat to provide additional logging.



Most of our services are running Tomcat 8.0.32, but the problem also
appears on versions at least as early as 8.0.15 and as late as 8.0.34.



I submitted this as a bug (59737), which Mark Thomas immediately closed,
saying that there’s nothing to indicate that this is a Tomcat bug. I’m
happy to provide any additional information that might be useful. Most of
what I’ve looked at so far seems to point to Tomcat, though I would be
ecstatic if someone could point me to where I’m messing up :)
