tomcat-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jesse Barnum <jsb_tom...@360works.com>
Subject Re: What is the best connector configuration for thousands of mostly idle users?
Date Fri, 07 Feb 2014 17:26:09 GMT
On Feb 7, 2014, at 2:38 AM, Mark Thomas <markt@apache.org> wrote:

> Jesse Barnum <jsb_tomcat@360works.com> wrote:
> 
> Thanks for such a well written question. All the relevant information is available and
presented clearly and logically.

Glad I could help. I get error reports from my users all the time like "I installed the update
and now it doesn't work", so I know how frustrating that can be :-)

> 
>> Problem summary:
>> My nio polling threads are using too much CPU time.
> 
> Are you sure that is the real problem? It sounds like the occasional slowness is the
problem and the polling threads are the suspected - emphasis on the suspected - root cause.

That's true. I've tried running it with a profiler (YourKit) during peak time, and I think
I've found the real culprit - reading from the Coyote InputStream is blocking.

Out of 98,515 milliseconds spent in javax.servlet.http.HttpServlet.service:
* 95,801 milliseconds are spent calling org.apache.catalina.connector.CoyoteInputStream.read(
byte[] ). Of this time, 81,707 ms are spent waiting for KeyAttachment.awaitLatch.
* 2,698 milliseconds are spent on business logic and MySQL calls
* 16 milliseconds are spent closing the input stream and setting HTTP headers

The last Tomcat code I see is org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(
CountDownLatch, long, TimeUnit ). I don't really know what that means or does, but it seems
to be the holdup.

If you want me to send you the raw profiler information, let me know how to send it and I'll
be happy to (although I'm running an old version of YourKit, 9.5.6). I can also send screen
shots if that's helpful / easier.

Here is a typical thread dump while the request is being serviced:
> Thread name: ajp-nio-8009-exec-19 / ID: 55 / state: TIMED_WAITING 
> Milliseconds spent waiting: 47,475,371 / blocked: 2,389 
>     sun.misc.Unsafe.$$YJP$$park(Native Method) 
>     sun.misc.Unsafe.park(Unsafe.java) 
>     java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) 
>     java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1033)

>     java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)

>     java.util.concurrent.CountDownLatch.await(CountDownLatch.java:282) 
>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitLatch(NioEndpoint.java:1571)

>     org.apache.tomcat.util.net.NioEndpoint$KeyAttachment.awaitReadLatch(NioEndpoint.java:1573)

>     org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:172)

>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:246) 
>     org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:227) 
>     org.apache.coyote.ajp.AjpNioProcessor.readSocket(AjpNioProcessor.java:342) 
>     org.apache.coyote.ajp.AjpNioProcessor.read(AjpNioProcessor.java:314) 
>     org.apache.coyote.ajp.AjpNioProcessor.readMessage(AjpNioProcessor.java:406) 
>     org.apache.coyote.ajp.AjpNioProcessor.receive(AjpNioProcessor.java:375) 
>     org.apache.coyote.ajp.AbstractAjpProcessor.refillReadBuffer(AbstractAjpProcessor.java:616)

>     org.apache.coyote.ajp.AbstractAjpProcessor$SocketInputBuffer.doRead(AbstractAjpProcessor.java:1070)

>     org.apache.coyote.Request.doRead(Request.java:422) 
>     org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290) 
>     org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:431) 
>     org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315) 
>     org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:167)

>     com.prosc.io.IOUtils.writeInputToOutput(IOUtils.java:49) 
>     com.prosc.io.IOUtils.inputStreamAsBytes(IOUtils.java:116) 
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:136) 
>     com.prosc.io.IOUtils.inputStreamAsString(IOUtils.java:127) 
>     com.prosc.licensecheck.LicenseCheck.doPost(LicenseCheck.java:169) 
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:647) 
>     javax.servlet.http.HttpServlet.service(HttpServlet.java:728) 
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)

>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

>     com.prosc.infrastructure.LogFilter.doFilter(LogFilter.java:22) 
>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

>     com.prosc.infrastructure.CharacterEncodingFilter.doFilter(CharacterEncodingFilter.java:38)

>     org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)

>     org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)

>     org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)

>     org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)

>     org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472)

>     org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
>     org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) 
>     org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)

>     org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407) 
>     org.apache.coyote.ajp.AjpNioProcessor.process(AjpNioProcessor.java:184) 
>     org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)

>     org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1680)

>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
>     java.lang.Thread.run(Thread.java:722)

So it seems that the worker thread is blocking on something while it's trying to read the
POST data…? If my guess is correct, is this a problem that can be solved by reconfiguring?

> 
>> Application overview:
>> My application has from 1,300 - 4,000 users connected at any given
>> time. Each user sends about 200 bytes, then waits 30 seconds, then
>> sends about 200 bytes, and this just loops for each user.
>> Each user connects with SSL, and we use a long keepalive to ensure that
>> the HTTP connection doesn't close, so that we don't have to renegotiate
>> SSL.
> 
> How long should the application take to process those 200 bytes? I'm wondering what the
peak number of concurrent requests, rather than connections, might be.


You're welcome to take a look at the live Apache server status here:
http://venus.360works.com/server-status

At 7:30am this morning, it showed 1,733 connected users, and looking at the 'Req' column which
measures response time in milliseconds, it's typically in the 0-7 millisecond range, with
some rare outliers in the hundreds of milliseconds.

At 12:07pm today, which is our peak time, for every 3-4 requests in the 0-8 millisecond range,
there is a request in the 150-700 millisecond range. 

>> 
>> Problem detail:
>> lsof is currently showing 564 open sockets between Apache and Tomcat on
>> port 8009, with 1,352 users connected to Apache.
>> The two threads consuming the most CPU time in Tomcat are
>> "NioBlockingSelector.BlockPoller-2 / 15" and
>> "ajp-nio-8009-ClientPoller-0 / 25". Between them, they are taking 20%
>> of all CPU time for the Tomcat process. I get a few times a day when
>> our monitoring software reports slow response times, and I'd like to
>> solve this.
> 
> How much actual processor time are those threads taking? The 20% is only relative and
it would be helpful to know what it is relative to.


The profiler is showing the selector threads as taking 264 milliseconds during the profiled
period of time (1 minute, 26 seconds), so I agree with you; that's not the culprit.

> 
> Do you have access logs available for those times? The number of concurrent requests
would be a useful number. When working that out you need to be sure if the time in the access
log is the time the request started or the time it finished.
> 
>> Some guesses at solutions:
>> I'm guessing that the high CPU usage is because they are polling all
>> 564 open sockets constantly? Would it make sense to reduce the number
>> of open sockets? I didn't configure any maximum and I don't know how to
>> reduce this number. I'm also concerned that reducing that might negate
>> any benefits by increasing the number of sockets opening and closing
>> between ajp_mod_proxy and the NIO AJP connector.
> 
> Some rough numbers.
> With a peak of 9 httpd children with 500 threads each - assume at least 4000 connected
clients.
> With a request every 30 seconds that is roughly 133 requests a second assuming they are
perfectly evenly distributed which they won't be.
> With 15 worker threads on Tomcat each one will be handling roughly 9 requests a second,
again assuming even distribution.
> That means a request has about 110ms to complete or you run the risk of running out of
threads. This includes the time to process the request and a little overhead for Tomcat to
recycle the thread.
> I wouldn't be surprised for peak loads to be at least 2-3 times higher due to the uneven
distribution of requests in time. That means requests have more like 35-40ms to complete.

That sounds about right. The Apache status page is currently showing 153 requests per second,
with 2,912 connected users.

> 
> I suspect that you are running out of worker threads on the Tomcat side. Increasing it
from 15 to 30 wouldn't do any harm.

This is a single core box (sorry, should have mentioned that in the configuration details).
Would you still expect increasing the worker thread count to help?

> 
> Of course, this is just guess work. You'd need to look at your access logs to be sure.
> 
>> Maybe it's already running at optimal performance and I just need to
>> throw hardware at it, but it seems like a solvable problem, because the
>> actual worker threads are not doing much at all.
> 
> I agree it sounds like configuration at this point.
> 
> Mark
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
> 


--Jesse Barnum, President, 360Works
http://www.360works.com
Product updates and news on http://facebook.com/360Works
(770) 234-9293
== Don't lose your data! http://360works.com/safetynet/ for FileMaker Server ==
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Mime
View raw message