tomcat-users mailing list archives

From Mark Eggers <its_toas...@yahoo.com>
Subject Re: [EXTERNAL] Re: Re: General Architecture Question for multiple websites on a single RedHat server
Date Tue, 10 Jul 2012 18:29:57 GMT
----- Original Message -----

> From: "Simon, Leonard" <leonard.simon@hsn.net>
> To: Tomcat Users List <users@tomcat.apache.org>
> Cc: 
> Sent: Tuesday, July 10, 2012 9:54 AM
> Subject: Re: [EXTERNAL] Re: Re: General Architecture Question for multiple websites on a single RedHat server
> 
> Chris,
> 
> Thanks for looking at this.
> 
> Tomcat version is 6.0.32.
> mod_jk is at 1.2.31
> 
> 
> Someone else did the thread dump so I'm assuming they did it on the right
> process.
> 
> On Tue, Jul 10, 2012 at 12:19 PM, Christopher Schultz <
> chris@christopherschultz.net> wrote:
> 
>>  -----BEGIN PGP SIGNED MESSAGE-----
>>  Hash: SHA1
>> 
>>  Simon,
>> 
>>  On 7/9/12 4:24 PM, Simon, Leonard wrote:
>>  > Well our Tomcat went out to lunch again and we had to recycle the
>>  > webserver to get things stabilized. By this I mean we get reports
>>  > from the users that screens become unresponsive and looking at a
>>  > top we see tomcat process taking 100% CPU.
>> 
>>  Are you sure this is the right process?
>> 
>>  > Was able to do a thread dump captured with a kill -3 PID and here
>>  > it is if anyone is so inclined to comment on it.
>> 
>>  This thread dump shows a mostly-idle server with the exception of
>>  those threads in socketAccept() (not sure why these count as RUNNABLE
>>  when they are really blocking) and those executing reads from the
>>  client connection(s).
>> 
>>  What exact version of Tomcat are you using, and what version of mod_jk
>>  (or, if you are using mod_proxy_ajp, what httpd version)? IIRC, there
>>  have been some stability improvements in recent Tomcat versions around
>>  the worker threads being returned to their associated connectors.
>> 
>>  - -chris


I didn't see much that rang immediate alarm bells. It looks like you're processing
about 18 client connections, and everything else is pretty quiet. These client connections
are going through the AJP connector (as you've noted in your reply above).

A few things though:

As someone in this thread has already mentioned, permgen is pretty full. You might try increasing
that with -XX:MaxPermSize=128m.
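
One common place to put that flag is a setenv.sh file in Tomcat's bin directory, which catalina.sh picks up at startup. This is only a sketch: the file location and the use of CATALINA_OPTS are assumptions about your install, and 128m is just the starting value suggested above.

```shell
# bin/setenv.sh under your Tomcat directory (assumed location; create it if absent).
# Append the permgen limit to any options that are already set.
CATALINA_OPTS="${CATALINA_OPTS:-} -XX:MaxPermSize=128m"
export CATALINA_OPTS
```

Restart Tomcat afterwards and confirm the flag shows up on the java command line in the ps output.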

There are a lot of garbage collection threads, which is what you would expect to see on a
multi-core system. From digging around, it appears that the number of parallel garbage
collection threads follows this formula:

8 + (5/8)X = GCT

You get one GCT (garbage collection thread) per core for the first 8 cores, and then 5/8 of
a thread for every core after that. X is the number of cores beyond the first 8. So in your case:

8 + (5/8)X = 18
X = 16

This means that your system has 8 + 16 = 24 cores. Are you running on a 24-core system, or
have you tuned garbage collection with JVM arguments?
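
If you want to check the arithmetic for other core counts, here is a minimal sketch of the formula as I've described it. The function name gc_threads is made up for illustration, and the formula itself is what I found by digging around, so treat the result as an estimate rather than a guarantee for your JVM.

```shell
# gc_threads: estimate the default number of parallel GC threads for a core count.
# Hypothetical helper; implements "one per core up to 8, then 5/8 per extra core".
gc_threads() {
    cores=$1
    if [ "$cores" -le 8 ]; then
        echo "$cores"
    else
        # Integer arithmetic: multiply before dividing to avoid truncating 5/8 to 0.
        echo $(( 8 + (cores - 8) * 5 / 8 ))
    fi
}
```

Running gc_threads 24 gives 18, which matches the number of GC threads in your dump. If the count was set explicitly, it would normally show up as -XX:ParallelGCThreads on the java command line.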

In general, if you're not running into GC issues, tuning GC parameters is counter-productive.
If you do have to tune GC parameters, lots of testing is in order.

I noticed that you also have an MQ Trace monitor running. Are you using MQ? Directly accessing
an MQ service without going through a pool configured for graceful restarts / retries can
cause a system to become unresponsive. However, I don't see any evidence of that in this thread
dump.

As I've said offline, it's really difficult to see what's consuming CPU from one thread dump.
Here's how to start figuring out what is going on with your system.

1. Keep access logs

If you don't, then start. You'll want the access logs to replay on a test environment to see
if you can recreate the problem. JMeter is a good tool for replaying information from access
logs.
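
As a starting point for building a replay list, something like the following pulls the GET request paths out of an access log. It is a sketch that assumes the access log uses the common log format (request line in double quotes after the timestamp); the function name access_log_get_paths is made up, and you'd adjust the field numbers if your pattern differs.

```shell
# access_log_get_paths: read access log lines on stdin, print the path of each GET.
# Assumes common log format, where whitespace-separated field 6 is the quoted
# method (for example "GET) and field 7 is the request path.
access_log_get_paths() {
    awk '$6 == "\"GET" { print $7 }'
}
```

Usage would be along the lines of access_log_get_paths < access.log > get-paths.txt, where access.log stands for whatever your log file is called.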

2. When the problem occurs

a. Take multiple thread dumps, about 5 seconds apart. Use a tool like jstack so it's scriptable:

   jstack -l [process-id]
   where [process-id] is the process id of the distressed Tomcat

The -l option generates a long listing with extra lock information, and may not be necessary.
You'll need to have the right permissions (either root or the user running the JVM being
targeted with the process id).

b. At the same time use something like the following to see which thread is consuming CPU:

   ps -L -o pcpu,lwp -p [process-id]
   where [process-id] is the process id of the distressed Tomcat

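This will show all the threads of the process, the percentage of CPU used by each thread,
and the thread ID (the lwp column). The lwp value is decimal, while the thread dump shows the
same ID in hexadecimal as nid=0x..., so convert it before matching. You can then correlate the
thread ID with the thread dump to see exactly what is consuming the CPU.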

This will generate tons of output, so it's best to put both in a script and direct the output
to files.
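
Here is a rough sketch of such a script, using the two commands above. The function name capture_snapshots, its argument order, and the output file names are all made up for illustration; adjust them to taste.

```shell
# capture_snapshots: take paired thread dumps and per-thread CPU samples.
# Hypothetical helper. Arguments:
#   $1 = Tomcat process id      $2 = number of samples
#   $3 = seconds between samples $4 = output directory
# The body runs in a subshell, so its variables do not leak into the caller.
capture_snapshots() (
    pid=$1; count=$2; interval=$3; outdir=$4
    mkdir -p "$outdir" || exit 1
    i=1
    while [ "$i" -le "$count" ]; do
        # Record the sample time so dumps can be lined up with the access log.
        date > "$outdir/time.$i.txt"
        # Thread dump; -l adds lock details and can be dropped.
        jstack -l "$pid" > "$outdir/jstack.$i.txt" 2>&1
        # CPU percentage and thread id (lwp) for every thread of the process.
        ps -L -o pcpu,lwp -p "$pid" > "$outdir/ps.$i.txt" 2>&1
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
    exit 0
)
```

For example, capture_snapshots [process-id] 6 5 /tmp/tomcat-dumps would take six samples five seconds apart. Run it as the user that owns the Tomcat process.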

Now you'll end up with the following:

1. What requests were being made of your server when the problem occurred
2. Multiple thread dumps while the problem is occurring
3. The identity of the thread (or threads) that is consuming the CPU

Once you get this information, you'll be in a much better position to determine what is causing
your problems.
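
To match the busy thread against a dump, the decimal thread id from ps has to be turned into the hexadecimal nid that the thread dump prints. A small sketch follows; the helper names lwp_to_nid and busiest_lwp are made up, and it assumes the ps output has a header line followed by pcpu and lwp columns, as in the command above.

```shell
# lwp_to_nid: convert a decimal thread id from ps into the "nid=0x..." form
# that appears on each thread's header line in a thread dump.
lwp_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# busiest_lwp: read "ps -L -o pcpu,lwp" output on stdin and print the thread id
# with the highest CPU percentage. The first line is skipped as the header.
busiest_lwp() {
    awk 'BEGIN { max = -1 }
         NR > 1 && $1 + 0 > max { max = $1 + 0; lwp = $2 }
         END { if (lwp != "") print lwp }'
}
```

With the saved files you could then run something like grep "$(lwp_to_nid "$(busiest_lwp < ps.1.txt)") " jstack.1.txt to find the matching thread, where ps.1.txt and jstack.1.txt stand for one pair of your captured files.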

. . . . just my two cents.
/mde/
