tomcat-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shanti Suresh <sha...@umich.edu>
Subject Re: high CPU usage on tomcat 7
Date Wed, 03 Oct 2012 15:10:25 GMT
Well,

Here it is:

----------------start_jconsole.sh-----------
#!/bin/sh

export DISPLAY

DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
&& print $1'`
echo $DISPLAY_NUMBER

XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
sudo su webinf -c "xauth add $XAUTH_ENTRY; /usr/bin/jconsole
-J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java
/lib/tools.jar"

----------------------------------------------------


On Wed, Oct 3, 2012 at 11:07 AM, Shanti Suresh <shanti@umich.edu> wrote:

> Hi Kirill,
>
> I am amazed at the size of the Java Heap of 26 GB.  I didn't expect that
> big a size.  Before I suggest anything else, please take a look at the
> filesystem space on "/tmp".  The script creates a heapdump everytime it
> runs as well.  So please make sure that "/tmp/" has sufficient storage.
> Otherwise it may bring your service down.
>
> (1) Could you please paste your JVM command-line arguments here?
> (2) Also, did you have a chance to observe the JConsole graphs?   I have
> attached a sample JConsole graph here.
> (3) What JDK version are you using?
> (4) GC for a Heap of that size may need some tuning.
>
> You may start JConsole as follows on Linux/Unix.
> a.Place the following script in your home-directory on the Tomcat
> app-server machine .
> b. Make sure that you have enabled Tomcat to allow JMX access.
> c. Then log-in from a remote laptop/desktop into the server (using ssh),
> and invoke this script from your home-directory:
> --------start_jconsole.sh------
> DISPLAY_NUMBER=`echo $DISPLAY | /usr/bin/perl -n -e 'm#localhost:(\d+).0#
> && print $1'`
> echo $DISPLAY_NUMBER
> XAUTH_ENTRY=`/usr/bin/xauth list | grep "unix:$DISPLAY_NUMBER"`
> /usr/bin/jconsole
> -J-Djava.class.path=/usr/lib/jvm/java/lib/jconsole.jar:/usr/lib/jvm/java/lib/tools.jar
> -------------------------------
>
>
> [image: Inline image 1]
>
> Thanks.
>
>                          -Shanti
>
>
> On Tue, Oct 2, 2012 at 7:43 PM, Kirill Kireyev <kirill@instagrok.com>wrote:
>
>>  Hi Shanti! Thanks for your script! I ran it (with minor modifications),
>> and it turns out that the garbage collection threads are indeed taking up
>> the majority of CPU (see below). What would you recommend I do to
>> tune/troubleshoot GC further? My server has 32G or RAM, JVM has heap size
>> of 26G
>>
>> Thanks,
>> Kirill
>>
>> PID    CPU    Mem    JStack Info
>> 0x253d    80    90.7    "GC task thread#3 (ParallelGC)" prio=10
>> tid=0x0000000001938800 nid=0x253d runnable
>> 0x2540    76    90.7    "GC task thread#6 (ParallelGC)" prio=10
>> tid=0x000000000193e000 nid=0x2540 runnable
>> 0x253b    74    90.7    "GC task thread#1 (ParallelGC)" prio=10
>> tid=0x0000000001934800 nid=0x253b runnable
>> 0x253e    74    90.7    "GC task thread#4 (ParallelGC)" prio=10
>> tid=0x000000000193a000 nid=0x253e runnable
>> 0x2541    74    90.7    "GC task thread#7 (ParallelGC)" prio=10
>> tid=0x000000000193f800 nid=0x2541 runnable
>> 0x253c    66    90.7    "GC task thread#2 (ParallelGC)" prio=10
>> tid=0x0000000001936800 nid=0x253c runnable
>> 0x253a    47    90.7    "GC task thread#0 (ParallelGC)" prio=10
>> tid=0x0000000001933000 nid=0x253a runnable
>> 0x253f    39    90.7    "GC task thread#5 (ParallelGC)" prio=10
>> tid=0x000000000193c000 nid=0x253f runnable
>> 0x2542    19    90.7    "VM Thread" prio=10 tid=0x00000000019b6000
>> nid=0x2542 runnable
>> 0x2539    0    90.7    "main" prio=10 tid=0x0000000001925000 nid=0x2539
>> runnable [0x00007fee9d444000]
>> 0x2543    0    90.7    "Reference Handler" daemon prio=10
>> tid=0x00000000019bd800 nid=0x2543 runnable [0x00007fee500d8000]
>> 0x2544    0    90.7    "Finalizer" daemon prio=10 tid=0x00000000019c5800
>> nid=0x2544 in Object.wait() [0x00007fee4ffd7000]
>> 0x2545    0    90.7    "Signal Dispatcher" daemon prio=10
>> tid=0x0000000001a12000 nid=0x2545 runnable [0x0000000000000000]
>> 0x2546    0    90.7    "C2 CompilerThread0" daemon prio=10
>> tid=0x00007fee48001000 nid=0x2546 waiting on condition [0x0000000000000000]
>> 0x2547    0    90.7    "C2 CompilerThread1" daemon prio=10
>> tid=0x00007fee48004000 nid=0x2547 waiting on condition [0x0000000000000000]
>> 0x2548    0    90.7    "Service Thread" daemon prio=10
>> tid=0x0000000001a14800 nid=0x2548 runnable [0x0000000000000000]
>> 0x2549    0    90.7    "VM Periodic Task Thread" prio=10
>> tid=0x0000000001a1f800 nid=0x2549 waiting on condition
>> 0x254a    0    90.7    "GC Daemon" daemon prio=10 tid=0x00007fee404b3000
>> nid=0x254a in Object.wait() [0x00007fee4e1bd000]
>> 0x254d    0    90.7
>> "ContainerBackgroundProcessor[StandardEngine[Catalina]]" daemon prio=10
>> tid=0x00007fee40389800 nid=0x254d runnable [0x00007fee4d820000]
>> 0x254e    0    90.7    "http-apr-8080-Poller-0" daemon prio=10
>> tid=0x00007fee40378000 nid=0x254e in Object.wait() [0x00007fee4d661000]
>> 0x254f    0    90.7    "http-apr-8080-CometPoller-0" daemon prio=10
>> tid=0x00007fee40395000 nid=0x254f in Object.wait() [0x00007fee4d4a2000]
>> 0x2550    0    90.7    "http-apr-8080-Sendfile-0" daemon prio=10
>> tid=0x00007fee40397000 nid=0x2550 in Object.wait() [0x00007fee4d2b0000]
>> 0x2551    0    90.7    "http-apr-8080-Acceptor-0" daemon prio=10
>> tid=0x00007fee4072c000 nid=0x2551 runnable [0x00007fee4d1af000]
>> 0x2552    0    90.7    "http-apr-8080-AsyncTimeout" daemon prio=10
>> tid=0x00007fee4072e000 nid=0x2552 sleeping[0x00007fee4d0ac000]
>> 0x2553    0    90.7    "http-apr-8443-Poller-0" daemon prio=10
>> tid=0x00007fee40730000 nid=0x2553 in Object.wait() [0x00007fee4cfab000]
>> 0x2554    0    90.7    "http-apr-8443-CometPoller-0" daemon prio=10
>> tid=0x00007fee40732000 nid=0x2554 in Object.wait() [0x00007fee4cdee000]
>> 0x2555    0    90.7    "http-apr-8443-Acceptor-0" daemon prio=10
>> tid=0x00007fee40734800 nid=0x2555 runnable [0x00007fee4cced000]
>> 0x2556    0    90.7    "http-apr-8443-AsyncTimeout" daemon prio=10
>> tid=0x00007fee4073b800 nid=0x2556 sleeping[0x00007fee4cbec000]
>> 0x2557    0    90.7    "ajp-apr-8009-Poller-0" daemon prio=10
>> tid=0x00007fee4073d800 nid=0x2557 runnable [0x00007fee4ca2f000]
>> 0x2558    0    90.7    "ajp-apr-8009-CometPoller-0" daemon prio=10
>> tid=0x00007fee4073f800 nid=0x2558 in Object.wait() [0x00007fee4c872000]
>> 0x2559    0    90.7    "ajp-apr-8009-Acceptor-0" daemon prio=10
>> tid=0x00007fee40741800 nid=0x2559 runnable [0x00007fee4c771000]
>> 0x255a    0    90.7    "ajp-apr-8009-AsyncTimeout" daemon prio=10
>> tid=0x00007fee40743800 nid=0x255a sleeping[0x00007fee4c670000]
>> 0x255f    0    90.7    "ajp-apr-8009-exec-5" daemon prio=10
>> tid=0x0000000001cef000 nid=0x255f runnable [0x00007fee4c157000]
>> 0x2560    0    90.7    "ajp-apr-8009-exec-6" daemon prio=10
>> tid=0x0000000001e93800 nid=0x2560 runnable [0x00007fee47ffd000]
>> 0x2561    0    90.7    "ajp-apr-8009-exec-7" daemon prio=10
>> tid=0x0000000001c1e800 nid=0x2561 runnable [0x00007fee4db39000]
>> 0x2562    0    90.7    "ajp-apr-8009-exec-8" daemon prio=10
>> tid=0x0000000001d41000 nid=0x2562 runnable [0x00007fee4dc3a000]
>> 0x2563    0    90.7    "ajp-apr-8009-exec-9" daemon prio=10
>> tid=0x0000000001bf9800 nid=0x2563 runnable [0x00007fee47efb000]
>> 0x2564    0    90.7    "ajp-apr-8009-exec-10" daemon prio=10
>> tid=0x0000000001cfe800 nid=0x2564 waiting on condition [0x00007fee47dfa000]
>> 0x2586    0    90.7    "ajp-apr-8009-exec-12" daemon prio=10
>> tid=0x0000000001e85000 nid=0x2586 waiting on condition [0x00007fee47bf6000]
>> 0x2587    0    90.7    "pool-1-thread-1" prio=10 tid=0x0000000001fe4000
>> nid=0x2587 waiting on condition [0x00007fee47af3000]
>> 0x2635    0    90.7    "ajp-apr-8009-exec-15" daemon prio=10
>> tid=0x000000000232d800 nid=0x2635 runnable [0x00007fee38991000]
>> 0x2637    0    90.7    "ajp-apr-8009-exec-17" daemon prio=10
>> tid=0x0000000002392800 nid=0x2637 waiting on condition [0x00007fee38790000]
>> 0x2638    0    90.7    "ajp-apr-8009-exec-18" daemon prio=10
>> tid=0x0000000002395000 nid=0x2638 waiting on condition [0x00007fee3868f000]
>> 0x263a    0    90.7    "ajp-apr-8009-exec-20" daemon prio=10
>> tid=0x0000000002312800 nid=0x263a waiting on condition [0x00007fee3848b000]
>> 0x263d    0    90.7    "ajp-apr-8009-exec-23" daemon prio=10
>> tid=0x0000000002324800 nid=0x263d runnable [0x00007fee38187000]
>> 0x50b    0    90.7    "http-apr-8443-exec-1" daemon prio=10
>> tid=0x0000000002209000 nid=0x50b waiting on condition [0x00007fee232b1000]
>> 0x50c    0    90.7    "http-apr-8443-exec-2" daemon prio=10
>> tid=0x000000000213c000 nid=0x50c waiting on condition [0x00007fee233b2000]
>> 0x50d    0    90.7    "http-apr-8443-exec-3" daemon prio=10
>> tid=0x000000000239e000 nid=0x50d waiting on condition [0x00007fee22baa000]
>> 0x50e    0    90.7    "http-apr-8443-exec-4" daemon prio=10
>> tid=0x0000000002aeb800 nid=0x50e waiting on condition [0x00007fee3939d000]
>> 0x50f    0    90.7    "http-apr-8443-exec-5" daemon prio=10
>> tid=0x0000000002ad3000 nid=0x50f waiting on condition [0x00007fee38f99000]
>> 0x510    0    90.7    "http-apr-8443-exec-6" daemon prio=10
>> tid=0x00000000025a1800 nid=0x510 waiting on condition [0x00007fee397a1000]
>> 0x511    0    90.7    "http-apr-8443-exec-7" daemon prio=10
>> tid=0x0000000002188800 nid=0x511 waiting on condition [0x00007fee45db8000]
>> 0x512    0    90.7    "http-apr-8443-exec-8" daemon prio=10
>> tid=0x00000000023b3800 nid=0x512 waiting on condition [0x00007fee44aa5000]
>> 0x513    0    90.7    "http-apr-8443-exec-9" daemon prio=10
>> tid=0x0000000002164800 nid=0x513 waiting on condition [0x00007fee3dae4000]
>> 0x514    0    90.7    "http-apr-8443-exec-10" daemon prio=10
>> tid=0x0000000002af6800 nid=0x514 waiting on condition [0x00007fee279f8000]
>> 0x19a7    0    90.7    "http-apr-8080-exec-1" daemon prio=10
>> tid=0x0000000001ac2800 nid=0x19a7 waiting on condition [0x00007fee3747b000]
>> 0x19a8    0    90.7    "http-apr-8080-exec-2" daemon prio=10
>> tid=0x000000000257d000 nid=0x19a8 waiting on condition [0x00007fee3767d000]
>> 0x28b0    0    90.7    "http-apr-8080-exec-3" daemon prio=10
>> tid=0x000000000274c000 nid=0x28b0 waiting on condition [0x00007fee3f8f7000]
>> 0x28b1    0    90.7    "http-apr-8080-exec-4" daemon prio=10
>> tid=0x00000000021c8800 nid=0x28b1 waiting on condition [0x00007fee3adb7000]
>> 0x46e7    0    90.7    "Keep-Alive-Timer" daemon prio=10
>> tid=0x0000000002160800 nid=0x46e7 runnable [0x00007fee3d5df000]
>> 0x29c5    0    90.7    "Keep-Alive-SocketCleaner" daemon prio=10
>> tid=0x000000000242b000 nid=0x29c5 in Object.wait() [0x00007fee1f877000]
>> 0x6ef8    0    90.7    "http-apr-8080-exec-5" daemon prio=10
>> tid=0x0000000002a56000 nid=0x6ef8 waiting on condition [0x00007fee454af000]
>> 0x6ef9    0    90.7    "http-apr-8080-exec-6" daemon prio=10
>> tid=0x00000000022de800 nid=0x6ef9 waiting on condition [0x00007fee45ab5000]
>> 0x3d38    0    90.7    "http-apr-8080-exec-7" daemon prio=10
>> tid=0x0000000002233000 nid=0x3d38 waiting on condition [0x00007fee37980000]
>> 0x3d39    0    90.7    "http-apr-8080-exec-8" daemon prio=10
>> tid=0x00000000021f7800 nid=0x3d39 waiting on condition [0x00007fee38087000]
>> 0x64a3    0    90.7    "ajp-apr-8009-exec-36" daemon prio=10
>> tid=0x00000000022dd000 nid=0x64a3 waiting on condition [0x00007fee35158000]
>> 0x64a6    0    90.7    "ajp-apr-8009-exec-39" daemon prio=10
>> tid=0x0000000002add800 nid=0x64a6 runnable [0x00007fee3bdc7000]
>> 0x64a7    0    90.7    "ajp-apr-8009-exec-40" daemon prio=10
>> tid=0x00000000029c0800 nid=0x64a7 waiting on condition [0x00007fee24fce000]
>> 0x64ab    0    90.7    "ajp-apr-8009-exec-44" daemon prio=10
>> tid=0x0000000002924000 nid=0x64ab runnable [0x00007fee20f8d000]
>> 0x64ac    0    90.7    "ajp-apr-8009-exec-45" daemon prio=10
>> tid=0x0000000002555800 nid=0x64ac waiting on condition [0x00007fee256d5000]
>> 0x64af    0    90.7    "ajp-apr-8009-exec-48" daemon prio=10
>> tid=0x0000000001f8a800 nid=0x64af waiting on condition [0x00007fee24ccb000]
>> 0x64b0    0    90.7    "ajp-apr-8009-exec-49" daemon prio=10
>> tid=0x000000000257b000 nid=0x64b0 runnable [0x00007fee36d73000]
>> 0x64b1    0    90.7    "ajp-apr-8009-exec-50" daemon prio=10
>> tid=0x0000000001bb3000 nid=0x64b1 runnable [0x00007fee36b71000]
>> 0x64b2    0    90.7    "ajp-apr-8009-exec-51" daemon prio=10
>> tid=0x000000000224a000 nid=0x64b2 runnable [0x00007fee270ee000]
>> 0x64b3    0    90.7    "ajp-apr-8009-exec-52" daemon prio=10
>> tid=0x000000000216e800 nid=0x64b3 runnable [0x00007fee20c8a000]
>> 0xf1b    0    90.7    "http-apr-8080-exec-9" daemon prio=10
>> tid=0x0000000001c9e800 nid=0xf1b waiting on condition [0x00007fee3929c000]
>> 0xf1c    0    90.7    "http-apr-8080-exec-10" daemon prio=10
>> tid=0x0000000001e4c000 nid=0xf1c waiting on condition [0x00007fee3d8e2000]
>> 0x3eda    0    90.7    "ajp-apr-8009-exec-55" daemon prio=10
>> tid=0x0000000002155000 nid=0x3eda runnable [0x00007fee27cfb000]
>> 0x3edb    0    90.7    "ajp-apr-8009-exec-56" daemon prio=10
>> tid=0x000000000232f800 nid=0x3edb runnable [0x00007fee27bfa000]
>> 0x51c7    0    90.7    "ajp-apr-8009-exec-57" daemon prio=10
>> tid=0x000000000211c800 nid=0x51c7 waiting on condition [0x00007fee37c83000]
>>
>>
>>
>> On 9/27/12 12:17 PM, Shanti Suresh wrote:
>>
>> Hi Kirill,
>>
>> Like Mark, Bill and Jeff said, those threads are normal request-processing
>> threads.  I have included a script that might help with isolating high CPU
>> issues with Tomcat.
>>
>> Also, I think it might be helpful to see how the Java heap is performing as
>> well.
>> Please bring up Jconsole and let it run over the week.  Inspect the graphs
>> for Memory, CPU and threads.  Since you say that high CPU occurs
>> intermittently several times during the week and clears itself, I wonder if
>> it is somehow related with the garbage collection options you are using for
>> the server.  Or it may be a code-related problem.
>>
>> Things to look at may include:
>>
>> (1) Are high CPU times related to Java heap reductions happening at the
>> same time?  ==> GC possibly needs tuning
>> (2) Are high CPU times related to increase in thread usage?  ==> possible
>> livelock in looping code?
>> (3) how many network connections come into the Tomcat server during
>> high-CPU times?    Possible overload-related?
>>
>> Here is the script.  I made a couple of small changes, for e.g., changing
>> the username.  But didn't test it after the change.  During high-CPU times,
>> invoke the script a few times, say 30 seconds apart.  And then compare the
>> thread-dumps.  I like to use TDA for thread-dump analysis of Tomcat
>> thread-dumps.
>>
>> Mark, et al, please feel free to help me refine this script.  I would like
>> to have a script to catch STUCK threads too :-)  Let me know if anyone has
>> a script already.  Thanks.
>>
>> --------------high_cpu_diagnostics.pl:-----
>> #!/usr/bin/perl
>> #
>>
>> use Cwd;
>>
>> # Make a dated directory for capturing current diagnostics
>> my ($sec,$min,$hour,$mday,$mon,$year,
>>           $wday,$yday,$isdst) = localtime time;
>> $year += 1900;
>> $mon += 1;
>> my $pwd = cwd();
>> my $preview_diag_dir = "/tmp/Preview_Diag.$year-$mon-$mday-$hour:$min:$sec";
>> print "$preview_diag_dir\n";
>> mkdir $preview_diag_dir, 0755;
>> chdir($preview_diag_dir) or die "Can't chdir into $preview_diag_dir $!\n";
>>
>> # Capture Preview thread dump
>> my $process_pattern = "preview";
>> my $preview_pid = `/usr/bin/pgrep -f $process_pattern`;
>> my $login = getpwuid($<) ;
>> if (kill 0, $preview_pid){
>>         #Possible to send a signal to the Preview Tomcat - either "webinf"
>> or "root"
>>         my $count = kill 3, $preview_pid;
>> }else {
>>         # Not possible to send a signal to the VCM - use "sudo"
>>         system ("/usr/bin/sudo /bin/kill -3 $preview_pid");
>> }
>>
>> # Capture Preview thread dump
>> system ("/usr/bin/jmap
>> -dump:format=b,file=$preview_diag_dir/preview_heapdump.hprof $preview_pid");
>>
>> # Gather the top threads; keep around for reference on what other threads
>> are running
>> @top_cmd = ("/usr/bin/top",  "-H", "-n1", "-b");
>> @sort_cmd = ("/bin/sort", "-r", "-n", "-k", "9,9");
>> @sed_cmd = ("/bin/sed", "-n", "'8,$p'");
>> system("@top_cmd 1> top_all_threads.log");
>>
>> # Get your tomcat user's threads, i.e. threads of user, "webinf"
>> system('/usr/bin/tail -n+6 top_all_threads.log | /bin/sort -r -n -k "9,9" |
>> /bin/grep webinf top_all_threads.log 1> top_user_webinf_threads.log');
>>
>> # Get the thread dump
>> my @output=`/usr/bin/jstack -l ${preview_pid}`;
>> open (my $file, '>', 'preview_threaddump.txt') or die "Could not open file:
>> $!";
>> print $file @output;
>> close $file;
>>
>> open LOG, "top_user_webinf_threads.log" or die $!;
>> open (STDOUT, "| tee -ai top_cpu_preview_threads.log");
>> print "PID\tCPU\tMem\tJStack Info\n";
>> while ($l = <LOG>) {
>>     chop $l;
>>     $pid = $l;
>>     $pid =~ s/webinf.*//g;
>>     $pid =~ s/ *//g;
>> ##  Hex PID is available in the Sun HotSpot Stack Trace */
>>     $hex_pid = sprintf("%#x", $pid);
>>     @values = split(/\s+/, $l);
>>     $pct = $values[8];
>>     $mem = $values[9];
>> # Debugger breakpoint:
>> $DB::single = 1;
>>
>> # Find the Java thread that corresponds to the thread-id from the TOP output
>>     for my $j (@output) {
>>         chop $j;
>>         ($j =~ m/nid=$hex_pid/)  &&  print $hex_pid . "\t" . $pct . "\t"
.
>> $mem . "\t" .  $j . "\n";
>>     }
>> }
>>
>> close (STDOUT);
>>
>> close LOG;
>>
>>
>> ------------------end of script --------------------------
>>
>> Thanks.
>>
>>                       -Shanti
>>
>>
>> On Thu, Sep 27, 2012 at 2:11 PM, Bill Miller <millebi.subscriptions@gmail.com>
wrote:
>>
>>
>>  I agree; we have reproducible instances where PermGen is not set to our
>> requirements on the Tomcat startup parameters and it will cause a "lockup"
>> every time. Do some JMX monitoring and you may discover a memory spike
>> that's killing Tomcat.
>>
>> Bill
>> -----Original Message-----
>> From: Jeff MAURY [mailto:jeffmaury@gmail.com <jeffmaury@gmail.com>]
>> Sent: September-27-2012 2:01 PM
>> To: Tomcat Users List
>> Subject: Re: high CPU usage on tomcat 7
>>
>> This is probably due to out of memory, I have the same problem on my ubuntu
>> ci machine Did you monitor your tomcat with jmx ?
>>
>> Jeff
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>
>>
>> --
>> *Kirill Kireyev, PhD*
>> Founder/CTO instaGrok.com <http://www.instagrok.com>
>> kirill@instagrok.com
>> Twitter: @instaGrok <http://twitter.com/InstaGrok>
>> FB: facebook.com/instagrok <http://www.facebook.com/instagrok>
>>  <http://www.instagrok.com>
>>
>
>

Mime
  • Unnamed multipart/related (inline, None, 0 bytes)
View raw message