From Sam Hokin <>
Subject Re: very slow class loading on initial JSP/servlet request after restart
Date Tue, 24 Feb 2009 21:16:11 GMT
Juha Laiho wrote:
> One tool that I haven't yet seen suggested is 'strace', the Linux system
> call tracer. This will show all the calls your application makes to the
> operating system. As you say the application is mostly idle during the
> delay, it is, in one way or another, waiting for some OS service to
> complete. 'strace' should provide you with timestamped information on
> what OS services were called, with which arguments, and how long did
> it take for them to return with results. 'strace' will leave you with
> a huge file (or a set of huge files, depending on the options you use),
> and going through them will take some time - but you'll most likely
> also find what causes the delay.

Thanks, Juha.  Actually Pieter suggested it a little while ago, and I've been
trying to get some information out of strace.  The best I can do is to put
strace in front of the java command that's inside
 That's the command that shows with ps -ef when Tomcat is running.  BUT, I get
nothing out of strace when I make page requests on a site; it just shows output
during Tomcat startup.  So, I've not figured out how to get strace to say what
the JVM is doing during the delay.  jstack has led us to a stalled
File.exists() in one case, but we don't know what file it's looking for.  And
I'm not convinced that File.exists() is the only method that's stalling.
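
One way to narrow down which file is involved might be to time File.exists()
directly against a list of candidate paths, run as the same user and on the same
server as Tomcat.  The following is only a rough sketch: the class name
ExistsTimer and the idea of passing candidate paths on the command line are
assumptions for illustration, and it says nothing about which paths Tomcat
actually probes.

```java
import java.io.File;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Tomcat): measures how long
// File.exists() takes for each candidate path.
class ExistsTimer {

    /** Returns the elapsed milliseconds of File.exists() per path, in input order. */
    static Map<String, Long> timeExists(List<String> paths) {
        Map<String, Long> elapsed = new LinkedHashMap<String, Long>();
        for (String path : paths) {
            long start = System.nanoTime();
            new File(path).exists();   // result ignored; only the latency matters here
            long millis = (System.nanoTime() - start) / 1000000L;
            elapsed.put(path, millis);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Candidate paths are supplied by the caller, e.g. webapp or library directories.
        for (Map.Entry<String, Long> e : timeExists(Arrays.asList(args)).entrySet()) {
            System.out.println(e.getValue() + " ms  " + e.getKey());
        }
    }
}
```

A path that takes seconds rather than well under a millisecond would be worth a
closer look (a stale network mount, for instance), but that is a guess until
measured.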

Since this problem exists only on a production server, a server on which I must
still serve at least two customer sites (due to DNS issues) in addition to our
own and any others I put on there, I'm a bit restricted in terms of how much I
can muck with it (not that I haven't brought those live sites down for awkward
periods of time with the diagnosis I've attempted so far).  I wish I had a test
environment on another server that replicates this issue, but my other two
servers run Tomcat perfectly fast, and since I don't understand what's causing
the problem, I cannot make one of my other servers reproduce it.

Another diagnostic problem is that undeploying a context with the Tomcat
/manager app, and then starting it again, does NOT reset this problem - the
response to a JSP request is immediate (provided it had been requested since
the last Tomcat startup).  This problem is only reset on a given JSP if I
restart Tomcat entirely; I can reproduce it by creating fresh JSPs with new
names and requesting them.

But, clearly, the key diagnostic issue is finding out WHAT is going on during
the delay that a JSP incurs when it is first requested of a given Tomcat
instance.  I've not been able to find out from strace.  I'll give truss -f and
truss -ff a try.
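
If a trace of the request-handling threads can be captured, the huge output
could be filtered for slow calls rather than read by hand.  The sketch below
assumes a trace written with something like "strace -f -tt -T -o <file>", where
-f follows threads and child processes and -T appends the time spent in each
call as "<seconds>" at the end of the line.  The class name StraceSlowCalls and
the 1-second threshold are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter for strace output recorded with the -T option.
class StraceSlowCalls {

    // With -T, each completed call ends in "<seconds>", e.g. "<0.000042>".
    private static final Pattern DURATION = Pattern.compile("<(\\d+\\.\\d+)>\\s*$");

    /** Returns the lines whose recorded call duration is at least minSeconds. */
    static List<String> slowCalls(List<String> lines, double minSeconds) {
        List<String> slow = new ArrayList<String>();
        for (String line : lines) {
            Matcher m = DURATION.matcher(line);
            // Lines without a duration (signals, exit markers) are skipped.
            if (m.find() && Double.parseDouble(m.group(1)) >= minSeconds) {
                slow.add(line);
            }
        }
        return slow;
    }

    public static void main(String[] args) throws Exception {
        // ISO-8859-1 decodes any byte, so odd bytes in path names cannot abort the read.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        for (String line : slowCalls(lines, 1.0)) {
            System.out.println(line);
        }
    }
}
```

Run against the trace file, this prints only the calls that took a second or
longer, each with its arguments, which should include the path name if the slow
call turns out to be a file lookup.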

