tomcat-users mailing list archives

From Jesse Klaasse <jesse.klaa...@indicia.nl>
Subject Re: IIS 6.0 / JK1.2.25 / Tomcat 5.5.20 - "Service temporary unavailable"
Date Thu, 03 Jul 2008 09:09:23 GMT

Hello Rainer,

First of all, thank you for your extensive answer and the time you took to
write it. This really gives me hope.


Rainer Jung-3 wrote:
> 
> Double check: The worker is a member of a load balancer. The member is 
> *not* in state STOP (because that is a configuration state) but in ERROR 
> (which is a runtime detected state).
> 

You are right, the worker is (the only) member of a load balancer. It is in
"OK" state until Tomcat hangs. Then it changes into "ERROR".


Rainer Jung-3 wrote:
> 
> First: you don't use a reply_timeout? At this stage you shouldn't, just 
> want to make sure.
> 

I haven't configured a reply_timeout.


Rainer Jung-3 wrote:
> 
> How to do thread dumps: if Tomcat is running from a DOS box, you can use 
> CTRL-Break on the keyboard (and the dumps go directly to the DOS box), 
> if it is running as a service, there is an entry in the context menu of 
> the tomcat monitor icon (system tray), and the dumps go to the service 
> log file.
> 

Tomcat was hanging a few minutes ago, and I have created some thread dumps,
which are available in the uploaded ZIP file.


Rainer Jung-3 wrote:
> 
> Use "netstat -an" on the IIS system and the Tomcat system (if they are 
> not the same) to produce a list of TCP connections and their state.
> 

For your information: Tomcat and IIS are on the same system. I have also
included a few netstat logs taken at the moment of the hang. They can also be
found in the attached zip file.
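
For anyone reading such netstat logs, a per-state tally is often easier to scan than the raw list. The following is only a minimal sketch, assuming the usual Windows `netstat -an` column layout (Proto, Local Address, Foreign Address, State). The function name `count_tcp_states`, the sample text, and port 8009 (the common AJP default, not confirmed anywhere in this thread) are assumptions for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output, port=None):
    """Count TCP connection states in `netstat -an` output.

    If `port` is given, only connections whose local or foreign
    address ends in that port are counted.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: Proto, Local Address, Foreign Address, State.
        # Header lines and UDP lines (which have no state) are skipped.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, foreign, state = fields[1], fields[2], fields[3]
        if port is not None:
            suffix = ":" + str(port)
            if not (local.endswith(suffix) or foreign.endswith(suffix)):
                continue
        counts[state] += 1
    return counts


# Hypothetical sample, NOT taken from the attached logs.
sample = """\
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:8009           0.0.0.0:0              LISTENING
  TCP    127.0.0.1:8009         127.0.0.1:1234         ESTABLISHED
  TCP    127.0.0.1:1234         127.0.0.1:8009         ESTABLISHED
  TCP    127.0.0.1:1300         127.0.0.1:8009         TIME_WAIT
  TCP    127.0.0.1:80           127.0.0.1:1400         ESTABLISHED
"""

if __name__ == "__main__":
    for state, n in sorted(count_tcp_states(sample, port=8009).items()):
        print(state, n)
```

Comparing the tallies of a snapshot taken during normal operation with one taken during a hang may show whether connections pile up in one particular state.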


Rainer Jung-3 wrote:
> 
> If possible use wireshark to produce a full packet dump of the 
> communications between the two for a minute or so, namely long enough, 
> that the cited log messages occur a few times.
> 

I have downloaded and installed Wireshark. I have included a few minutes of
Wireshark captured data in the zip file too.


Rainer Jung-3 wrote:
> 
> - remove the socket_timeout
> and
> - remove the APR connector (tcnative)
> 
> If this solves the problem, check, if removing only one of them suffices.
> If this quick test indicates APR connector as problematic, upgrade to 
> 1.1.13 (or the soon to appear 1.1.14).
> 

I have already tried to remove the APR connector, but this was really not a
good idea. Without APR, Tomcat hung after only one hour of normal use. With
APR, it lasts for about half a day. During the last hang, I downgraded APR
to 1.1.10, which we were using before 1.1.12, and which seems to be a little
more stable. I haven't been able to find 1.1.13 for Windows x64. Is it
available? I tried the http://tomcat.heanet.ie/native/ link.

Should I really try to remove the socket_timeout? Should I try this before
setting the reply_timeout to 60 seconds, as you state later in your mail?


Rainer Jung-3 wrote:
> 
> The log information in 1.2.26 should be more precise though. At least 
> for me ;)
> 

When we used 1.2.26, logging was more precise indeed. But it seemed to be
less stable, although I'm not sure if this has anything to do with the
connector version, since I also changed the tcnative version.


Rainer Jung-3 wrote:
> 
> Here I guess: since there was no reply_timeout set, the socket_timeout 
> fires after 10 seconds, aborts the wait and resets the connection. If 
> you can log response times with IIS, you could check, if they are above 
> 10 seconds. You can also log response times with an appropriate 
> JkRequestLogFormat.
> 

How should I set the JkRequestLogFormat? Isn't that an Apache (webserver)
directive? I am (and have to be - company policies) using IIS.
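
On the IIS side, the response-time check mentioned in the quoted text could be sketched roughly as follows. This assumes W3C extended logging with the optional `time-taken` field enabled (reported in milliseconds) and a `#Fields:` header line in the log. The function name `slow_requests` and the sample entries are hypothetical, and the 10000 ms threshold simply mirrors the 10 second socket_timeout discussed above.

```python
def slow_requests(log_lines, threshold_ms=10000):
    """Yield (uri, time_taken_ms) for log entries slower than threshold_ms.

    Column positions are read from the '#Fields:' directive, so the
    function does not depend on which optional fields are enabled.
    """
    fields = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        # Skip blank lines, other directives, and data before any header.
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # malformed or truncated entry
        entry = dict(zip(fields, values))
        try:
            taken = int(entry["time-taken"])
        except (KeyError, ValueError):
            continue  # time-taken not logged or not numeric
        if taken > threshold_ms:
            yield entry.get("cs-uri-stem", "?"), taken


# Hypothetical sample entries, NOT real log data from this system.
sample_log = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2008-07-03 09:00:01 GET /app/fast.jsp 200 120",
    "2008-07-03 09:00:05 GET /app/slow.jsp 503 10015",
]

if __name__ == "__main__":
    for uri, ms in slow_requests(sample_log):
        print(uri, ms)
```

Reading the column order from the `#Fields:` line keeps the sketch independent of the exact set of fields selected in the IIS logging properties.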


Rainer Jung-3 wrote:
> 
> You could set a reply timeout to a huge value, e.g. 60 seconds, if 
> you think that even under load *all* requests should return in less than 
> 60 seconds. We can optimize this setting later (with max_reply_timeouts 
> in 1.2.26).
> 

I will try this, but not yet. Not all at the same time :)


Rainer Jung-3 wrote:
> 
> You could try TCP tuning like in
> http://support.microsoft.com/kb/191143
> 
> but I doubt that this will resolve the root cause.
> 

This sounds unlikely to me too, so this may be a last resort.


Rainer Jung-3 wrote:
> 
> Aha, if this is really coming before the error "60", then you should also 
> look at:
> 
> http://support.microsoft.com/kb/931319/
> 

Sounds like it could help. I have installed the hotfix. But the system needs
to be restarted in order to activate the hotfix (argh!), something I can't
just do when the traffic is high. Maybe I'll reboot the server tonight.


Rainer Jung-3 wrote:
> 
> Maybe too many suggestions and not a straight solution, but if you are 
> able to collect more information, we should be able to sort this out.
> 

I hope the logs and dumps will help you. I will also look into them myself.


Rainer Jung-3 wrote:
> 
> Do others have the same issue on Windows? Did they find a solution?
> 

I have searched all over the web, and there is a lot of information about
this whole setup, but it's very fragmented and opinions vary widely.

Again, thank you very much for your help and time so far. I hope we will be
able to resolve this problem!

http://www.nabble.com/file/p18255109/20080703_tomcat_hang_dumps.zip
20080703_tomcat_hang_dumps.zip 
-- 
View this message in context: http://www.nabble.com/IIS-6.0---JK1.2.25---Tomcat-5.5.20---%22Service-temporary-unavailable%22-tp18238896p18255109.html
Sent from the Tomcat - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

