Message-ID: <18255109.post@talk.nabble.com>
Date: Thu, 3 Jul 2008 02:09:23 -0700 (PDT)
From: Jesse Klaasse
To: users@tomcat.apache.org
Subject: Re: IIS 6.0 / JK1.2.25 / Tomcat 5.5.20 - "Service temporary unavailable"
In-Reply-To: <486BD717.7030304@kippdata.de>

Hello Rainer,

First of all, thank you for your extensive answer and the time you took to write it; this really gives me hope.

Rainer Jung-3 wrote:
>
> Double check: The worker is a member of a load balancer. The member is
> *not* in state STOP (because that is a configuration state) but in ERROR
> (which is a runtime detected state).
>

You are right, the worker is (the only) member of a load balancer. It is in "OK" state until Tomcat hangs. Then it changes to "ERROR".

Rainer Jung-3 wrote:
>
> First: you don't use a reply_timeout? At this stage you shouldn't, just
> want to make sure.
>

I haven't configured a reply_timeout.

Rainer Jung-3 wrote:
>
> How to do thread dumps: if Tomcat is running from a DOS box, you can use
> CTRL-Break on the keyboard (and the dumps go directly to the DOS box),
> if it is running as a service, there is an entry in the context menu of
> the tomcat monitor icon (system tray), and the dumps go to the service
> log file.
>

Tomcat was hanging a few minutes ago, and I have created some thread dumps, which are available in the uploaded ZIP file.

Rainer Jung-3 wrote:
>
> Use "netstat -an" on the IIS system and the Tomcat system (if they are
> not the same) to produce a list of TCP connections and their state.
>

For your information: Tomcat and IIS are on the same system. I have also included a few netstat logs taken at the moment of the hang. They can also be found in the attached ZIP file.

Rainer Jung-3 wrote:
>
> If possible use wireshark to produce a full packet dump of the
> communications between the two for a minute or so, namely long enough,
> that the cited log message occur a few times.
>

I have downloaded and installed Wireshark. I have included a few minutes of Wireshark capture data in the ZIP file too.
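
Editorial sketch, not part of the original message: the "netstat -an" listings mentioned above are easier to compare between a healthy and a hung system when the TCP states are counted per port. The Python below is a minimal sketch under stated assumptions: the function name count_tcp_states is hypothetical, port 8009 is only the usual AJP default and is not confirmed anywhere in this thread, and the sample text is invented to mimic the Windows netstat layout.

```python
from collections import Counter

def count_tcp_states(netstat_text, port=8009):
    """Count TCP states for connections whose local or remote end uses `port`.

    Assumption: 8009 is the common AJP default; use the port that is
    actually configured for the Tomcat AJP connector.
    """
    states = Counter()
    suffix = ":" + str(port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Windows "netstat -an" TCP rows: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            states[state] += 1
    return states

# Invented sample input, only to show the expected layout.
sample = """\
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:8009           0.0.0.0:0              LISTENING
  TCP    127.0.0.1:8009         127.0.0.1:1433         ESTABLISHED
  TCP    127.0.0.1:1433         127.0.0.1:8009         ESTABLISHED
  TCP    127.0.0.1:1500         127.0.0.1:8009         CLOSE_WAIT
  TCP    127.0.0.1:80           10.0.0.5:4000          ESTABLISHED
  UDP    0.0.0.0:445            *:*
"""

print(count_tcp_states(sample))
```

Because IIS and Tomcat run on the same machine here, each AJP connection shows up twice in the listing (once from each end), so the counts should be read with that in mind.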

Rainer Jung-3 wrote:
>
> - remove the socket_timeout
> and
> - remove the APR connector (tcnative)
>
> If this solves the problem, check, if removing only one of them suffices.
> If this quick test indicates APR connector as problematic, upgrade to
> 1.1.13 (or the soon to appear 1.1.14).
>

I have already tried removing the APR connector, but this was really not a good idea. Without APR, Tomcat hung after only one hour of normal use. With APR, it lasts for about half a day. During the last hang, I downgraded APR to 1.1.10, which we were using before 1.1.12, and which seems to be a little more stable. I haven't been able to find 1.1.13 for Windows x64. Is it available? I tried the http://tomcat.heanet.ie/native/ link.

Should I really try to remove the socket_timeout? Should I try this before setting the reply_timeout to 60 seconds, as you suggest later in your mail?

Rainer Jung-3 wrote:
>
> The log information in 1.2.26 should be more precise though. At least
> for me ;)
>

When we used 1.2.26, logging was indeed more precise. But it seemed to be less stable, although I'm not sure whether this has anything to do with the connector version, since I also changed the tcnative version.

Rainer Jung-3 wrote:
>
> Here I guess: since there was no reply_timeout set, the socket_timeout
> fires after 10 seconds, aborts the wait and resets the connection. If
> you can log response times with IIS, you could check, if they are above
> 10 seconds. You can also log response times with an appropriate
> JkRequestLogFormat.
>

How should I set the JkRequestLogFormat? Isn't that an Apache (webserver) directive? I am using IIS (and have to, because of company policy).

Rainer Jung-3 wrote:
>
> You could set a reply timeout to a huge value, like eg. 60 seconds, if
> you think that even under load *all* requests should return in less than
> 60 seconds. We can optimize this setting later (with max_reply_timeouts
> in 1.2.26).
>

I will try this, but not yet.
Not all at the same time :)

Rainer Jung-3 wrote:
>
> You could try TCP tuning like in
> http://support.microsoft.com/kb/191143
>
> but I doubt, that this will resolve the root cause.
>

This sounds unlikely to me too, so it will be a last resort.

Rainer Jung-3 wrote:
>
> Aha, if this is really coming before the error "60", then you should also
> look at:
>
> http://support.microsoft.com/kb/931319/
>

This sounds like it could help. I have installed the hotfix. However, the system needs to be restarted in order to activate the hotfix (argh!), which I can't just do while traffic is high. Maybe I'll reboot the server tonight.

Rainer Jung-3 wrote:
>
> Maybe too many suggestions and not a straight solution, but if you are
> able to collect more information, we should be able to sort this out.
>

I hope the logs and dumps will help you. I will also look into them myself.

Rainer Jung-3 wrote:
>
> Do others have the same issue on Windows? Did they find a solution?
>

I have searched all over the web, and there is a lot of information about this whole setup, but it's very fragmented and opinions vary widely.

Again, thank you very much for your help and time so far. I hope we will be able to resolve this problem!

http://www.nabble.com/file/p18255109/20080703_tomcat_hang_dumps.zip 20080703_tomcat_hang_dumps.zip
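
Editorial sketch, not part of the original message: Rainer's suggestion to check whether IIS response times exceed the 10 second socket_timeout can be tried without JkRequestLogFormat, assuming IIS writes W3C Extended logs with the time-taken field enabled (that field is reported in milliseconds). The Python below is a minimal sketch under those assumptions; the function name slow_requests is hypothetical and the sample log lines are invented for illustration, not taken from the uploaded ZIP file.

```python
def slow_requests(log_text, threshold_ms=10000):
    """Return (uri, time_taken_ms) pairs for requests slower than threshold_ms.

    Assumes a W3C Extended log whose "#Fields:" directive lists "time-taken".
    """
    fields = []
    slow = []
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            # The directive names the columns of the rows that follow it.
            fields = line[len("#Fields:"):].split()
            continue
        if not line.strip() or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        taken = row.get("time-taken", "")
        if taken.isdigit() and int(taken) > threshold_ms:
            slow.append((row.get("cs-uri-stem", "?"), int(taken)))
    return slow

# Invented sample input, only to show the expected layout.
sample_log = """\
#Software: Microsoft Internet Information Services 6.0
#Fields: date time cs-method cs-uri-stem sc-status time-taken
2008-07-03 09:01:00 GET /app/fast.jsp 200 120
2008-07-03 09:01:05 GET /app/slow.jsp 200 15200
2008-07-03 09:01:30 GET /app/hang.jsp 503 10016
"""

print(slow_requests(sample_log))
```

The default threshold of 10000 ms mirrors the 10 second socket_timeout discussed in the thread; requests that cluster just above that value would be consistent with the timeout firing rather than with Tomcat answering slowly.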