Return-Path: Delivered-To: apmail-tomcat-users-archive@www.apache.org Received: (qmail 40721 invoked from network); 6 Sep 2007 15:52:36 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 6 Sep 2007 15:52:36 -0000 Received: (qmail 93436 invoked by uid 500); 6 Sep 2007 15:52:17 -0000 Delivered-To: apmail-tomcat-users-archive@tomcat.apache.org Received: (qmail 93414 invoked by uid 500); 6 Sep 2007 15:52:17 -0000 Mailing-List: contact users-help@tomcat.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: "Tomcat Users List" Delivered-To: mailing list users@tomcat.apache.org Received: (qmail 93403 invoked by uid 99); 6 Sep 2007 15:52:17 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 06 Sep 2007 08:52:17 -0700 X-ASF-Spam-Status: No, hits=-4.0 required=10.0 tests=RCVD_IN_DNSWL_MED,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy) Received: from [195.227.30.246] (HELO datura.kippdata.de) (195.227.30.246) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 06 Sep 2007 15:52:12 +0000 Received: from [195.227.30.148] (larix [195.227.30.148]) by datura.kippdata.de (8.13.5/8.13.5) with ESMTP id l86Fpo1D010774 for ; Thu, 6 Sep 2007 17:51:50 +0200 (CEST) Message-ID: <46E02216.5000001@kippdata.de> Date: Thu, 06 Sep 2007 17:51:50 +0200 From: Rainer Jung User-Agent: Thunderbird 1.5.0.8 (X11/20061110) MIME-Version: 1.0 To: Tomcat Users List Subject: Re: mod_jk errors and http status code 503 under load References: <7E16AF149D48734F9EDE7165CD09D7C40498AF@SRV-EXVS01-DAL.intervoice.int> In-Reply-To: <7E16AF149D48734F9EDE7165CD09D7C40498AF@SRV-EXVS01-DAL.intervoice.int> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org Damien Corbishley wrote: > Hi Rainer, > > Apache: > Server version: Apache/2.0.55 > Server built: Mar 5 2007 11:10:17 > Server's Module Magic Number: 20020903:11 > > Yes, we are thinking about updating the Java, I was in the process of > checking the release notes to see if anything specific was mentioned. > > Mod JK: > mod_jk/1.2.15 OK, 1.2.15 is a little antique. At the moment we reached 1.2.25. > Worker or prefork MPM? - I don't understand the question, sorry. MPM = Multi processing module prefork = multiple processes, but only one thread per process worker = multiple processes with a couple of threads per process (default=25) The strings prefork or worker are also cintained in the output of "httpd -V". Example: ... Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" ... > enableLookups="false" redirectPort="8443" > protocol="AJP/1.3" > acceptCount="50" > minSpareThreads="240" > maxThreads="950" > backlog="50" > address="127.0.0.1" > debug="0" > /> You should have a connectionTimeout. > Tomcat is installed on the same machine as Apache. > > With respect to the number of connections to the tomcat port I can't > give a complete answer - looking at the crash I can see there are around > 4000 stuck at TIME_WAIT. Another 30 or so are at ESTABLISHED. All of those 4000 TIME_WAIT for port 8009? That would be very much, indicating, that you are rapidly opening and closing the connections between Apache/mod_jk and tomcat. Those connections are supposed to be reused for some time. > The mod jk configuration: This file only consits of comments? > ## THIS FILE MAY BE OVERRIDEN AT RUNTIME. MAKE SURE TOMCAT IS STOPED > ## WHEN YOU EDIT THE FILE. > > ## COMMENTS WILL BE _LOST_ > > ## DOCUMENTATION OF THE FORMAT IN JkMain javadoc. > > # Set the desired handler list > # handler.list=apr,request,channelJni > # > # Override the default port for the socketChannel > # channelSocket.port=8019 > # Default: > # channelUnix.file=${jkHome}/work/jk2.socket > # Just to check if the the config is working > # shm.file=${jkHome}/work/jk2.shm > > # In order to enable jni use any channelJni directive > # channelJni.disabled = 0 > # And one of the following directives: > > # apr.jniModeSo=/opt/apache2/modules/mod_jk2.so > > # If set to inprocess the mod_jk2 will Register natives itself > # This will enable the starting of the Tomcat from mod_jk2 > # apr.jniModeSo=inprocess > > bash-3.00# cat ./callflow-server/webserver/conf/workers.properties > workers.apache_log=/usr/local/apache2 This attribute never existed (so it's ignored). > workers.tomcat_home=/opt/tomcat > workers.java_home=/usr/j2se/ Thos two don't have a meaning any more. > worker.list= worker1 > > worker.worker1.type=ajp13 > > worker.worker1.host=127.0.0.1 > worker.worker1.port=8009 > worker.worker1.lbfactor=2.5 Not needed, if you are not using a JK load balancer. If you upgrade and later want to use a JK load balancer: newer JK versions use integer numbers here. > # Number of concurrent users. > worker.worker1.cachesize=950 > worker.worker1.cache_timeout=600 So set connectionTimeout in the AJP connector (server.xml) to 600000 (Milliseconds). > worker.worker1.socket_timeout=300 > > Yes, I agree with you about contacting SUN, I'm digging out our service > plan at the moment. I don't see a good reason, why you should get those error messages for this configuration. But you should update to a more recent mod_jk in order to research this problem from a better starting point (and maybe it will be gone). > Damien > > > -----Original Message----- > From: Rainer Jung [mailto:rainer.jung@kippdata.de] > Sent: Thursday, September 06, 2007 3:42 PM > To: Tomcat Users List > Subject: Re: mod_jk errors and http status code 503 under load > > Damien Corbishley wrote: >> Hi, >> >> I'm getting stuck with an error we are seeing on our production >> server under load. >> >> We upgraded our combination of Apache,Tomcat,Java all running on >> Solaris 10 to >> >> Apache 2_2.0.55.build2 > > This doesn't sound like a real Apache httpd version. You can check the > real version with "httpd -V" or when using log level info in the error > log of httpd. > > Worker or prefork MPM? > >> Tomcat 5.5.23 >> >> Java 1.5.0_06-b05 > > If you are in a general update situation, switch to 1.5.0_12 or whatever > > seems to be recent and not only a couple days old. > > And finally mod_jk-version? > >> Our Apache Tomcat connector is AJP/1.3. > > What's the configuration of the connector element in server.xml? > >> It's all installed on the same server, we have no tomcat clusters and >> no load balancers. >> >> With a few (10) concurrent users accessing the various JSP's >> everything worked as expected, all our functional tests passed. >> >> We opened more up live traffic to the server and starting seeing the >> following error message: >> >> [Thu Sep 06 02:23:00 2007] [error] ajp_service::jk_ajp_common.c >> (1758): Error connecting to tomcat. Tomcat is probably not started or >> is listening on the wrong port. worker=worker1 failed > > Is Tomcat on the same machine, or on another one? > > What do you get for "netstat -an| grep TOMCATPORT" where you replace > TOMCATPORT with the port number your Tomcat AJP connector listens on? > > How many ESTABLISHED, how many of other states (SYN_SENT, CLOSE_WAIT > etc.)? > > What's you mod_jk configuration? > >> On the client side this was being reported as a 503 and our >> "intelligent" browser was re-submitting the request and generally, >> >> on the second time, the request would be successful. >> >> After an hour we then had a Solaris panic, and after analyzing the >> crash using scat I can see that the cause was: > > Solaris panic looks very bad. I would definitely open a bug with sun. > I'm not saying that the original problem is there, but it is very > unusual, that the whole system panics. > >> panic string: BAD TRAP: type=9 rp=2a1015332d0 addr=0 mmu_fsr=0 >> >> and looking further into the trap, it seems to be network related. >> >> ip:tcp_conn_request+0x894 (0x4, 0x2000, 0x136ec00, >> 0x3a2b9a44040, 0x40e03960300, 0x40dedc247c0) >> >> ip:squeue_drain+0xe4 (0x300015ddec0, 0x40e2021c1c0, 0x411666de, >> 0x135f2c0, 0x3a2b9a44040, 0x3) >> >> ip:squeue_enter+0x358 (0x300015ddec0, 0x38fca5e0d80, 0x3000426b640, >> 0x3a2bd674c00, 0x0, 0x1) >> >> ip:tcp_wput - frame recycled >> >> >> >> I am still trying to work out if the crash is a symptom of the >> browsers resubmitting the failed requests and we've hit a Solaris bug >> that we haven't got a patch for, >> >> Or if the two errors are unrelated and it was coincidence that we >> panicked whilst having mod_jk errors. >> >> >> Any help/pointers in resolving the mod_jk error would be appreciated, >> I included the brief crash information as it may be relevant. >> >> Thanks >> >> Damien > > Regards, > > Rainer --------------------------------------------------------------------- To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org For additional commands, e-mail: users-help@tomcat.apache.org