From: "Carl"
To: "Tomcat Users List"
Subject: Re: Tomcat dies suddenly
Date: Fri, 5 Feb 2010 06:31:57 -0500

Mark,

Since I don't understand what is causing the problem, all information is
helpful and I appreciate you taking time to think about what could be
wrong.

> 1. The application runs fine on an older system. Do we have the glibc
> and kernel versions for all systems?

The old system: P4, 1GB memory, 1.3GB swap. Uses swap on a regular
basis. Kernel is 2.4.25. Java is 1.5.0_01-b08. Tomcat is 5.5.23.
Glibc is version 2.3.1.

New systems: Server A (Dell T110) is a Xeon 3440, server B (Dell T105) is
an AMD. A has 4GB memory and 19GB swap which is never used. B has 6GB
memory and 10GB swap which is never used. A and B both use kernel
version 2.6.29.6, Java 1.6.0_18-b07 and Tomcat 6.0.24. Glibc version is
4.3.3 for both A and B.

> 2. Different usage patterns (?) seem to cause the outages at different
> rates (if I remember an account of one Friday). What paths in the
> application were being exercised most heavily during that time?

The outages appear to be most frequent in times of heavy transaction
processing.
This part of the application is basically a shopping cart
although the path from start to 'in the cart' has many variations
depending on the type of item being registered for, i.e., the
registration process steps through 20+ classes each of which could
request additional processing, display a screen for user input, etc. It
seems during periods of heavy transactions, the system fails more
frequently but it may be that the application requires a certain
cumulative amount of activity before it fails and that activity can be
spread over several hours or several days. However, since Tomcat is
restarted once a day (around 1:00AM after rolling out changes), it would
seem that the application would not be able to carry activity from one
day into the next. Therefore, it would seem that the failure is
triggered by something on the day it occurs.

Thanks for your help.

Carl

----- Original Message -----
From: "Mark Eggers"
To: "Tomcat Users List"
Sent: Thursday, February 04, 2010 7:10 PM
Subject: RE: Tomcat dies suddenly

--- On Thu, 2/4/10, Caldarale, Charles R wrote:

> > 6. Carl was using 32-bit Linux, which he isn't :(
>
> Correct, which made the whole point moot, so I'm not sure
> why Dan even brought it up.
>

I just mentioned the 32-bit Linux behavior for completeness. I did state
that I realized 32-bit Linux is not in play.

> > AFAIK, 64-bit Linux has a wide-open memory addressing scheme. Maybe it
> > considers everything under 17 billion GiB to be "low memory", now :)
>
> No, the hardware restrictions don't exist in 64-bit mode.

This is what I've read as well. If you use 64-bit Linux, this problem
goes away. There are also some ways to build the 32-bit kernel in order
to reduce this problem.

All this is moot since a 64-bit Linux kernel is being used.

As to the copy-on-write behavior for fork()d processes, it would help if
I read the man pages:

    Under Linux, fork() is implemented using copy-on-write pages, so the
    only penalty that it incurs is the time and memory required to
    duplicate the parent's page tables, and to create a unique task
    structure for the child.

It turns out that things are a little bit more complicated than that, in
that since glibc version 2.3.3 fork() is actually a wrapper around
clone(2) with the appropriate flags to give the same result as a
traditional fork(2) call.

All of this is moot however if there is no Runtime.exec() call in the
application.

I'm a bit curious though about several points:

1. The application runs fine on an older system. Do we have the glibc
and kernel versions for all systems?

2. Different usage patterns (?) seem to cause the outages at different
rates (if I remember an account of one Friday). What paths in the
application were being exercised most heavily during that time?

As for cache / buffer / free - I've seen cases where the cache did not
go to 0, but swap was in play (slow disk, small amount of memory).

Sorry for chasing down the rabbit hole . . .

/mde/

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org
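
On question 1 (glibc and kernel versions for all systems), one way to gather the numbers consistently is to run the same short script on each machine. The following is a minimal Python sketch and not something from the thread: the function name system_versions is hypothetical, and platform.libc_ver() reports the C library that the Python interpreter itself is linked against, which is assumed here to be the system glibc. It may also be worth double-checking the reported glibc 4.3.3, since glibc releases are numbered 2.x and a value like that may come from a compiler version string instead.

```python
import platform


def system_versions():
    """Collect kernel, C library, and architecture details for one host.

    Hypothetical helper: the dictionary keys are illustrative only.
    """
    # libc_ver() returns a (name, version) tuple, e.g. ("glibc", "2.x");
    # both parts are empty strings if detection fails.
    libc_name, libc_version = platform.libc_ver()
    return {
        "kernel": platform.release(),    # kernel release, as shown by `uname -r`
        "libc": f"{libc_name} {libc_version}".strip(),
        "machine": platform.machine(),   # e.g. x86_64 on a 64-bit kernel
    }


if __name__ == "__main__":
    for key, value in system_versions().items():
        print(f"{key}: {value}")
```

Comparing the output from the old system and from servers A and B side by side would show whether the kernel or C library differs in a way that lines up with the failures.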