From: Ralph Goers
Reply-To: rgoers@apache.org
Date: Fri, 23 Dec 2005 00:14:31 -0800
To: dev@cocoon.apache.org
Subject: Re: Cocoon 2.1.7 hang
Message-ID: <43ABB1E7.5090203@dslextreme.com>
In-Reply-To: <16D1C0D9-16F1-4333-ADC2-DF8BDEEAA75D@betaversion.org>

We are running Sun JDK 1.4.2_05 on RHEL 3. Tomcat is 5.0 something (I'm out of town at the moment, so I don't have access to the info).

Frankly, I suspect that ehcache is returning a bad document to Castor, although I don't have any proof. But if that is the case, I really would have expected Castor to get or throw an exception, not just call the classloader over and over.

The lock at 0x60b19148 was held by the last thread, which was

"http-8080-Processor26" daemon prio=1 tid=0x0821b148 nid=0x1e8b waiting for monitor entry [2cafd000..2caff87c]
    at java.lang.String.replace(String.java:1555)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:190)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
    - locked <0x60b1d920> (a sun.misc.Launcher$ExtClassLoader)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
    - locked <0x60b19148> (a sun.misc.Launcher$AppClassLoader)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
    - locked <0x60b19148> (a sun.misc.Launcher$AppClassLoader)

Ralph

Pier Fumagalli wrote:

> On 22 Dec 2005, at 18:16, Ralph Goers wrote:
>
>> We finally got some thread dumps from our production server. It
>> shows something very different than what we were seeing in testing.
>> First, this happens under light load after running for days. To
>> summarize, many threads are waiting for the ResourceLimitingPool and
>> several are waiting for the class loader. This system hasn't had the
>> pools tuned so I'm not surprised about pool contention, but I don't
>> believe that is the issue.
>> That is because the thread holding the
>> lock is simply waiting for the class loader.
>> We took two traces and both were similar, but not identical.
>> Different threads were holding the class loader lock in both.
>> However, in both cases the threads holding the class loader lock
>> were called from Castor while creating the portal layout.
>>
>> So far, we have been speculating that the problem is due to a
>> problem with the NPTL threads on Enterprise Linux 3. However, I'm
>> wondering if perhaps castor is having problems and simply calling
>> the class loader over and over.
>>
>> I'd appreciate any ideas.
>
>
> Ok, as far as I can see down the dumps you might have some problems
> with Catalina's classloader implementation locking up at 0x60b19148:
>
> at org.apache.catalina.loader.WebappClassLoader.loadClass
> (WebappClassLoader.java:1255)
>
> That seems odd though... I thought that code was debugged pretty
> thoroughly, unless, a secondary lock at 0x60cd9970 prevents the first
> one to be released...
>
> Anyhow, from my experience, NPTL don't cause any whatsoever problem
> under Linux, but that said, I'm running on Jetty 4 with BEA JRockit
> 1.4.2. What VM and what container are you actually using?
>
> Pier
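
For comparing thread dumps like the two mentioned above, it can help to group threads by the monitor address they hold or wait on. What follows is only a minimal sketch, assuming the dump uses the usual HotSpot line shapes (`- locked <0x...>` and `- waiting to lock <0x...>`). The function name `summarize_locks`, the sample text, and the second thread `http-8080-Processor27` are hypothetical illustrations, not taken from the real dumps.

```python
import re
from collections import defaultdict

# Assumed line shapes, based on typical HotSpot thread dumps.
THREAD_RE = re.compile(r'^"(?P<name>[^"]+)"')
LOCKED_RE = re.compile(r'-\s+locked\s+<(?P<addr>0x[0-9a-fA-F]+)>')
WAITING_RE = re.compile(r'-\s+waiting to lock\s+<(?P<addr>0x[0-9a-fA-F]+)>')


def summarize_locks(dump_text):
    """Map each monitor address to the threads holding it and waiting on it."""
    holders = defaultdict(set)
    waiters = defaultdict(set)
    current = None  # name of the thread whose stack is being read
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = header.group("name")
            continue
        if current is None:
            continue  # ignore anything before the first thread header
        locked = LOCKED_RE.search(line)
        if locked:
            holders[locked.group("addr")].add(current)
            continue
        waiting = WAITING_RE.search(line)
        if waiting:
            waiters[waiting.group("addr")].add(current)
    addresses = set(holders) | set(waiters)
    return {
        addr: {
            "held_by": sorted(holders[addr]),
            "waiting": sorted(waiters[addr]),
        }
        for addr in addresses
    }


# Hypothetical sample: the first thread mirrors the trace in this message,
# the second thread is invented purely to show a waiter.
SAMPLE = '''
"http-8080-Processor26" daemon prio=1 tid=0x0821b148 nid=0x1e8b waiting for monitor entry
    at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
    - locked <0x60b19148> (a sun.misc.Launcher$AppClassLoader)

"http-8080-Processor27" daemon prio=1 waiting for monitor entry
    at java.lang.ClassLoader.loadClass(ClassLoader.java:282)
    - waiting to lock <0x60b19148> (a sun.misc.Launcher$AppClassLoader)
'''

if __name__ == "__main__":
    for addr, info in sorted(summarize_locks(SAMPLE).items()):
        print(addr, "held by", info["held_by"], "waited on by", info["waiting"])
```

Running this over each dump and comparing the results would show whether the same monitor address has a different holder each time, which is the pattern described for the class loader lock.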