Mailing-List: contact jcs-users-help@jakarta.apache.org; run by ezmlm
Subject: Re: JCS remote server
From: Niall Gallagher
To: JCS Users List
In-Reply-To: <47FCCE98.10109@loki.ws>
References: <47FCCE98.10109@loki.ws>
Content-Type: multipart/alternative; boundary="=-I5sn4obF8fpO//1hgxTP"
Organization: Switchfire Ltd.
Date: Wed, 09 Apr 2008 19:02:50 +0100
Message-Id: <1207764170.7082.44.camel@localhost>
Mime-Version: 1.0

--=-I5sn4obF8fpO//1hgxTP
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

Hi Josh,

Can you modify your cron job to capture diagnostics before it restarts
the cache server? Then you can post the diagnostics the next time it
happens.

The script below will capture diagnostics for you. We use something
like this in-house for troubleshooting (not specifically for JCS).

You'll first have to run the JDK 'jps' command as either root or the
user account that runs your cache server instance. This gives you the
"name" of your cache server JVM process, which you need to supply to
the diagnostics script as a command-line parameter. The script uses the
name to attach to the relevant JVM process.

I don't know what might be causing the problem for you. It could be a
bug in JCS, or it could be a memory issue. The diagnostics will help
identify the problem.

Save this as "capture-diagnostics.sh"...

-------
#!/bin/sh
# Saves the stack traces and class memory usage information for a
# Java process running on the machine to a diagnostics file.
#
# This script expects the name of the relevant Java process to be
# specified as a parameter. The name specified should match a Java
# process name as listed by running the JDK 'jps' command.
#
# Usage: sh capture-diagnostics.sh <java-process-name>

APP_NAME="$1"
JDK_LOCATION="/usr/java/default"
DUMP_FILE="$APP_NAME-diagnostics.txt"
# With no argument, grep errors out (silenced) and APP_PID stays empty.
APP_PID="`$JDK_LOCATION/bin/jps | grep $APP_NAME 2> /dev/null | cut -d' ' -f1`"

if [ "$APP_PID" = "" ]; then
    echo "ERROR: Can't determine pid of Java process name specified \"$APP_NAME\""
    echo "Usage: sh capture-diagnostics.sh <java-process-name>"
    exit 20
fi

echo "Capturing diagnostics for Java process \"$APP_NAME\" (pid $APP_PID)..."
# printf is used rather than 'echo -e', which not every /bin/sh supports
# (dash, the default sh on Ubuntu, prints the "-e" literally).
printf 'Diagnostics for Java process "%s" (pid %s) as at %s:-\n' "$APP_NAME" "$APP_PID" "`date`" >> "$DUMP_FILE"
printf '\nTop 30 memory-consuming classes:-\n' >> "$DUMP_FILE"
# 33 lines = the histogram's 3 header lines plus the top 30 classes.
$JDK_LOCATION/bin/jmap -histo:live $APP_PID | head -n33 >> "$DUMP_FILE"
printf '\nThread stack traces:-\n' >> "$DUMP_FILE"
$JDK_LOCATION/bin/jstack $APP_PID >> "$DUMP_FILE"
printf '\n\n' >> "$DUMP_FILE"

echo "Saved diagnostics for \"$APP_NAME\" to \"$DUMP_FILE\""
-------

On Wed, 2008-04-09 at 10:11 -0400, Joshua Szmajda wrote:
> Hey all,
>
> I've got a JCS remote cache server running on a machine and every now
> and then it will spiral out of control and lock the machine. I have no
> idea yet what's causing this, I've just put some extra measures in place
> to capture the logs from when it happens. My solution at this point is a
> cron job that checks now and then for excessive cpu usage and restarts
> the cache server. I'd like to be able to not worry about it, though :).
>
> Any suggestions?
>
> Thanks!
> -Josh
>
> P.S. it's running on ubuntu-server (kernel 2.6.22-14-server).
> I have up to 16 remote listeners connecting to any given region.
> (probably 20 application instances in all).
> Puts grow at a rate of about 400 per second.
> I pass these options to java: "-Xms128m -Xmx2000m"
> And here's my simple remote.cache.ccf:
>
> registry.host=localhost
> registry.port=10021
> remote.cluster.LocalClusterConsistency=true
> remote.cluster.AllowClusterGet=true
>
> jcs.default=DC
> jcs.default.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
> jcs.default.cacheattributes.MaxObjects=10000
> jcs.default.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache
> jcs.default.cacheattributes.UseMemoryShrinker=true
> jcs.default.cacheattributes.MaxMemoryIdleTimeSeconds=3600
> jcs.default.cacheattributes.ShrinkerIntervalSeconds=60
> jcs.default.elementattributes=org.apache.jcs.engine.ElementAttributes
> jcs.default.elementattributes.IsEternal=false
> jcs.default.elementattributes.MaxLifeSeconds=86400
> jcs.default.elementattributes.IdleTime=7200
> jcs.default.elementattributes.IsSpool=true
> jcs.default.elementattributes.IsRemote=true
> jcs.default.elementattributes.IsLateral=true
>
> jcs.auxiliary.DC=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheFactory
> jcs.auxiliary.DC.attributes=org.apache.jcs.auxiliary.disk.indexed.IndexedDiskCacheAttributes
> jcs.auxiliary.DC.attributes.DiskPath=/var/tmp/jcsServer
> jcs.auxiliary.DC.attributes.MaxPurgatorySize=10000
> jcs.auxiliary.DC.attributes.MaxKeySize=100000
> jcs.auxiliary.DC.attributes.OptimizeAtRemoveCount=300000
> jcs.auxiliary.DC.attributes.OptimizeOnShutdown=true
> jcs.auxiliary.DC.attributes.MaxRecycleBinSize=7500
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: jcs-users-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: jcs-users-help@jakarta.apache.org

--=-I5sn4obF8fpO//1hgxTP--
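
The suggestion in this thread, to capture diagnostics before the cron job restarts the cache server, could be wired up roughly as follows. This is only a sketch under stated assumptions: the function names are made up for illustration, RESTART_CMD is a hypothetical placeholder for however the server is actually restarted, and CAPTURE_CMD assumes the script above was saved as "capture-diagnostics.sh" in the working directory.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: capture diagnostics BEFORE restarting.
# CAPTURE_CMD and RESTART_CMD are assumptions; adjust both to your setup.
CAPTURE_CMD="${CAPTURE_CMD:-sh capture-diagnostics.sh}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/jcs-remote-server restart}"

# Succeeds when the CPU figure $1 (e.g. "93.5") is at or above threshold $2.
# awk does the comparison because sh has no floating-point arithmetic.
cpu_exceeds() {
    awk -v cpu="$1" -v max="$2" 'BEGIN { exit !(cpu + 0 >= max + 0) }'
}

# Prints the %CPU that ps reports for pid $1 (nothing if it is not running).
# Note: on Linux, ps reports an average over the lifetime of the process,
# not an instantaneous reading.
cpu_of_pid() {
    ps -o %cpu= -p "$1" 2> /dev/null | tr -d ' '
}

# $1 = process name as listed by jps, $2 = CPU figure, $3 = threshold.
# Runs the capture command first, then the restart command.
capture_then_restart_if_busy() {
    [ -n "$2" ] || return 0          # no reading: process is not running
    if cpu_exceeds "$2" "$3"; then
        $CAPTURE_CMD "$1"
        $RESTART_CMD
    fi
}
```

The cron job would then look up the server's pid and call something like capture_then_restart_if_busy NAME "$(cpu_of_pid PID)" 90, where NAME is the process name shown by jps and 90 is an arbitrary example threshold. Keeping the capture step ahead of the restart is the point: once the JVM is restarted, the thread stacks and heap histogram of the misbehaving process are gone.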