jakarta-jcs-users mailing list archives

From Joshua Szmajda <j...@loki.ws>
Subject Re: JCS remote server
Date Thu, 10 Apr 2008 14:29:07 GMT
I'd been deleting the logs, so I don't have one right now ><. I did 
change my scripts to save them though. As soon as it happens again I'll 
have some data. It seems to take about a week or so of running from a 
fresh start before I start to get problems.
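A minimal sketch of keeping the logs around before a restart wipes them. The function name archive_logs and both directory arguments are hypothetical; nothing here is taken from Josh's actual scripts, and it assumes the cache server writes plain *.log files into a single directory.

```shell
#!/bin/sh
# archive_logs LOG_DIR ARCHIVE_ROOT   (hypothetical helper, not part of JCS)
# Copies every *.log file from LOG_DIR into a new timestamped directory
# under ARCHIVE_ROOT and prints the path of that directory.
archive_logs() {
    log_dir=$1
    archive_root=$2
    dest="$archive_root/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    for f in "$log_dir"/*.log; do
        # Skip the unexpanded pattern when no log files exist.
        [ -e "$f" ] || continue
        cp -p "$f" "$dest/" || return 1
    done
    echo "$dest"
}
```

A restart job could call this just before it kills the server, so the logs from the failing run are still available afterwards.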

Niall: thanks for the explanation. I figured they were probably Byte 
arrays, but then I saw the Strings and that threw me off :).
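For reading the histogram quoted below: names such as [B, [I and [C are the JVM's internal names for arrays of primitives (byte[], int[] and char[]). A small sketch that translates them, assuming a hypothetical helper name decode_jvm_class; anything it does not recognise is passed through unchanged.

```shell
#!/bin/sh
# decode_jvm_class NAME   (hypothetical helper)
# Maps the JVM's one-letter primitive array names, as printed by
# 'jmap -histo', to the Java source-level type. Other names are echoed as-is.
decode_jvm_class() {
    case "$1" in
        "[B") echo "byte[]" ;;
        "[C") echo "char[]" ;;
        "[D") echo "double[]" ;;
        "[F") echo "float[]" ;;
        "[I") echo "int[]" ;;
        "[J") echo "long[]" ;;
        "[S") echo "short[]" ;;
        "[Z") echo "boolean[]" ;;
        *)    echo "$1" ;;
    esac
}
```

So the top entry in the histogram corresponds to byte[] instances, and the [C entries are the char[] buffers that usually sit behind java.lang.String objects.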

Anyway, as soon as I get some real data I'll post it to the list.

Thanks all!
-Josh

Aaron Smuts wrote:
> Do you have any of the cache logs when this is
> happening?
>
> I would turn the memory shrinker off (set the property
> to false), as a start.  I generally don't run with the
> memory shrinker on.  But I'm shooting in the dark.
>
> Aaron
>
>
> --- Joshua Szmajda <josh@loki.ws> wrote:
>
>> Ahh yes of course, it was the user requirement. Now I have a nice bunch
>> of data. This is interesting, but I'm not sure what the [B class is:
>>
>> num   #instances    #bytes  class name
>> --------------------------------------
>>   1:     31419   284852480  [B
>>   2:      2277    19760264  [I
>>   3:     57834     3865240  [C
>>   4:     29628     1896192  org.apache.jcs.engine.ElementAttributes
>>   5:     57838     1388112  java.lang.String
>> ...
>>
>> Niall Gallagher wrote:
>>> Hmm :D
>>>
>>> I just did a bit of digging. I've used this script on a few of our
>>> servers in the past (32 and 64bit server VMs), but I just found a server
>>> which gave me the exact same error message you got. That server, it turns
>>> out, runs Java under a different user account to the one I was logged
>>> into, however.
>>>
>>> Try running the script from the exact same user account the JVM process
>>> is running from. Even running from root didn't work for me on that
>>> server; it had to be the exact same user account, which is surprising.
>>>
>>> By the way those tools are documented here:
>>>
>>> http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jmap.html
>>>
>>> and
>>>
>>> http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstack.html
>>>
>>> -basically they're supposed to work on most platforms except Windows and
>>> Linux Itanium so unless you've got Itanium cpus it should work for you.
>>>
>>> On Wed, 2008-04-09 at 14:44 -0400, Joshua Szmajda wrote:
>>>
>>>> Hey Niall,
>>>>
>>>> Thanks for your script, but I'm getting these errors:
>>>> ./capture-diagnostics.sh RemoteCacheServerFactory
>>>> Capturing diagnostics for Java process "RemoteCacheServerFactory" (pid 2007)...
>>>> 2007: Unable to open socket file: target process not responding or HotSpot VM not loaded
>>>> The -F option can be used when the target process is not responding
>>>> 2007: Unable to open socket file: target process not responding or HotSpot VM not loaded
>>>> The -F option can be used when the target process is not responding
>>>> Saved diagnostics for "RemoteCacheServerFactory" to "RemoteCacheServerFactory-diagnostics.txt"
>>>>
>>>> There must be something I'm missing when I'm running the cache server. I
>>>> noticed it uses the 'server' VM by default, maybe these debug commands
>>>> are only good for the client VM?
>>>>
>>>> Thanks!
>>>> -Josh
>>>>
>>>> Niall Gallagher wrote:
>>>>> Hi Josh,
>>>>>
>>>>> Can you modify your cron job to capture diagnostics before it restarts
>>>>> the cache server?
>>>>>
>>>>> Then you can post the diagnostics next time it happens. The script below
>>>>> will capture diagnostics for you. We use something like this in-house
>>>>> for troubleshooting (not specifically for JCS).
>>>>>
>>>>> You'll first have to run the JDK 'jps' command from either root, or the
>>>>> user account which runs your cache server instance. This gives you the
>>>>> "name" of your cache server JVM process, which you need to supply to the
>>>>> diagnostics script as command-line parameter. The script uses the name
>>>>> to attach to the relevant JVM process.
>>>>>
>>>>> I don't know what might be causing the problem for you. It could be a
>>>>> bug in JCS, or it could be a memory issue. The diagnostics will help
>>>>> identify the problem.
>>>>>
>>>>> Save this as "capture-diagnostics.sh"...
>>>>> -------
>>>>> #!/bin/sh
>>>>> # Saves the stack traces and class memory usage information for a
>>>>> # Java process running on the machine to a diagnostics file.
>>>>> #
>>>>> # This script expects the name of the relevant Java process to be
>>>>> # specified as a parameter. The name specified should match a Java
>>>>> # process name as listed by running the JDK 'jps' command.
>>>>> #
>>>>> # Usage: sh capture-diagnostics.sh <name of process>
>>>>> APP_NAME="$1"
>>>>> JDK_LOCATION="/usr/java/default"
>>>>> DUMP_FILE="$APP_NAME-diagnostics.txt"
>>>>>
>>>>> # jps prints "<pid> <name>" per JVM; keep the pid of the matching line.
>>>>> APP_PID="$($JDK_LOCATION/bin/jps | grep "$APP_NAME" 2> /dev/null | cut -d' ' -f1)"
>>>>> if [ "$APP_PID" = "" ]; then
>>>>>     echo "ERROR: Can't determine pid of Java process name specified \"$APP_NAME\""
>>>>>     echo "Usage: sh capture-diagnostics.sh <name of process as listed by jps command>"
>>>>>     exit 20
>>>>> fi
>>>>> echo "Capturing diagnostics for Java process \"$APP_NAME\" (pid $APP_PID)..."
>>>>> # printf replaces 'echo -e', which a plain /bin/sh (e.g. dash) may not support.
>>>>> printf 'Diagnostics for Java process "%s" (pid %s) as at %s:-\n' "$APP_NAME" "$APP_PID" "$(date)" >> "$DUMP_FILE"
>>>>> printf '\nTop 30 memory-consuming classes:-\n' >> "$DUMP_FILE"
>>>>> $JDK_LOCATION/bin/jmap -histo:live $APP_PID | head -n33 >> "$DUMP_FILE"
>>>>> printf '\nThread stack traces:-\n' >> "$DUMP_FILE"
>>>>> $JDK_LOCATION/bin/jstack $APP_PID >> "$DUMP_FILE"
>>>>> printf '\n\n' >> "$DUMP_FILE"
>>>>> echo "Saved diagnostics for \"$APP_NAME\" to \"$DUMP_FILE\""
>>>>> -------
>>>>>
>>>>>
>>>>> On Wed, 2008-04-09 at 10:11 -0400, Joshua Szmajda wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> I've got a JCS remote cache server running on a machine and every now
>>>>>> and then it will spiral out of control and lock the machine. I have no
>>>>>> idea yet what's causing this, I've just put some extra measures in place
>>>>>> to capture the logs from when it happens. My solution at this point is a
>>>>>> cron job that checks now and then for excessive cpu usage and restarts
>>>>>> the cache server. I'd like to be able to not worry about it, though :).
>>>>>> Any suggestions?
>>>>>>
>>>>>> Thanks!
>>>>>> -Josh
>>>>>>
>>>>>> P.S. it's running on ubuntu-server (kernel 2.6.22-14-server).
>>>>>> I have up to 16 remote listeners connecting to any given region.
>>>>>> (probably 20 application instances in all).
>>>>>> Puts grow at a rate of about 400 per second.
>>>>>> I pass these options to java: "-Xms128m -Xmx2000m"
>>>>>> And here's my simple remote.cache.ccf:
> === message truncated ===
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: jcs-users-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: jcs-users-help@jakarta.apache.org
>
>   
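Niall's point earlier in the thread, that jmap and jstack had to be run from the same user account as the target JVM, can be checked before attaching. This is only a sketch: the function name same_user_as_pid is hypothetical, and it assumes either a Linux /proc filesystem or a ps that accepts "-o uid= -p".

```shell
#!/bin/sh
# same_user_as_pid PID   (hypothetical helper)
# Succeeds only when process PID exists and is owned by the current user.
same_user_as_pid() {
    if [ -r "/proc/$1/status" ]; then
        # Linux: the second field of the Uid: line is the real uid.
        owner_uid=$(awk '/^Uid:/ {print $2}' "/proc/$1/status")
    else
        # Fallback for systems without /proc.
        owner_uid=$(ps -o uid= -p "$1" 2>/dev/null | tr -d ' ')
    fi
    [ -n "$owner_uid" ] && [ "$owner_uid" = "$(id -u)" ]
}
```

A diagnostics script could call same_user_as_pid "$APP_PID" right after looking up the pid and print a hint to re-run as the JVM's owner when it fails, instead of letting jmap report "Unable to open socket file".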
