db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bergquist, Brett" <BBergqu...@canoga.com>
Subject How to present a heap dump to the list - Was: Had Derby 10.8.2.2 fail today and need some thoughts
Date Thu, 01 Mar 2012 14:00:45 GMT
I have a couple of heap dumps that I can bring up in jvisualvm.    One is an 8Gb dump of where
Derby had an OOM when doing a large query and one is just a snapshot of the heap of a running
system that I see about 121K  instances of "org.apache.derby.jdbc.XATransactionState$CancelXATransactionTask".
  Jvisualvm is able to load these heap dumps but does not seem to have a good way of outputting
a report.  

I could do a screen capture and attach a JPEG, but I want to make sure this is proper etiquette
for the list before doing so.

So any advice on this will be appreciated.

Brett

-----Original Message-----
From: Bergquist, Brett [mailto:BBergquist@canoga.com] 
Sent: Tuesday, February 28, 2012 7:41 PM
To: derby-dev@db.apache.org
Subject: RE: Had Derby 10.8.2.2 fail today and need some thoughts

Hardware is Oracle M5000 running Solaris 10.   Database engine is controlled through SMF of
Solaris.   derby.properties is setup to always append to the log, never truncate.  There definitely
is not more than one process starting the Network Server.  

Here is another tidbit.   Even though the log says the Derby engine was shutting down, the
process was still alive (prstat showed this) and the SMF log also showed that the Network
Server process never terminated.

>From a previous capture of the "derby.log", it also appeared this happened on the Feb
24'th as well, but seemed to recover and continue running.   Strange?

On an separate thought in case it has to do with an out of memory condition, I was using jvisualvm
the other day and monitoring the heap/garbage collection on a test system that tries to mimic
this setup (Oracle M3000).   I did a heap dump an noticed many objects relating to XA transaction
timeout timers.   About 600K of these.   They seemed to be owned by java.util.TimerTask. 
 I remember seeing some Derby issue updated yesterday or the day before about java.util.TimerTask
holding on to some things.   I bring this up because recently we added the property to derby.properties
to cancel XA transactions if they take to long.   We probably do about 6M XA transactions
a day and I am wondering if maybe there is some sort of leak here.   I will try my setup and
look again tomorrow using jvisualvm.

I appreciate your thoughts.

Brett
________________________________________
From: Mike Matrigali [mikem_app@sbcglobal.net]
Sent: Tuesday, February 28, 2012 6:40 PM
To: derby-dev@db.apache.org
Subject: Re: Had Derby 10.8.2.2 fail today and need some thoughts

Don't know by more info might help.  A catch 22 is that usually the top of the derby.log usually
has all the environment info that could be useful.

Can you list OS, JVM.  I/O handling to the log is likely an OS thing.  What properties do
you have set, any special for error logging?

I would likely look first for multiple starts and stops of derby and a setting default for
one of them to truncate the derby.log.
Maybe leading to multiple processes trying to write to same derby.log.

/mikem

Bergquist, Brett wrote:
> Our customer called and said the server was not working.   Before
> restarting, I retrieved the derby.log and it seems strange because 
> this was right at the top of the log:
>
>
>
> ----------------------------------------------------------------
>
> Tue Feb 28 15:33:58 EST 2012: Shutting down Derby engine
>
> ----------------------------------------------------------------
>
> Tue Feb 28 15:37:28 EST 2012 Thread[DRDAConnThread_245,5,main] (XID = 
> 1482003981), (SESSIONID = 133257672), (DATABASE = csemdb), (DRDAID = 
> ????????.??-4471791624540787385{335696}), Cleanup action starting
>
> Tue Feb 28 15:37:28 EST 2012 Thread[DRDAConnThread_245,5,main] (XID = 
> 1482003981), (SESSIONID = 133257672), (DATABASE = csemdb), (DRDAID = 
> ????????.??-4471791624540787385{335696}), Failed Statement is: null
>
> java.lang.NullPointerException
>
> Cleanup action completed
>
>
>
> The strange part is "Shutting down Derby engine" is the first thing in
> the log.   The server was up and running since 2/16/2012 with no
> problems with about 60M transactions processed, a backup performed each
> night, etc.    And then this.
>
>
>
> So any thoughts on how the derby.log could be recreated and have this as
> the first thing in the log?   I am thinking maybe an OutOfMemory
> condition but all traces of whatever when wrong are gone, so I am 
> trying to work backwards and see how the derby.log could be created 
> with this as the first thing in the log.
>
>
>
> Thanks for any help!
>
>
>
> Brett
>
>
>
>
>






Mime
View raw message