db-derby-dev mailing list archives

From "Bergquist, Brett" <BBergqu...@canoga.com>
Subject RE: [jira] [Commented] (DERBY-6510) Deby engine threads not making progress
Date Fri, 14 Mar 2014 20:20:53 GMT
Mike, I am going to spend some time investigating the code.  With the help of you developers,
I think I have satisfied myself that there is some condition that causes looping or something
similar, and I will investigate the code in this area.

I very much appreciate the help that you have given.


-----Original Message-----
From: Mike Matrigali (JIRA) [mailto:jira@apache.org] 
Sent: Friday, March 14, 2014 4:05 PM
To: derby-dev@db.apache.org
Subject: [jira] [Commented] (DERBY-6510) Deby engine threads not making progress

    [ https://issues.apache.org/jira/browse/DERBY-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935554#comment-13935554 ]

Mike Matrigali commented on DERBY-6510:

I think you have a reasonable theory based on the evidence so far.  I think I might look at the
code with an eye to whether there are any shared data structures that might be changed during
the optimization process by multiple threads because of some bug in coordinating access.

You mentioned multiple threads waiting on the same query.  From prstat it looks like there are
at least 46 CPUs on that machine, so it is likely very concurrent if they all happen to find
the plan not in the cache.  Though it would be easier to explain if somehow 2 or more threads
"got through" and were working at the same time, somehow messing each other up.

I have not worked much on the optimizer code, but I would start by looking for whatever data
structure is at the top of the loop that stores the info about progress through all the
possibilities, cost so far, ...
And then maybe look at whether it could be accessed by more than one thread if some logic were
wrong.

There is some info on the wiki; here is one pointer:

I also sometimes just look up old fixed JIRAs by "A B"; army was great at documenting existing
behavior while making changes to the optimizer.

> Deby engine threads not making progress
> ---------------------------------------
>                 Key: DERBY-6510
>                 URL: https://issues.apache.org/jira/browse/DERBY-6510
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions:
>         Environment: Oracle Solaris 10/9, Oracle M5000 32 CPU, 128GB memory, 8GB allocated to Derby Network Server
>            Reporter: Brett Bergquist
>            Priority: Critical
>         Attachments: dbstate.log, derbystacktrace.txt, prstat.log, 
> prstat_normal.log, queryplan.txt, queryplan_nooptimizerTimeout.txt
> We had an issue today in a production environment at a large customer site.  Basically,
> 5 database interactions became stuck and are not progressing.  Part of the system dump
> performs a stack trace every few seconds for a period of a minute on the Glassfish
> application server and the Derby database engine (running in network server mode).  Also,
> the dump captures the current transactions and the current lock table (i.e.
> syscs_diag.transactions and syscs_diag.lock_table).  We had to restart the system, and in
> doing so, the Derby database engine would not shut down and had to be killed.
> The stack traces of the Derby engine show 5 threads that are basically making no progress
> in that at each sample, they are at the same point, waiting.
> I will attach the stack traces as well as the state of the transactions and locks.  

> Interestingly, "derby.jdbc.xaTransactionTimeout =1800" is set, yet the transactions
> did not time out.  The timeout is 30 minutes, but the transactions were in progress for hours.
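
The sampling approach described in the issue (take a stack trace every few seconds and look
for threads that sit at the same point in every sample) can be sketched as below.  This is
only an illustration, not the dump tool used at the site: the class and method names are
hypothetical, and it samples the JVM it runs in, whereas the dump in the issue was taken from
outside the Glassfish and Derby processes.  Threads that are simply idle (for example, pool
threads waiting for work) will also show identical stacks, so the output still has to be read
against what each thread should have been doing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Derby: flags threads whose stack never changes.
final class StuckThreadSampler {

    // Capture one sample of the current JVM: thread label -> full stack as text.
    static Map<String, String> sample() {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            StringBuilder sb = new StringBuilder();
            for (StackTraceElement frame : e.getValue()) {
                sb.append(frame).append('\n');
            }
            Thread t = e.getKey();
            out.put(t.getName() + "#" + t.getId(), sb.toString());
        }
        return out;
    }

    // Return the threads present in every sample with an identical, non-empty stack.
    static Set<String> unchangedAcross(List<Map<String, String>> samples) {
        Set<String> result = new TreeSet<>();
        if (samples.isEmpty()) {
            return result;
        }
        for (Map.Entry<String, String> e : samples.get(0).entrySet()) {
            String stack = e.getValue();
            if (stack.isEmpty()) {
                continue; // no frames recorded for this thread, nothing to compare
            }
            boolean same = true;
            for (Map<String, String> s : samples) {
                if (!stack.equals(s.get(e.getKey()))) {
                    same = false;
                    break;
                }
            }
            if (same) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        int count = 5;               // assumed number of samples
        long intervalMillis = 2000;  // assumed spacing, "every few seconds"
        List<Map<String, String>> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            samples.add(sample());
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        for (String name : unchangedAcross(samples)) {
            System.out.println("same stack in every sample: " + name);
        }
    }
}
```

Comparing the whole stack rather than only the top frame is a deliberate choice here: a thread
looping inside the optimizer could return to the same top frame repeatedly while its deeper
frames change, and that would be a different symptom from a thread that is truly parked.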
