db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Katherine Marsden <kmarsdende...@sbcglobal.net>
Subject Re: Could use some thoughts on a problem with a query in production
Date Mon, 05 Jun 2017 19:39:22 GMT
Hi Brett,

I just happened to see your post on my phone and what came to mind was I wonder how the automatic
statistics update is implemented and if that might somehow be at play here.  I may be totally
off base but remember at one point this was implemented and there was discussion of how and
when to do it and concern that at some point, some query was going to get hit with the cost
(for the team so to speak) . It might be worthwhile looking at what triggers the statistics
update and if it might be at play here.

Thanks

Katherine

Sent from my iPhone

> On Jun 5, 2017, at 12:04 PM, Bergquist, Brett <BBergquist@canoga.com> wrote:
> 
> Background:
>  
> ·         A large database approximately 750G
> ·         derby.storage.pageCacheSize=64000
> ·         Inserts going into the database about 125/second
> ·         Other database updates and deletes are being performed at a lower rate.
>  
> A query is run by the customer that does not complete in a reasonable amount of time.
  Unfortunately the customer’s program retries this query up to 40 times compounding the
problem. 
>  
> The query retrieves a set of records for a period of time.  The query is constructed
using the indexes of the table such that the rows should be in sorted order and an external
sort not required.   Basically the query fires off periodically and requests the records for
a specific hour of the day.   The next time it fires off and retrieves the next hour, etc.
>  
> The query has run successfully with other parameters repeatedly through out the week
with no issues.  Only one set of query parameters caused the issue.   The same problem occurred
exactly one week later again for the same time period of the day.
>  
> I was able to retrieve the database locally from the customer and I configured my server
to time and time zone and ran the query with no issue.   So it does not appear to be an optimizer
problem with it choosing the wrong path.
>  
> There is a script that is run at the customer’s site to retrieve some data that I use
for debugging these types of problems as access to the customer site is not possible due to
security reasons.   One of the things that is captures is stack traces every 5 seconds for
1 minute.
>  
> The stack traces show no deadlocks and also seem to show that the query path is not doing
a large external sort or anything like that.   Many of the threads are blocked however and
they seem to be blocked at a low level.    For example many threads look like this:
>  
> "DRDAConnThread_98" #123 prio=5 os_prio=64 tid=0x00000001027b3800 nid=0x94 waiting for
monitor entry [0xfffffffd37afe000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.pageValid(Unknown
Source)
>                 - waiting to lock <0xfffffffd70d5d598> (a org.apache.derby.impl.store.raw.data.AllocationCache)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(Unknown
Source)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.getPage(Unknown
Source)
>                 at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(Unknown
Source)
>                 at org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.latchPage(Unknown
Source)
>                 at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.fetch(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.getRowFromResultSet(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.getNextRowFromRS(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.loadSorter(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.openCore(Unknown Source)
>                 at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown
Source)
>                 at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown
Source)
>                 at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown
Source)
>                 - locked <0xfffffffd95a59900> (a org.apache.derby.impl.jdbc.EmbedConnection40)
>                 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
Source)
>                 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.iapi.jdbc.BrokeredPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.impl.drda.DRDAStatement.execute(Unknown Source)
>                 at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown
Source)
>                 at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)
>  
> and they are waiting on a thread that looks like this:
>  
> "DRDAConnThread_61" #86 prio=5 os_prio=64 tid=0x0000000102d4c000 nid=0x6f waiting for
monitor entry [0xfffffffd3c7fe000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.pageValid(Unknown
Source)
>                 - locked <0xfffffffd70d5d598> (a org.apache.derby.impl.store.raw.data.AllocationCache)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(Unknown
Source)
>                 at org.apache.derby.impl.store.raw.data.FileContainer.getPage(Unknown
Source)
>                 at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(Unknown
Source)
>                 at org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.latchPage(Unknown
Source)
>                 at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.fetch(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.IndexRowToBaseRowResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.NestedLoopJoinResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.getRowFromResultSet(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.getNextRowFromRS(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.loadSorter(Unknown
Source)
>                 at org.apache.derby.impl.sql.execute.SortResultSet.openCore(Unknown Source)
>                 at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.open(Unknown
Source)
>                 at org.apache.derby.impl.sql.GenericPreparedStatement.executeStmt(Unknown
Source)
>                 at org.apache.derby.impl.sql.GenericPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown
Source)
>                 - locked <0xfffffffd90645fe8> (a org.apache.derby.impl.jdbc.EmbedConnection40)
>                 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
Source)
>                 at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.iapi.jdbc.BrokeredPreparedStatement.execute(Unknown
Source)
>                 at org.apache.derby.impl.drda.DRDAStatement.execute(Unknown Source)
>                 at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown
Source)
>                 at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)
>  
> In another stack trace capture  I see a thread as well:
>  
> "derby.rawStoreDaemon" #25 daemon prio=5 os_prio=64 tid=0x0000000100b57800 nid=0x32 runnable
[0xfffffffd551fe000]
>    java.lang.Thread.State: RUNNABLE
>                 at java.io.FileDescriptor.sync(Native Method)
>                 at org.apache.derby.impl.io.DirRandomAccessFile.sync(Unknown Source)
>                 at org.apache.derby.impl.store.raw.data.RAFContainer.writeRAFHeader(Unknown
Source)
>                 at org.apache.derby.impl.store.raw.data.RAFContainer.clean(Unknown Source)
>                 - locked <0xffffffff56bf1a18> (a org.apache.derby.impl.store.raw.data.RAFContainer4)
>                 at org.apache.derby.impl.services.cache.ConcurrentCache.cleanAndUnkeepEntry(Unknown
Source)
>                 at org.apache.derby.impl.services.cache.ConcurrentCache.cleanEntry(Unknown
Source)
>                 at org.apache.derby.impl.services.cache.BackgroundCleaner.performWork(Unknown
Source)
>                 at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(Unknown
Source)
>                 at org.apache.derby.impl.services.daemon.BasicDaemon.work(Unknown Source)
>                 at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
>                 at java.lang.Thread.run(Thread.java:745)
>  
> That one caught my attention because of the cache that appeared being “cleaned”.
 
>  
> I don’t see any errors in the derby log nor the application log.   The CPU utilization
of the system goes way up when this is occurring however (around 45% utilization by Derby).
  So Derby is trying to do something but I cannot tell what.   While the utilization is up,
other queries but with different time periods are completing and the insertions and other
database updates are also working.
>  
> Every one of the 40 threads relating to the queries that the customer retried seem to
be in this same call stack pattern.   It did appear that after hours of execution, some of
these threads did progress to completion, however the system had to be restarted to completely
clear the system. 
>  
> One week later after the restart and about 100 such queries and the same issue occurred.
  Again the same query for other time periods worked fine even during the issue.
>  
> So any ideas on where to look will be much appreciated.  
> 
> Canoga Perkins
> 20600 Prairie Street
> Chatsworth, CA 91311
> (818) 718-6300
> 
> This e-mail and any attached document(s) is confidential and is intended only for the
review of the party to whom it is addressed. If you have received this transmission in error,
please notify the sender immediately and discard the original message and any attachment(s).

Mime
View raw message