db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
Date Fri, 11 Mar 2011 22:23:59 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005876#comment-13005876
] 

Mike Matrigali commented on DERBY-5108:
---------------------------------------

It looks like one of the last things that happens in the istat thread prior to store peforming
it's shutdown and closing all of it's open files is that 
it recovers an open from an interrupt.  I'll include the stacks below that show this.  What
should we do if the istat thread get's an interrupt.  I am leaning
toward it should check for interrupts at critical places and then give up on the work it is
doing.  Maybe there is a place it could check whether it should be
shutting down that needs to be checked more often?  

The store shutdown code only closes files that are not currently "kept" (store's term for
a file that is open by a store client), so not surprising that file is
not closed if it is executing while the istat thread is busy reading from the file).  Here
is the stack of the store closing it's open files:
    at org.apache.derby.impl.store.raw.data.RAFContainer.closeContainer(RAFContainer.java:216)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.closeContainer(RAFContainer4.java:227)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.clearIdentity(FileContainer.java:476)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.removeEntry(ConcurrentCache.java:167)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.ageOut(ConcurrentCache.java:583)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.shutdown(ConcurrentCache.java:598)^M
    at org.apache.derby.impl.store.raw.data.BaseDataFileFactory.stop(BaseDataFileFactory.java:519)^M
    at org.apache.derby.impl.services.monitor.TopService.stop(TopService.java:443)^M
    at org.apache.derby.impl.services.monitor.TopService.shutdown(TopService.java:394)^M
    at org.apache.derby.impl.services.monitor.BaseMonitor.shutdown(BaseMonitor.java:229)^M

Here is some debug showing the stacks where the istat daemon is recovering from an interrupt:
on the file of interest c481.dat - conglom/container  id  1153
Thu Mar 10 16:22:59 PST 2011 Thread[index-stat-thread,5,main] {istat,trace@637806084} worker
thread started (xid=24016) [q/p/s=1/0/1,err:k/u/c=0/0/0,rej:f/d/
/0/0]^M
Thu Mar 10 16:22:59 PST 2011 Thread[index-stat-thread,5,main] {istat,trace@637806084}    
processing "APP"."BIG_TABLE" ^M
DEBUG IndexStatisticsDaemonImpl OUTPUT: opened conglomerateNumber[1] = 1153^M
DEBUG RAFContainer OUTPUT: about to close file C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
Exception trace: ^M
java.lang.Throwable^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.closeContainer(RAFContainer.java:216)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.closeContainer(RAFContainer4.java:227)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.recoverContainerAfterInterrupt(RAFContainer4.java:860)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:368)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:246)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:673)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:193)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2530)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2580)^M
    at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.getRightSibling(ControlRow.java:531)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.positionAtNextPage(BTreeScan.java:493)^M
    at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:464)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(BTreeScan.java:1681)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl$KeyComparator.fetchRows(IndexStatisticsDaemonImpl.java:1174)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(IndexStatisticsDaemonImpl.java:470)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.generateStatistics(IndexStatisticsDaemonImpl.java:327)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.processingLoop(IndexStatisticsDaemonImpl.java:777)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.run(IndexStatisticsDaemonImpl.java:687)^M
    at java.lang.Thread.run(Thread.java:736)^M
DEBUG RAFContainer OUTPUT: closed file C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
DEBUG RAFContainer OUTPUT: opening file: C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown\seg0\c481.dat^M
Exception trace: ^M
java.lang.Throwable^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.run(RAFContainer.java:1587)^M
    at java.security.AccessController.doPrivileged(AccessController.java:251)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.openContainerMinion(RAFContainer.java:939)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer.reopenContainer(RAFContainer.java:914)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.recoverContainerAfterInterrupt(RAFContainer4.java:861)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:368)^M
    at org.apache.derby.impl.store.raw.data.RAFContainer4.readPage(RAFContainer4.java:246)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.readPage(CachedPage.java:673)^M
    at org.apache.derby.impl.store.raw.data.CachedPage.setIdentity(CachedPage.java:193)^M
    at org.apache.derby.impl.services.cache.ConcurrentCache.find(ConcurrentCache.java:295)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getUserPage(FileContainer.java:2530)^M
    at org.apache.derby.impl.store.raw.data.FileContainer.getPage(FileContainer.java:2580)^M
    at org.apache.derby.impl.store.raw.data.BaseContainerHandle.getPage(BaseContainerHandle.java:319)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:833)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.get(ControlRow.java:820)^M
    at org.apache.derby.impl.store.access.btree.ControlRow.getRightSibling(ControlRow.java:531)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.positionAtNextPage(BTreeScan.java:493)^M
    at org.apache.derby.impl.store.access.btree.BTreeForwardScan.fetchRows(BTreeForwardScan.java:464)^M
    at org.apache.derby.impl.store.access.btree.BTreeScan.fetchNextGroup(BTreeScan.java:1681)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl$KeyComparator.fetchRows(IndexStatisticsDaemonImpl.java:1174)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.updateIndexStatsMinion(IndexStatisticsDaemonImpl.java:470)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.generateStatistics(IndexStatisticsDaemonImpl.java:327)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.processingLoop(IndexStatisticsDaemonImpl.java:777)^M
    at org.apache.derby.impl.services.daemon.IndexStatisticsDaemonImpl.run(IndexStatisticsDaemonImpl.java:687)^M
    at java.lang.Thread.run(Thread.java:736)^M
----------------------------------------------------------------^M
----------------------------------------------------------------^M
Thu Mar 10 16:22:59 PST 2011:
Shutting down instance 35924132-012e-a249-3f92-ffffe527380d on database directory C:\derby\s1\systest\out\junit\system\singleUse\copyShutdown
with class loade
sun.misc.Launcher$AppClassLoader@49be49be ^M
shutting down:
isCorrupt = false
pageCache = org.apache.derby.impl.services.cache.ConcurrentCache@1c2a1c2a
containerCache = org.apache.derby.impl.services.cache.ConcurrentCache@20fa20fa^M

> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete
on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently
on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 2000, resulting
in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat
cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt
:
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs
behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0
2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat
1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat
1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat
1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to
be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message