db-derby-dev mailing list archives

From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
Date Sat, 12 Mar 2011 00:14:59 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005899#comment-13005899 ]

Mike Matrigali edited comment on DERBY-5108 at 3/12/11 12:14 AM:
-----------------------------------------------------------------

The more I look at this issue, the more I think the problem is that the istat daemon should
shut down, and not return until it has completed that shutdown, when indexRefresher.stop()
is called from the DataDictionary's stop() call.  For a clean shutdown of the system, the
store needs all its clients shut down first.  Only then can it shut down cleanly and force
the database files and transaction logs, ensuring a clean shutdown with no recovery work
necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I don't think
can be solved.  We might fix the specific one exposed by this test, but the system is simply
not designed to handle a clean shutdown while work is still running.  It has to wait for
that work to stop first.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking when writing
> the code. Seems there are several error-handling issues to iron out though...
> A few specific comments:
> o I decided to not make Derby wait for the background thread to finish on shutdown, as
>   it might potentially be scanning a very large table.
> o Logging is rather verbose now during testing, but I agree it should be less verbose
>   (or maybe turned off completely) when released.
> o I'm logging a lot of exceptions to aid testing/debugging. These should also go away,
>   or be enabled by a property if the user wishes to do so.

I now think that it was wrong not to wait for the background thread.  Waiting would match
the behavior of the rawStoreDaemon thread, which is "owned" by the raw store module: the
module stops the daemon, the daemon waits for its work to stop or complete before returning
from the stop, and then the raw store continues with its data and transaction file cleanup
prior to stopping.  I agree it would be a nice optimization to somehow stop the background
thread in the middle of a big scan, and it seems that with the better interrupt support this
should be much easier than was the case before 10.8.  I would like some feedback from those
more knowledgeable about the istat work before proceeding.
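
To make the stop-and-wait behavior concrete, here is a minimal sketch in plain Java.  It is
not Derby code: the class StoppableWorker and all of its members are hypothetical stand-ins,
and the "scan" is simulated with a sleep.  It only illustrates the pattern discussed above,
where stop() interrupts the worker and does not return until the worker has exited and run
its cleanup.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a background worker; not the Derby istat daemon. */
class StoppableWorker implements Runnable {
    private final Thread thread = new Thread(this, "worker-sketch");
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private volatile boolean cleanedUp = false;

    void start() {
        thread.start();
    }

    @Override
    public void run() {
        try {
            // Simulated long scan: re-check the stop flag between units of work.
            while (!stopRequested.get()) {
                Thread.sleep(50); // stands in for one unit of scanning; interruptible
            }
        } catch (InterruptedException e) {
            // Interrupted in the middle of the "scan": fall through to cleanup.
        } finally {
            cleanedUp = true; // stands in for closing any files the worker holds
        }
    }

    /** Requests a stop, interrupts the worker, and blocks until it has exited. */
    void stop() {
        stopRequested.set(true);
        thread.interrupt();
        try {
            thread.join(); // the caller does not proceed until the worker is gone
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }

    boolean isAlive() {
        return thread.isAlive();
    }

    boolean hasCleanedUp() {
        return cleanedUp;
    }
}
```

With this shape, a caller that invokes stop() before closing its own files knows the worker
no longer holds anything open.  The interrupt is what keeps the wait short even if the
worker is in the middle of a long operation, assuming that operation is interruptible.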

I do think that the work Rick did for DERBY-5037 is still valuable, as it will handle the
non-clean shutdowns that Derby can experience much better.  For a non-clean shutdown we
might have to just live with a file left open until the thread or JVM exits.  But for a
requested orderly shutdown of the system, I think we should go with the top-down shutdown
supported by the architecture rather than try to fix errors encountered when top-level
modules are still running while lower-level modules are trying to shut down.

      was (Author: mikem):
    The more I look at this issue, the more I think the problem is that the istat daemon should
shut down, and not return until it has completed that shutdown, when indexRefresher.stop()
is called from the DataDictionary's stop() call.  For a clean shutdown of the system, the
store needs all its clients shut down first.  Only then can it shut down cleanly and force
the database files and transaction logs, ensuring a clean shutdown with no recovery work
necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I don't think
can be solved.  We might fix the specific one exposed by this test, but the system is simply
not designed to handle a clean shutdown while work is still running.  It has to wait for
that work to stop first.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking when writing
> the code. Seems there are several error-handling issues to iron out though...
> A few specific comments:
> o I decided to not make Derby wait for the background thread to finish on shutdown, as
>   it might potentially be scanning a very large table.
> o Logging is rather verbose now during testing, but I agree it should be less verbose
>   (or maybe turned off completely) when released.
> o I'm logging a lot of exceptions to aid testing/debugging. These should also go away,
>   or be enabled by a property if the user wishes to do so.

I now think that it was wrong not to wait for the background thread.  I agree it would be
a nice optimization to somehow stop the background thread in the middle of a big scan, and
it seems that with the better interrupt support this should be much easier than was the case
before 10.8.  I would like some feedback from those more knowledgeable about the istat work
before proceeding.

I do think that the work Rick did for DERBY-5037 is still valuable, as it will handle the
non-clean shutdowns that Derby can experience much better.  For a non-clean shutdown we
might have to just live with a file left open until the thread or JVM exits.  But for a
requested orderly shutdown of the system, I think we should go with the top-down shutdown
supported by the architecture rather than try to fix errors encountered when top-level
modules are still running while lower-level modules are trying to shut down.
  
> Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete fails intermittently
> on Windows platforms because the test is unable to delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt - 1) * 2000, resulting
> in a total sleep time of 12 seconds), the conglomerate system\singleUse\copyShutdown\seg0\c481.dat
> cannot be deleted.
> For instance from http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt:
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes a file to
> be reopened when it shouldn't have been (i.e. after the database shutdown has been initiated).
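
The retry arithmetic quoted in the description can be checked with a small sketch.  It
assumes that the sleep of (attempt - 1) * 2000 ms happens before each attempt, that attempts
are numbered from 1, and that there are four attempts, as in the log excerpt.  The class and
method names are hypothetical and are not taken from the Derby test harness.

```java
/** Hypothetical helper illustrating the retry sleep formula; not Derby test code. */
class RetrySleepSketch {
    /** Sleep, in milliseconds, before the given 1-based attempt: (attempt - 1) * 2000. */
    static long sleepBeforeAttemptMillis(int attempt) {
        return (attempt - 1) * 2000L;
    }

    /** Total sleep, in milliseconds, accumulated over attempts 1..attempts. */
    static long totalSleepMillis(int attempts) {
        long total = 0;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            total += sleepBeforeAttemptMillis(attempt);
        }
        return total;
    }

    public static void main(String[] args) {
        // Four attempts: 0 + 2000 + 4000 + 6000 ms.
        System.out.println(totalSleepMillis(4) + " ms");
    }
}
```

Under those assumptions, four attempts sleep for 0 + 2000 + 4000 + 6000 = 12000 ms, which
agrees with the 12 seconds mentioned in the description.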

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
