db-derby-dev mailing list archives

From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4741) Make Derby work reliably in the presence of thread interrupts
Date Thu, 11 Nov 2010 23:22:31 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931263#action_12931263
] 

Dag H. Wanvik commented on DERBY-4741:
--------------------------------------

Thanks for the review, Knut. I am uploading a new version of the patch,
derby-4741-b-02-nio (details below), and rerunning regressions.

- In RAFContainer4, folded the cases for pageNumber == -1 (call from
  getEmbryonicPage) with the case Thread.holdsLock(this). This allows
  awaitRestoreChannel to throw the retry exception for all cases. I
  did both static and dynamic analysis to establish the invariant, and
  added a sane check for it.

> As far as I can see, the code will behave the same way as before if no
> interrupt is detected. So if there are problems with the code, they
> will hopefully be limited to the case where the thread is
> interrupted. There's an additional container-wide synchronization on
> every page read/write, though, which may possibly have a negative
> effect on performance in multi-threaded environments. If that turns
> out to be an issue, would it be possible to change the code to only
> check restoreChannelInProgress after an exception has been thrown,
> similar to what we currently do in stealth mode?

Yes, I believe we could do that, since it should do no harm to attempt
the IO even when recovery is in progress, because all calls to
getChannel are synchronized on "this" already, so either the thread
sees the old channel (closed), or the new reopened channel. I didn't
make this change yet, though, since the monitor hold should be
short-lived (one boolean check and an integer increment) compared to
the IO.
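
To illustrate the "check only after an exception" idea, here is a
minimal sketch. It is not the RAFContainer4 code: the class
RetryingPageReader and its methods are hypothetical, and it assumes
that simply reopening the file is sufficient recovery. The read is
attempted without first taking the monitor for a state check; only
when the channel turns out to be closed does the thread reopen it
(under the monitor) and retry, restoring the interrupt flag at the end.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not Derby's RAFContainer4.
final class RetryingPageReader {
    private final Path file;
    private FileChannel channel; // guarded by "this"

    RetryingPageReader(Path file) {
        this.file = file;
        try {
            this.channel = FileChannel.open(file, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Synchronized, so a caller sees either the old (closed) channel
    // or the reopened one, never a half-updated reference.
    private synchronized FileChannel getChannel() {
        return channel;
    }

    private synchronized void reopenIfClosed() throws IOException {
        if (!channel.isOpen()) {
            channel = FileChannel.open(file, StandardOpenOption.READ);
        }
    }

    byte[] readPage(long pageNumber, int pageSize) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return readFully(getChannel(), pageNumber * pageSize, pageSize);
                } catch (ClosedChannelException e) {
                    // Covers ClosedByInterruptException (this thread was
                    // interrupted) and closes caused by other threads.
                    if (Thread.interrupted()) { // clears the flag so the retry can proceed
                        interrupted = true;
                    }
                    reopenIfClosed();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // pass the interrupt on
            }
        }
    }

    private static byte[] readFully(FileChannel ch, long offset, int length)
            throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            if (ch.read(buf, offset + buf.position()) < 0) {
                throw new EOFException("page extends past end of file");
            }
        }
        return buf.array();
    }

    // Small self-check: read page 1 of a two-page file while interrupted.
    static boolean demo() {
        try {
            Path file = Files.createTempFile("retry-demo", ".dat");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
                RetryingPageReader reader = new RetryingPageReader(file);
                Thread.currentThread().interrupt();
                byte[] page = reader.readPage(1, 4);
                boolean flagKept = Thread.interrupted(); // also clears the flag
                reader.getChannel().close();
                return flagKept && page[0] == 5 && page[3] == 8;
            } finally {
                Thread.interrupted(); // make sure no interrupt leaks out
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the monitor is taken only inside getChannel and
reopenIfClosed, so the uninterrupted path pays for one short
synchronized call per read and nothing more.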

- Thread.holdsLock(): done
- Created a new subclass of StandardException:
  InterruptDetectedException, good suggestion, indeed cleaner!
- Javadoc for RAFContainer4.readPage: done
- The variable "whence": removed
- Removed some commented out debugging cruft
- Tuned the number of iterations in InterruptDetectedException to make
  sure we see a concurrent thread (RawStoreDaemon) having to wait
  for cleanup before proceeding, at least on my box. Cf. the debug
  trace for derby.debug.true=RAF4Recovery, which I also added.


> Make Derby work reliably in the presence of thread interrupts
> -------------------------------------------------------------
>
>                 Key: DERBY-4741
>                 URL: https://issues.apache.org/jira/browse/DERBY-4741
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.2.1.6, 10.2.2.0, 10.3.1.4, 10.3.2.1, 10.3.3.0, 10.4.1.3, 10.4.2.0,
>                      10.5.1.1, 10.5.2.0, 10.5.3.0, 10.6.1.0
>            Reporter: Dag H. Wanvik
>            Assignee: Dag H. Wanvik
>         Attachments: derby-4741-a-01-api-interruptstatus.diff, derby-4741-a-01-api-interruptstatus.stat,
>                      derby-4741-a-02-api-interruptstatus.diff, derby-4741-a-02-api-interruptstatus.stat,
>                      derby-4741-a-03-api-interruptstatus.diff, derby-4741-a-03-api-interruptstatus.stat,
>                      derby-4741-a-04-api-interruptstatus.diff, derby-4741-a-04-api-interruptstatus.stat,
>                      derby-4741-all+lenient+resurrect.diff, derby-4741-all+lenient+resurrect.stat,
>                      derby-4741-b-01-nio.diff, derby-4741-b-01-nio.stat,
>                      derby-4741-nio-container+log+waits+locks+throws.diff, derby-4741-nio-container+log+waits+locks+throws.stat,
>                      derby-4741-nio-container+log+waits+locks-2.diff, derby-4741-nio-container+log+waits+locks-2.stat,
>                      derby-4741-nio-container+log+waits+locks.diff, derby-4741-nio-container+log+waits+locks.stat,
>                      derby-4741-nio-container+log+waits.diff, derby-4741-nio-container+log+waits.stat,
>                      derby-4741-nio-container+log.diff, derby-4741-nio-container+log.stat,
>                      derby-4741-nio-container-2.diff, derby-4741-nio-container-2.log, derby-4741-nio-container-2.stat,
>                      derby-4741-nio-container-2b.diff, derby-4741-nio-container-2b.stat,
>                      derby.log, derby.log, InterruptResilienceTest.java, MicroAPITest.java, xsbt0.log.gz
>
>
> When not executing on a small device VM, Derby has been using the Java NIO classes
> (java.nio.channels.*) for file IO.
> If a thread is interrupted while executing a blocking IO operation in NIO, a
> ClosedByInterruptException is thrown. Unfortunately, Derby isn't currently architected to
> retry and complete such operations (before passing on the interrupt), so the Derby database
> can be left in an inconsistent state and we therefore have to return a database level error.
> This means the application can no longer access the database without a shutdown and reboot,
> including a recovery.
> It would be nice if Derby could somehow detect and finish IO operations underway when
> thread interrupts happen, before passing the exception on to the application. Embedded Derby
> is sometimes used in applications that use Thread.interrupt to stop threads.
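
The NIO behavior described in the issue can be reproduced outside
Derby. The following is a small standalone sketch (the class name
InterruptClosesChannelDemo is made up for illustration): a FileChannel
read issued by a thread whose interrupt flag is already set fails with
ClosedByInterruptException, and the channel is left closed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Standalone illustration; not part of Derby.
final class InterruptClosesChannelDemo {

    /** Returns true if a read on an interrupted thread closed the channel. */
    static boolean interruptClosesChannel() {
        try {
            Path file = Files.createTempFile("derby4741-demo", ".dat");
            try (FileChannel channel = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3, 4}), 0);

                // Simulate an application calling Thread.interrupt() on
                // the thread that is about to do file IO.
                Thread.currentThread().interrupt();
                try {
                    channel.read(ByteBuffer.allocate(4), 0);
                    return false; // the read was not interrupted
                } catch (ClosedByInterruptException e) {
                    // The channel is now closed for every thread using it.
                    return !channel.isOpen();
                } finally {
                    Thread.interrupted(); // clear the flag for the caller
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the channel is closed as a side effect, other threads sharing
it see failures too, which is why recovery has to reopen the file
rather than just retry the call.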

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

