db-derby-dev mailing list archives

From "Anders Morken (JIRA)" <derby-...@db.apache.org>
Subject [jira] Commented: (DERBY-801) Allow parallel access to data files.
Date Tue, 30 May 2006 22:31:31 GMT
    [ http://issues.apache.org/jira/browse/DERBY-801?page=comments#action_12413925 ] 

Anders Morken commented on DERBY-801:

Ooh, very cool to see those numbers. Thanks for testing it despite my
pessimism. I probably haven't been running the benchmark I found
floating around the net on a big enough scale - while I have generated a
16.5 GB database, I haven't been patient enough to run the tests
properly. =)

Off the top of my head, things that need fixing are:

1) The class loading/initialization tricks in BaseDataFileFactory -
there's no need to do all that reflection every time we open a
container. Could probably be done in a static initializer, the boot
method or a constructor.  (Which one is appropriate? The boot method?)

2) A couple of hackish casts in the wrapping methods that retrieve the
FileChannel object when the Container's identity is set. Dunno if these
should be left in or we should change the StorageRandomAccessFile
interface to extend java.io.RandomAccessFile? Both
implementations are extensions of java.io.RandomAccessFile, so the "this
cast works" assumption is pretty safe (as well as defensively
implemented) now, but assumption is the mother of all **ck-ups? =)

3) Handling exceptions from FileChannel properly. The current code
handles IOExceptions by padding the file and trying again. I have no
idea if the pad-the-file trick is of any use at all with FileChannel -
it was simply retained from the original implementation. Maybe padFile
should be refitted for FileChannel as well?

3.5) There's probably a bug in the original implementation of
RAFContainer#writePage(): If the catch(IOException e) {...try again...}
path is executed, updatePageArray() is not called, so modifications such
as adding the container header to the first page will not be done (unless
they were done before the IOException was thrown) - and perhaps a security
issue: the page written will not be encrypted. The fact that this hasn't
been discovered by encryption tests is probably an indicator that this
codepath doesn't succeed where the first attempt failed very often.
Anyway, I'll make a separate Jira issue for this.

4) Skip more synchronization? Low priority, but I think there are a few
cases where one or more synchronizations could be merged into one block
or removed altogether - but thread safety is a delicate matter.

5) And last but not least, anything else code review turns up, of
course. =)
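
To illustrate item 1, here is a minimal sketch of doing the class
lookup once instead of on every container open. The class and method
names (ContainerChooser, containerKind) and the "nio"/"raf" labels are
made up for illustration and are not what BaseDataFileFactory actually
uses; whether a static initializer, the boot method or a constructor is
the right place in Derby is still the open question above.

```java
// Hypothetical sketch: probe for the NIO classes once, at class
// initialization, and reuse the cached answer afterwards.
final class ContainerChooser {

    // Evaluated exactly once by the JVM when this class is initialized,
    // not each time a container is opened.
    private static final boolean NIO_AVAILABLE =
            probe("java.nio.channels.FileChannel");

    private ContainerChooser() {
    }

    // Returns true if the named class can be loaded on this JVM.
    private static boolean probe(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class missing or unusable: fall back to the old code path.
            return false;
        }
    }

    // Hypothetical hook: which container implementation to create.
    static String containerKind() {
        return NIO_AVAILABLE ? "nio" : "raf";
    }
}
```

Callers would simply ask ContainerChooser.containerKind() whenever a
container is opened; the reflection cost is paid only on first use.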

I'll see if I have some time to work on this later this week. Thanks for
the help, Øystein. =)

> Allow parallel access to data files.
> ------------------------------------
>          Key: DERBY-801
>          URL: http://issues.apache.org/jira/browse/DERBY-801
>      Project: Derby
>         Type: Improvement

>   Components: Performance, Store
>     Versions:,,,,,,
>  Environment: Any
>     Reporter: Øystein Grøvlen
>  Attachments: NIO-RAFContainer-v1.patch
> Derby currently serializes accesses to a data file.  For example, the
> implementation of RAFContainer.readPage is as follows:
>     synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
>         fileData.seek(pageOffset);  // fileData is a RandomAccessFile
>         fileData.readFully(pageData, 0, pageSize);
>     }
> I have experimented with a patch where I have introduced several file
> descriptors (RandomAccessFile objects) per RAFContainer.  These are
> used for reading.  The principle is that when all readers are busy, a
> readPage request will create a new reader.  (There is a maximum number
> of readers.)  With this patch, throughput was improved by 50% on
> Linux.  For more discussion on this, see
> http://www.nabble.com/Derby-I-O-issues-during-checkpointing-t473523.html
> The challenge with the suggested approach is to make a mechanism to
> limit the number of open file descriptors.  Mike Matrigali has
> suggested to use the existing CacheManager infrastructure for this
> purpose.  For a discussion on that, see:
> http://www.nabble.com/new-uses-for-basic-services-cache---looking-for-advice-t756863.html
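
For contrast with the synchronized seek/readFully snippet quoted above,
here is a rough sketch of a positional read through a FileChannel. It is
not taken from NIO-RAFContainer-v1.patch; the class name
PositionalPageReader and its readPage signature are assumptions made for
illustration. The point is that FileChannel.read(ByteBuffer, long) takes
the file offset as an argument and leaves the channel's own position
alone, so readers do not need to serialize on a shared file pointer.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not Derby code.
final class PositionalPageReader {

    private PositionalPageReader() {
    }

    // Fills pageData completely, starting at pageOffset in the file.
    static void readPage(FileChannel channel, long pageOffset, byte[] pageData)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(pageData);
        long position = pageOffset;
        // A single read may return fewer bytes than requested, so loop
        // until the buffer is full (the equivalent of readFully).
        while (buffer.hasRemaining()) {
            int bytesRead = channel.read(buffer, position);
            if (bytesRead < 0) {
                throw new EOFException(
                        "Reached end of file at offset " + position);
            }
            position += bytesRead;
        }
    }
}
```

A channel for an existing RandomAccessFile can be obtained with
getChannel(). How IOExceptions from this path should be handled (item 3
in the comment above) is deliberately left out of the sketch.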
