db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: [jira] Created: (DERBY-801) Allow parallel access to data files.
Date Tue, 30 May 2006 22:47:41 GMT
I have not reviewed the code, but your description of the approach and
the results posted later indicate this is a good path for this problem.
I would not worry about pre-1.4 JVMs not getting the benefit.  As long
as the system still works, the separate module approach is consistent
with other parts of the system.

In addition to showing that the fix helps the dual I/O case, it would
be nice to compare the pre- and post-patch performance of single user I/O,
just to ensure the new interfaces are not significantly slower than the old.
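
A rough sketch of such a single user comparison follows. It is not Derby
code: the class name, the helper names and the 4096-byte page size are
all made up for illustration, and it is written for a current JDK rather
than 1.4. It reads the same pages once with seek()/readFully() under a
lock and once with a positional FileChannel read, so the two paths can
be timed in one thread. The file will mostly come from the OS cache, so
this mainly shows call overhead, not disk behaviour.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class SingleUserReadComparison {
    // Hypothetical page size, chosen only for this sketch.
    public static final int PAGE_SIZE = 4096;

    // Creates a temp file of 'pages' pages; byte 0 of page i is (i % 100).
    public static File createTestFile(int pages) throws IOException {
        File f = File.createTempFile("derby801", ".dat");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            byte[] page = new byte[PAGE_SIZE];
            for (int i = 0; i < pages; i++) {
                page[0] = (byte) (i % 100);
                raf.write(page);
            }
        }
        return f;
    }

    // Old style: lock, seek, readFully. Returns the sum of each page's first byte.
    public static long readWithSeek(RandomAccessFile raf, int pages) throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        long sum = 0;
        for (int i = 0; i < pages; i++) {
            synchronized (raf) {
                raf.seek((long) i * PAGE_SIZE);
                raf.readFully(page, 0, PAGE_SIZE);
            }
            sum += page[0];
        }
        return sum;
    }

    // New style: positional read, no lock and no shared file position.
    public static long readWithChannel(FileChannel ch, int pages) throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        long sum = 0;
        for (int i = 0; i < pages; i++) {
            ByteBuffer buf = ByteBuffer.wrap(page);
            long offset = (long) i * PAGE_SIZE;
            while (buf.hasRemaining()) {
                if (ch.read(buf, offset + buf.position()) < 0) {
                    throw new IOException("unexpected end of file");
                }
            }
            sum += page[0];
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        int pages = 1024;
        File f = createTestFile(pages);
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            long t0 = System.nanoTime();
            long a = readWithSeek(raf, pages);
            long t1 = System.nanoTime();
            long b = readWithChannel(raf.getChannel(), pages);
            long t2 = System.nanoTime();
            System.out.println("seek+readFully: " + (t1 - t0) + " ns, checksum " + a);
            System.out.println("FileChannel:    " + (t2 - t1) + " ns, checksum " + b);
        }
    }
}
```

A real comparison would need warm-up runs and many repetitions before
the numbers mean anything.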

Anders Morken wrote:
> Øystein Grøvlen (JIRA):
>>Allow parallel access to data files.
>>         Key: DERBY-801
>>         URL: http://issues.apache.org/jira/browse/DERBY-801
>>     Project: Derby
>>        Type: Improvement
>>  Components: Performance, Store  
>>    Versions:,,,,,,
>> Environment: Any
>>    Reporter: Øystein Grøvlen
>>Derby currently serializes accesses to a data file.  For example, the
>>implementation of RAFContainer.readPage is as follows:
>>    synchronized (this) {  // 'this' is a FileContainer, i.e. a file object
>>        fileData.seek(pageOffset);  // fileData is a RandomAccessFile
>>        fileData.readFully(pageData, 0, pageSize);
>>    }
>>I have experimented with a patch where I have introduced several file
>>descriptors (RandomAccessFile objects) per RAFContainer.  These are
>>used for reading.  The principle is that when all readers are busy, a
>>readPage request will create a new reader.  (There is a maximum number
>>of readers.)  With this patch, throughput was improved by 50% on
>>linux.  For more discussion on this, see
>>The challenge with the suggested approach is to make a mechanism to
>>limit the number of open file descriptors.  Mike Matrigali has
>>suggested using the existing CacheManager infrastructure for this
>>purpose.  For a discussion on that, see:
> I've played around a bit with a different approach - using the
> FileChannel class from Java 1.4's new IO API. I've written a class
> RAFContainer4 which extends RAFContainer and overrides the readPage and
> writePage methods of that class to use read/write(ByteBuffer buf, long
> position) in FileChannel to access the container's file, without
> synchronizing on the FileContainer during the read and write calls.
> With a bit of hackery in BaseDataFileFactory#newContainerObject() this
> class is then used instead of the regular RAFContainer on creation of
> new RAFContainer objects when Derby runs in a 1.4+ JVM.
> This approach gives the JVM and OS the opportunity to issue multiple
> file operations concurrently, although we have no guarantees that this
> will actually happen. This is JVM/OS dependent, but stracing the Sun
> 1.4.2_09 VM on Linux 2.6 shows that the VM now uses pread64()/pwrite64()
> system calls instead of seek(), read() and write(). pread and pwrite
> have similar semantics to the FileChannel#read/write(ByteBuffer buf,
> long position) methods, and do not alter the file's seek() position, and
> are supposed to be thread safe.
> Of course only people running Derby on 1.4+ JVMs will have the
> opportunity to benefit from this approach. As support for 1.3 is to be
> deprecated this might not be much of an issue?
> But anyway, I would like to see if this hack of mine actually works. I
> see mentions of a "TPC-B like benchmark" in the threads Øystein links to
> above, and wonder if that is something Sun internal, or if it's a
> publicly available benchmark implementation that I can get my grubby
> little paws on and try out this patch with? =)
> Thanks,
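
For readers who have not used the positional FileChannel calls Anders
describes, here is a minimal sketch of the idea. It is not the
RAFContainer4 patch: the class PositionalPageIO and its methods are
hypothetical names invented for this example, and it assumes fixed-size
pages laid out back to back in the file. The point is only that
read(ByteBuffer, long) and write(ByteBuffer, long) take an explicit
offset, so no seek() and no lock around a shared file position is
needed.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PositionalPageIO {
    private final FileChannel channel;
    private final int pageSize;

    public PositionalPageIO(RandomAccessFile file, int pageSize) {
        this.channel = file.getChannel();
        this.pageSize = pageSize;
    }

    // Fills pageData with the given page; the channel's own position is not changed.
    public void readPage(long pageNumber, byte[] pageData) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(pageData, 0, pageSize);
        long offset = pageNumber * pageSize;
        // A single read may return fewer bytes than requested, so loop.
        while (buf.hasRemaining()) {
            int n = channel.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("page " + pageNumber + " is past end of file");
            }
        }
    }

    // Writes pageData to the given page slot, again without moving the channel position.
    public void writePage(long pageNumber, byte[] pageData) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(pageData, 0, pageSize);
        long offset = pageNumber * pageSize;
        while (buf.hasRemaining()) {
            channel.write(buf, offset + buf.position());
        }
    }
}
```

Several threads can call readPage() on the same instance at once.
Whether the reads actually overlap on disk is still up to the JVM and
the OS, as noted in the quoted message.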
