db-derby-dev mailing list archives

From Oystein Grovlen - Sun Norway <Oystein.Grov...@Sun.COM>
Subject Re: [jira] Created: (DERBY-801) Allow parallel access to data files.
Date Wed, 24 May 2006 09:36:16 GMT
Anders Morken wrote:

> I've played around a bit with a different approach - using the
> FileChannel class from Java 1.4's new IO API. I've written a class
> RAFContainer4 which extends RAFContainer and overrides the readPage and
> writePage methods of that class to use read/write(ByteBuffer buf, long
> position) in FileChannel to access the container's file, without
> synchronizing on the FileContainer during the read and write calls.
> 
> With a bit of hackery in BaseDataFileFactory#newContainerObject() this
> class is then used instead of the regular RAFContainer on creation of
> new RAFContainer objects when Derby runs in a 1.4+ JVM.
> 
> This approach gives the JVM and OS the opportunity to issue multiple
> file operations concurrently, although we have no guarantees that this
> will actually happen. This is JVM/OS dependent, but stracing the Sun
> 1.4.2_09 VM on Linux 2.6 shows that the VM now uses pread64()/pwrite64()
> system calls instead of seek(), read() and write(). pread and pwrite
> have similar semantics to the FileChannel#read/write(ByteBuffer buf,
> long position) methods, and do not alter the file's seek() position, and
> are supposed to be thread safe.

Great, Anders.  This looks like a promising idea.
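
For readers following along, positional page I/O of the kind described 
above could look roughly like the sketch below.  It is not the 
RAFContainer4 patch: the class name PositionalPageIO, the 4096-byte page 
size and the method signatures are assumptions made for illustration.  
Only the FileChannel read/write(ByteBuffer, long) calls come from the 
description above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PositionalPageIO {
    // Assumed page size, for illustration only.
    static final int PAGE_SIZE = 4096;

    // Reads one full page at an absolute offset. The channel's own
    // position is not used or changed, so callers need no shared lock
    // around a seek+read pair.
    static void readPage(FileChannel ch, long pageNumber, byte[] page)
            throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(page);
        long offset = pageNumber * page.length;
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("page " + pageNumber + " is past end of file");
            }
        }
    }

    // Writes one full page at an absolute offset, looping because a
    // single write call may transfer fewer bytes than requested.
    static void writePage(FileChannel ch, long pageNumber, byte[] page)
            throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(page);
        long offset = pageNumber * page.length;
        while (buf.hasRemaining()) {
            ch.write(buf, offset + buf.position());
        }
    }
}
```

The channel would come from RandomAccessFile.getChannel().  Whether 
concurrent calls actually overlap on disk still depends on the JVM and 
OS, as noted in the quoted text.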
> 
> Of course only people running Derby on 1.4+ JVMs will have the
> opportunity to benefit from this approach. As support for 1.3 is to be
> deprecated this might not be much of an issue?

If this means that 1.3 still works, but the old way, I think this is 
acceptable.

> But anyway, I would like to see if this hack of mine actually works. I
> see mentions of a "TPC-B like benchmark" in the threads Øystein links to
> above, and wonder if that is something Sun internal, or if it's a
> publicly available benchmark implementation that I can get my grubby
> little paws on and try out this patch with? =)

The actual code is something we have developed internally here, and I am 
not sure we will have time to make it available any time soon.  If you 
make a patch of your changes, I should be able to test this next week. 
If you want to try this out yourself, I think you should be able to make 
a sufficient test client quickly.  (However, it will take some time to 
create the large database).  What I used was:

1. A database much larger than the computer's physical memory. I think I 
had around 17 GB of data including indexes.
2. A large page cache.  I used 500MB on a computer with 2GB RAM.
3. Log device on separate disk.  (I.e., you need a computer with 2 disks.)
4. I used TPC-B like transactions, but I would accept that any load 
where transactions access records in a large table by primary key should 
work.  Try to avoid frequent lock conflicts or deadlocks. 
(E.g., two random accesses to the same table within a transaction are 
deadlock-prone.)
5. Multi-threaded application where all threads ran the same type of 
transaction back-to-back.  (I had 20 threads). Our application prints 
throughput per thread and total throughput for every 10 second interval 
and an average at the end.
6. Run for at least 30 mins to allow for several checkpoints to happen 
during your run.  (I ran for 1 hour).
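
A test client along the lines of items 5 and 6 could be structured 
roughly as in the sketch below.  This is not our internal application.  
The names ThroughputHarness and Work, and the output format, are made up 
for illustration.  Each worker runs the same unit of work back-to-back 
until a deadline, while the calling thread prints per-thread and total 
counts once per interval and an average at the end.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

public class ThroughputHarness {
    /** One unit of work; a real client would run one transaction here. */
    public interface Work {
        void runOnce(int threadId) throws Exception;
    }

    /** Runs the workers and returns the number of completed units per thread. */
    public static long[] run(int threads, long durationMs, long intervalMs,
                             Work work) throws InterruptedException {
        AtomicLongArray counts = new AtomicLongArray(threads);
        long deadline = System.currentTimeMillis() + durationMs;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            pool.execute(() -> {
                try {
                    while (System.currentTimeMillis() < deadline) {
                        work.runOnce(id);
                        counts.incrementAndGet(id);
                    }
                } catch (Exception e) {
                    e.printStackTrace(); // a failing worker simply stops
                }
            });
        }
        pool.shutdown();

        // Report once per interval until all workers have finished.
        long[] last = new long[threads];
        while (!pool.awaitTermination(intervalMs, TimeUnit.MILLISECONDS)) {
            long total = 0;
            StringBuilder perThread = new StringBuilder();
            for (int t = 0; t < threads; t++) {
                long now = counts.get(t);
                long delta = now - last[t];
                last[t] = now;
                total += delta;
                perThread.append(" t").append(t).append('=').append(delta);
            }
            System.out.println("interval total=" + total + perThread);
        }

        long[] totals = new long[threads];
        long sum = 0;
        for (int t = 0; t < threads; t++) {
            totals[t] = counts.get(t);
            sum += totals[t];
        }
        System.out.println("average per second = " + (sum * 1000.0 / durationMs));
        return totals;
    }
}
```

With the setup described above, the call would be something like 
run(20, 3600000, 10000, work), with each thread using its own 
connection inside work.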

A short description of our TPC-B like app:

4 tables:
branch(bid int, bbal int, junk char(92), primary key(bid))
teller(tid int, bid int, tbal int, junk char(88), primary key(tid))
account(aid int, bid int, abal int, junk char(88), primary key(aid))
history(aid int, tid int, bid int, delta int, tstamp timestamp, primary 
key(tstamp,aid,delta))

All primary keys are numbered from 0 to n-1, where n is the number of 
rows in the table.  *bal columns are initially 0.
teller: bid=tid/10,  account:bid=aid/100000

I had 1000 branches, 10000 tellers and 100 million accounts. The history 
table is initially empty.

A transaction has 5 statements:
update account set abal = abal+? where aid=? and bid=?
insert into history values (?, ?, ?, ?, CURRENT_TIMESTAMP)
update teller set tbal = tbal+? where tid=? and bid=?
update branch set bbal = bbal+? where bid=?
select abal from account where aid=?
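
In JDBC, one such transaction could be written roughly as follows.  This 
is a sketch, not our application code.  The class and method names are 
invented, the final select is assumed to take the same aid as the first 
update, and the statements are prepared on every call only to keep the 
example short (a real client would prepare them once per connection).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TpcbLikeTx {
    /** Runs the five statements as one transaction and returns the new abal. */
    public static int runOne(Connection conn, int aid, int tid, int bid, int delta)
            throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement updAccount = conn.prepareStatement(
                 "update account set abal = abal+? where aid=? and bid=?");
             PreparedStatement insHistory = conn.prepareStatement(
                 "insert into history values (?, ?, ?, ?, CURRENT_TIMESTAMP)");
             PreparedStatement updTeller = conn.prepareStatement(
                 "update teller set tbal = tbal+? where tid=? and bid=?");
             PreparedStatement updBranch = conn.prepareStatement(
                 "update branch set bbal = bbal+? where bid=?");
             PreparedStatement selAccount = conn.prepareStatement(
                 "select abal from account where aid=?")) {

            updAccount.setInt(1, delta);
            updAccount.setInt(2, aid);
            updAccount.setInt(3, bid);
            updAccount.executeUpdate();

            // Column order of history: aid, tid, bid, delta, tstamp.
            insHistory.setInt(1, aid);
            insHistory.setInt(2, tid);
            insHistory.setInt(3, bid);
            insHistory.setInt(4, delta);
            insHistory.executeUpdate();

            updTeller.setInt(1, delta);
            updTeller.setInt(2, tid);
            updTeller.setInt(3, bid);
            updTeller.executeUpdate();

            updBranch.setInt(1, delta);
            updBranch.setInt(2, bid);
            updBranch.executeUpdate();

            int balance = 0;
            selAccount.setInt(1, aid);
            try (ResultSet rs = selAccount.executeQuery()) {
                if (rs.next()) {
                    balance = rs.getInt(1);
                }
            }
            conn.commit();
            return balance;
        } catch (SQLException e) {
            conn.rollback(); // undo the partial transaction before rethrowing
            throw e;
        }
    }
}
```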

In TPC-B there are some rules about selecting teller and account where a 
certain percentage of transactions will use a teller from a different 
branch than the branch of the account, but I do not think that will 
matter here.  I suggest randomly choosing a balance change (we use a 
number between -1 million and 1 million) and a random aid, and then 
deriving tid and bid from the aid (i.e., tid=aid/10000, bid=aid/100000, 
which is the same as bid=tid/10).
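
As a sketch of that parameter selection (the class name TxParams is made 
up, and the account count is the 100 million I used): bid is computed 
here as aid/100000 so that it agrees with the rules used to populate the 
tables (teller: bid=tid/10, account: bid=aid/100000).

```java
import java.util.Random;

public class TxParams {
    // Table size from the setup described in this mail.
    static final int ACCOUNTS = 100_000_000;

    final int aid;
    final int tid;
    final int bid;
    final int delta;

    TxParams(int aid, int delta) {
        this.aid = aid;
        this.tid = aid / 10_000;   // 10000 accounts per teller
        this.bid = aid / 100_000;  // 100000 accounts per branch, same as tid/10
        this.delta = delta;
    }

    /** Picks a random account and a balance change in [-1000000, 1000000]. */
    static TxParams random(Random rnd) {
        int aid = rnd.nextInt(ACCOUNTS);
        int delta = rnd.nextInt(2_000_001) - 1_000_000;
        return new TxParams(aid, delta);
    }
}
```

Integer division keeps the teller and branch consistent with the 
account, so every transaction touches rows that belong together.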

-- 
Øystein
