db-derby-user mailing list archives

From Dan Armbrust <daniel.armbrust.l...@gmail.com>
Subject Re: Derby performance issues
Date Thu, 23 Jul 2009 14:38:12 GMT
> The following are just some questions and general pieces of advice.
> Hopefully they can help guide us in the right direction...
> - Which version of Derby and Java are you running?

Derby, JDK 1.6.0_14

> - A problem many users have run into is outdated cardinality statistics for the indexes.

Is there a specific pattern that leads to this issue?  I just started
with a blank database, and started running my test, which will (at
first) add a few thousand new rows to various tables, and then, as the
test continues, delete/update these existing rows, but without growing
further.  Do I need to occasionally force Derby to rebuild its indexes
and/or statistics?
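
If a manual refresh does turn out to be necessary, it might look roughly like the sketch below. It assumes a Derby release that ships the SYSCS_UTIL.SYSCS_UPDATE_STATISTICS system procedure (10.5 or later, as far as I know); the class name, method name, and the schema/table arguments are placeholders, not anything from my test setup.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical helper: asks Derby to recompute index cardinality statistics.
final class DerbyStats {
    // Assumes the system procedure is available in the Derby version in use.
    static final String UPDATE_STATS_CALL =
            "CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS(?, ?, ?)";

    static void refresh(Connection conn, String schema, String table)
            throws SQLException {
        try (CallableStatement cs = conn.prepareCall(UPDATE_STATS_CALL)) {
            cs.setString(1, schema);        // e.g. "APP" (unquoted names are stored upper-case)
            cs.setString(2, table);         // placeholder table name
            cs.setNull(3, Types.VARCHAR);   // null index name: all indexes on the table
            cs.execute();
        }
    }
}
```

Calling DerbyStats.refresh(conn, "APP", "MYTABLE") after the initial load phase would be one way to test whether stale statistics are the cause.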

> - Note that the page size must be set before you create the tables.

Hmm, that's good to know.  I picked up the page size parameter from
the tuning guide
(http://db.apache.org/derby/docs/dev/tuning/tuning-single.html) which
fails to mention that tidbit.

> - You have allocated 10000 pages for the cache, ~80 MB (with 8K pages) +
> overhead. Is this size comparable to the cache used by the other database
> systems?

Yes, it's certainly big enough to hold my entire database, anyway.
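
For reference, the arithmetic behind the ~80 MB figure, plus the two storage properties involved, as a small sketch. The class and method names are made up for illustration; derby.storage.pageCacheSize and derby.storage.pageSize are the property names from the tuning guide, and the values are the ones discussed above.

```java
import java.util.Properties;

// Hypothetical helper for reasoning about page cache sizing.
final class DerbyCacheSizing {
    // Raw cache size in bytes, ignoring per-page overhead.
    static long cacheBytes(int pages, int pageSizeBytes) {
        return (long) pages * pageSizeBytes;
    }

    // Collects the two tuning properties; they still have to be handed to
    // Derby (system properties or derby.properties) before the engine boots
    // and, for the page size, before the tables are created.
    static Properties tuningProperties(int pages, int pageSizeBytes) {
        Properties p = new Properties();
        p.setProperty("derby.storage.pageCacheSize", Integer.toString(pages));
        p.setProperty("derby.storage.pageSize", Integer.toString(pageSizeBytes));
        return p;
    }
}
```

With 10000 pages of 8192 bytes, cacheBytes returns 81,920,000 bytes, roughly 78 MiB before overhead, which is far more than the database size in these tests.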

> - What's the size of the database (compared to the page cache)?

Less than 10 MB, with the tests I'm running right now.

>> One problem I suspect I have is tuning how Derby handles fsync.  Are
>> there settings for this?  I don't need commits to be written to disk
>> at commit time - buffered is ok, so long as the database can recover
>> from an unexpected shutdown.
> What do you mean by recover from an unexpected shutdown?

In PostgreSQL, for example, they have an fsync setting.  If you turn
fsync off, then PostgreSQL does not wait for data to get flushed all
the way to disk after a commit, it simply sends it to the OS API, and
calls it good.  Turning this off can greatly improve performance -
however - on PostgreSQL, if you have fsync off, and you pull the plug
on the server, you have a good chance of corrupting your database in
such a way that it is not recoverable.

This sounds quite a bit like the documentation of derby.system.durability=test.
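
A minimal sketch of how that setting would be switched on from Java, assuming it is read as a system property when the Derby engine boots (so it has to be set before the first connection). The class and method names are hypothetical. As with fsync=off, the documentation warns that the database may not be recoverable after a crash in this mode.

```java
// Hypothetical helper: turns on Derby's "test" durability mode for this JVM.
final class DerbyDurability {
    static final String KEY = "derby.system.durability";

    // Must run before the Derby engine is loaded; returns the value now in effect.
    static String enableTestMode() {
        System.setProperty(KEY, "test");
        return System.getProperty(KEY);
    }
}
```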

To address this in PostgreSQL, they added a second option, called
synchronous_commit, which allows you to get most of the performance
improvement of turning off fsync, but without risking database
corruption in an unexpected shutdown - at worst, you may lose some
recent transactions that you thought were committed, but had not yet
been written to disk (which is fine, in my use case)

> If you enable the write cache on the disk, Derby would be able to start up
> again, but you will most likely lose some of the most recent transactions
> in case of power failures etc.

When you say write cache, do you mean a derby setting (I can't find
it) or an OS setting?

This presentation seems to imply an OS setting:
http://db.apache.org/derby/binaries/DerbyPerfDurability-2006.pdf - and
it also states that at least 3 out of 10 times, Derby failed to
recover the corrupted DB.

After I wrote up the e-mail, I did some additional testing, and it
appears that Derby has quite a long warm-up time for its caches - the
performance did improve, somewhat, after it had been running for a


