db-derby-user mailing list archives

From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: Managing many databases
Date Wed, 16 Apr 2008 09:54:41 GMT
Rick Hillegas <Richard.Hillegas@Sun.COM> writes:

> Six Fried Rice wrote:

>> 1: What are the performance characteristics of using zipped or
>> jarred DBs? It doesn't bother me to unzip them, but I saw this
>> option in the documentation and I was curious.

The performance for zipped databases will depend on the access pattern
and the ratio between the size of the frequently accessed data in your
database and the size of the page cache. As long as all the data you
need to access is cached in the page cache, you should see no overhead
compared to having a normal database directory. If your application
frequently needs to fetch data into the cache (typically because the
page cache is smaller than the working set), you may see a small
overhead (roughly equivalent to the difference between
RandomAccessFile.read() and ZipInputStream.read() per page read into the
page cache).
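
For reference, a minimal sketch of how the connection URL for a database stored inside an archive could be built. It assumes the embedded driver's documented jar: subprotocol; the class name, archive path and database name are hypothetical placeholders.

```java
public final class JarDbUrl {
    private JarDbUrl() {}

    /**
     * Builds an embedded-driver URL for a database directory stored inside
     * a jar or zip archive. Databases opened this way are read-only.
     */
    public static String forArchive(String archivePath, String dbPathInArchive) {
        // Derby syntax: jdbc:derby:jar:(pathToArchive)pathWithinArchive
        return "jdbc:derby:jar:(" + archivePath + ")" + dbPathInArchive;
    }
}
```

The resulting string would be passed to DriverManager.getConnection(). Whether you notice any overhead then depends mostly on how often pages have to be re-read from the archive, as described above.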

>> 2: Are there any performance concerns with having many databases in
>> a single derby install? Would it be better to run one derby server,
>> with 1000 databases, or run multiple derby servers on the same
>> hardware and partition the databases across them? I'm not looking
>> for exact numbers, since they obviously depend on a lot of
>> factors. But in general, can I load a ton of databases into derby
>> server and be OK? (We have no problem throwing additional hardware
>> at this system as needed.)
> Hard to say. Most of our performance work has measured the performance
> of many clients hammering a single database. I don't know where Derby
> maxes out in its ability to saturate multiple processors when you are
> running an application against many databases. I think that against a
> single database, there is a limit (4?) to the number of processors
> which a Derby server can keep busy. That may or may not scale up if
> your server is managing more than one database.

How many processors you can keep busy depends heavily on the type of
load. I have seen reports on derby-dev about machines with 32 processors
utilizing more than 80% of the available CPU power when running against
a single database (using the code in the 10.4 development branch). If
you have hot spots in your data (many threads accessing the same row),
you won't be able to utilize that many processors, but since it seems
like each user of the system has his own private data set, that
shouldn't be a problem for you.

What could become a problem when you load 1000 different databases is
that each of the databases has its own page cache, so you may have
problems finding the ideal page cache size. You'll probably end up using
a small page cache per database to make all the caches fit into the
JVM's heap, although the different users could have different needs
(some don't need all the cache they have been given, others need more
but cannot take the free space from one of the other databases). If all
users work against the same database, you can instead allocate a large,
shared page cache where those users that need much cache can utilize the
space other users don't need. That could give a more efficient use of
the available memory resources.
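
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the cache is sized in pages through the derby.storage.pageCacheSize property (documented minimum 40, default 1000) and that each cached page costs about one page size of heap, which ignores per-page overhead. The class name and the budget figures are made up for illustration.

```java
public final class PageCacheBudget {
    /** Documented minimum value of derby.storage.pageCacheSize. */
    static final int MIN_PAGES = 40;

    private PageCacheBudget() {}

    /**
     * Splits a heap budget evenly across the open databases and returns the
     * number of pages each database's cache could hold. This is a rough
     * estimate: real memory use per cached page is somewhat higher.
     */
    public static long pagesPerDatabase(long cacheBudgetBytes,
                                        int openDatabases,
                                        int pageSizeBytes) {
        long pages = cacheBudgetBytes / ((long) openDatabases * pageSizeBytes);
        return Math.max(MIN_PAGES, pages);
    }
}
```

With a 1 GB budget and 4 KB pages, 1000 open databases would get about 262 pages each, whereas a single shared database could use all 262144 pages for whichever users need them.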

As Bryan mentioned, if you only have a limited number of databases
loaded at the same time and make sure the unused ones are shut down, it
shouldn't be problematic to have one database per user.
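
One way to keep the number of loaded databases bounded is to track them in least-recently-used order and shut down whatever falls off the end. The sketch below is only an illustration of that idea, not something Derby provides; OpenDatabaseTracker and its methods are hypothetical names. The ;shutdown=true URL attribute is Derby's documented way to shut down a single database.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;

public final class OpenDatabaseTracker {
    private final int maxOpen;
    // Access-ordered map: iteration starts at the least recently used name.
    private final LinkedHashMap<String, Boolean> lru =
            new LinkedHashMap<>(16, 0.75f, true);

    public OpenDatabaseTracker(int maxOpen) {
        this.maxOpen = maxOpen;
    }

    /** Records a use of dbName and returns the databases that should now be shut down. */
    public List<String> touch(String dbName) {
        lru.put(dbName, Boolean.TRUE);
        List<String> toShutDown = new ArrayList<>();
        Iterator<String> it = lru.keySet().iterator();
        while (lru.size() > maxOpen && it.hasNext()) {
            toShutDown.add(it.next());
            it.remove();
        }
        return toShutDown;
    }

    /** Embedded-driver URL that shuts down a single database. */
    public static String shutdownUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";shutdown=true";
    }
}
```

Connecting with the shutdown URL always raises an SQLException, even on success (SQLState 08006 for a single database), so the caller would catch that and treat it as confirmation rather than as a failure.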

>> 3: Can derby server discover new databases if I simply copy (or
>> symlink?) a derby database directory to its DERBY_HOME? Or do the
>> databases need to be *created* programmatically through JDBC?

If I understand your question correctly, yes. If you create a symlink in
your server's derby.system.home directory pointing to a database
directory somewhere else in your file hierarchy, you should be able to
connect to the database via the network server by using a connection URL
of this form: jdbc:derby://hostname/name_of_symlink
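
A small sketch of both steps, assuming a file system that supports symbolic links and the network server's default port 1527. The class name, paths and link name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public final class SymlinkedDb {
    private SymlinkedDb() {}

    /** Creates systemHome/linkName as a symlink pointing at an existing database directory. */
    public static Path link(Path systemHome, String linkName, Path dbDir)
            throws IOException {
        return Files.createSymbolicLink(systemHome.resolve(linkName), dbDir);
    }

    /** Network-client URL for the database reachable under the given link name. */
    public static String clientUrl(String host, int port, String linkName) {
        return "jdbc:derby://" + host + ":" + port + "/" + linkName;
    }
}
```

For example, clientUrl("localhost", 1527, "user42") gives jdbc:derby://localhost:1527/user42, which a client could then pass to DriverManager.getConnection().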

Knut Anders