db-derby-user mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: How Derby uses file descriptors
Date Fri, 02 Feb 2007 01:20:35 GMT
Dave Been wrote:
> Can anyone point me to any documentation on how Derby uses file 
> descriptors?

I don't know of any doc that describes file descriptor usage in 
Derby. Here is what I know:

1) Container Cache (file cache).

Keeps track of open tables/indexes (files in seg0) and temporary 
tables/indexes (DERBY_HOME/tmp).

The default maximum size of Derby's container cache (file cache) is 
100. But if a query involves more than 100 containers, Derby will 
expand the cache as needed. While a container is open, it cannot be 
thrown out of the cache even if the max limit has been reached, which 
means its file descriptor stays tied up. One container maps to one file 
(table/index), and it will have only one open file descriptor, even if 
the table is being used by multiple transactions/threads, etc.

For example, I recently looked at a query that involves a lot of 
tables/indexes (https://issues.apache.org/jira/browse/DERBY-2144); it 
was opening more than 2048 files and was hitting a JVM bug. Query 
compilation can open more files than execution because it needs to 
open all the indexes to find the best one, and it does not close 
those containers until optimization is done.

If your queries do not involve a lot of tables/indexes, Derby should 
not keep more than 100 files open at any time. I don't know how 
aggressively the cache manager shrinks or ages out open containers 
that are not in use.

2) Temporary files opened for sorting (temp dir: derby_home/tmp).

3) transaction log file opens   ( log ) ; must be using just 2 at the 

4) db jar file opens, if any jar files stored in the database are
    being used.

5) One db.lck file open, to prevent multi-JVM boot.

I hope that gives you some idea about file descriptor usage in Derby.
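
One rough way to watch these numbers from inside the application is to ask the JVM itself. The following is only a sketch, not something from the Derby docs: on Unix-like platforms the `com.sun.management.UnixOperatingSystemMXBean` reports how many descriptors the process has open. The class name `FdCount` is made up for illustration, and the count covers the whole JVM (sockets, jars, application files), not just Derby.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdCount {
    /**
     * Returns the number of file descriptors currently open in this JVM,
     * or -1 if the platform bean does not expose that figure
     * (for example on non-Unix JVMs).
     */
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Sample before and after booting a database or running a query
        // and compare the two values.
        System.out.println("open file descriptors: " + openFileDescriptors());
    }
}
```

Sampling this before and after a heavy query gives a feel for how many descriptors that query adds on top of what the application already holds.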

> Searches show problems in merges when creating indexes over large 
> tables, but thats not my issue.
> We have a large application which is already file descriptor intensive, 
> which is having problems embedding Derby due to its large file 
> descriptor use.
> I am proposing adding yet another Derby database (or using the existing 
> one) and would like to know:
>     * Is there anything the application can do to minimize the use of
>       file descriptors?

0) close the resultsets when they are no longer needed.
1) create indexes that can avoid sorts.
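
As a minimal sketch of point 0), assuming a Java version with try-with-resources: the result set and statement are closed as soon as the block exits, even when an exception is thrown. The class name `CloseResults` and the helper `countRows` are hypothetical, and the JDBC URL and query are whatever the application supplies.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CloseResults {
    /**
     * Runs a query and counts its rows. The ResultSet is closed first,
     * then the Statement, when the try block exits. The Connection is
     * left open for the caller to manage.
     */
    static int countRows(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("usage: CloseResults <jdbc-url> <query>");
            return;
        }
        // The JDBC driver for the given URL must be on the classpath.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            System.out.println(countRows(conn, args[1]) + " rows");
        }
    }
}
```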

>     * would long running transactions vs "autocommit on" have any effect
>       on the # of file descs used (e.g. length of transaction)?

It should not matter, to my knowledge, unless the application is not 
closing its result sets. Someone please correct me if I am wrong here. 
(Before stating the above, I was debating with myself whether locks 
that are kept until the end of the transaction require keeping the 
container open; I was too lazy to double-check the code.)

>     * Number of tables open


>     * size of tables

Only in the case of a sort might the size of the table affect how many 
files are opened; otherwise it should not.

>     * use of PreparedStatements

I would think PreparedStatement vs. a regular Statement should not 
matter in this case, because the statement cache should avoid 
recompilation of the plans. But if your query is getting recompiled for 
some reason and the statement cache is not helping, then a 
PreparedStatement might help in reducing file descriptor usage.

hope that helps.
