I haven’t looked at Derby’s code, but just because you haven’t ‘committed’ the transaction doesn’t mean that Derby isn’t writing the records.

Without a commit, depending on your isolation level, other users won’t see or feel the effects of your changes. The transactions are stored in a log so that you can roll them back, but the data is still getting written.


(A simple test would be to write a record of a known length, X number of times, in a JVM with a fixed maximum heap size. If you think everything is in memory and not being written to disk, then after Y number of iterations, you will fill up the heap and go boom.)
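A rough sketch of that test in Java, for what it's worth. The class name, the table name `probe`, the database name `probeDB`, and the record length and row count are all made up for illustration, and it assumes the embedded Derby driver (derby.jar) is on the classpath. It inserts fixed-length records in a single transaction and only rolls back at the very end.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical probe: class, table and database names are illustrative only. */
final class UncommittedInsertProbe {

    /** Builds a payload of exactly 'length' characters. */
    static String payload(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append('x');
        }
        return sb.toString();
    }

    /**
     * Inserts 'rows' records of 'recordLength' characters in one transaction,
     * with no commit in between, then rolls everything back (including the
     * CREATE TABLE). Returns the number of rows inserted before the rollback.
     */
    static long insertWithoutCommit(Connection conn, int recordLength, long rows)
            throws SQLException {
        conn.setAutoCommit(false);
        try (Statement ddl = conn.createStatement()) {
            ddl.executeUpdate("CREATE TABLE probe (data VARCHAR(" + recordLength + "))");
        }
        String record = payload(recordLength);
        long written = 0;
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO probe (data) VALUES (?)")) {
            for (long i = 0; i < rows; i++) {
                ps.setString(1, record);
                written += ps.executeUpdate();
            }
        }
        conn.rollback(); // end the transaction without ever committing
        return written;
    }

    public static void main(String[] args) {
        // Assumed embedded-Derby URL; needs derby.jar on the classpath.
        String url = "jdbc:derby:probeDB;create=true";
        try (Connection conn = DriverManager.getConnection(url)) {
            long n = insertWithoutCommit(conn, 1024, 1_000_000L);
            System.out.println("Inserted " + n + " rows in one uncommitted transaction");
        } catch (SQLException e) {
            System.out.println("Could not run the probe: " + e.getMessage());
        }
    }
}
```

Run it with a small heap, e.g. `java -Xmx64m -cp derby.jar:. UncommittedInsertProbe`. The total payload is far larger than the heap, so if the loop finishes, the uncommitted rows cannot all have been held in memory.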


But Derby can be different so what do I know?


From: vodarus vodarus [mailto:vodarus@gmail.com]
Sent: Thursday, June 19, 2008 3:43 AM
To: Derby Discussion
Subject: Re: Speed of using Derby DB



By using Øystein's approach I was able to get the time down to 2.4 seconds on my machine, on which the client [1] and stored procedure code took around 12 seconds. The best I could get on the latter, tweaking page cache size and page size, was around 8 seconds.

By cheating and removing some durability guarantees, I got down to a best time (not quite stable) of 1.5 seconds using Øystein's suggestion.

I was surprised by the high disk activity seen when running the code. Lots of writes are taking place, which I did not quite expect for Øystein's query. But I do not know the implementation or the algorithm being used.

There also seems to be some overhead in invoking a stored procedure, as the client [1] code is faster. This would of course look different if the network JDBC driver was used, as you wouldn't have to transfer the data over the wire.

To me it seems what takes most of the time is updating the result table.

So in short, no fresh ideas! Anyone else?
I didn't try using batches for the updates though.

PS: Note that your pageSize setting is invalid (must be one of 4096, 8192, 16384, or 32768) and Derby will silently ignore it and use the default...
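Since an invalid value is ignored silently, it may be worth checking the page size yourself before handing it to Derby. A minimal sketch in Java: the class and method names are hypothetical, the list of valid sizes is the one given above, and it assumes the setting is passed as the `derby.storage.pageSize` system property before the tables are created.

```java
/** Hypothetical helper; only the property name comes from Derby itself. */
final class DerbyPageSize {

    /** The page sizes Derby accepts, as listed above. */
    static final int[] VALID = {4096, 8192, 16384, 32768};

    /** Returns true if 'pageSize' is one of the accepted values. */
    static boolean isValid(int pageSize) {
        for (int v : VALID) {
            if (v == pageSize) {
                return true;
            }
        }
        return false;
    }

    /**
     * Sets derby.storage.pageSize as a system property, failing loudly
     * instead of letting Derby fall back to its default without a word.
     * Assumption: this is called before the tables in question are created.
     */
    static void apply(int pageSize) {
        if (!isValid(pageSize)) {
            throw new IllegalArgumentException(
                "Invalid Derby page size: " + pageSize
                + " (must be 4096, 8192, 16384 or 32768)");
        }
        System.setProperty("derby.storage.pageSize", Integer.toString(pageSize));
    }

    public static void main(String[] args) {
        System.out.println(isValid(32768)); // prints "true"
        System.out.println(isValid(20000)); // prints "false"
        apply(32768);
        System.out.println(System.getProperty("derby.storage.pageSize")); // prints "32768"
    }
}
```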


[1] Note that client in this case still refers to the embedded driver, but the code composing the stored procedure is invoked from the driver side instead of "inside" the database.

Hello )))

I set pageSize to 32768, but the resulting time is still around 11-12 seconds.

What is "Øystein's approach"? Can you write down the steps to get the 2.4-second time?


PS "To me it seems what takes most of the time is updating the result table." But what is the problem there? I commit the data at the end, so the DBMS should not do any writes ...