db-derby-user mailing list archives

From Kristian Waagan <Kristian.Waa...@Sun.COM>
Subject Re: Speed of using Derby DB
Date Thu, 19 Jun 2008 12:52:03 GMT
vodarus vodarus wrote:
>     Hi,
>     By using Øystein's approach I was able to get the time down to 2.4
>     seconds on my machine, on which the client [1] and stored procedure
>     code took around 12 seconds. The best I could get on the latter,
>     tweaking page cache size and page size, was around 8 seconds.
>     By cheating and removing some durability guarantees, I got down to a
>     best time (not quite stable) of 1.5 seconds using Øystein's suggestion.
>     I was surprised by the high disk activity seen when running the
>     code. Lots of writes are taking place, which I did not quite expect
>     for Øystein's query. But I do not know the implementation or the
>     algorithm being used.
>     There also seems to be some overhead in invoking a stored procedure,
>     as the client [1] code is faster. This would of course look different
>     if the network JDBC driver were used, since the stored procedure
>     wouldn't have to transfer the data over the wire.
>     To me it seems what takes most of the time is updating the result table.
>     So in short, no fresh ideas! Anyone else?
>     I didn't try using batches for the updates, though.
>     PS: Note that your pageSize setting is invalid (must be one of 4096,
>     8192, 16384, or 32768) and Derby will silently ignore it and use the
>     default...
>     -- 
>     Kristian
>     [1] Note that client in this case still refers to the embedded
>     driver, but the code composing the stored procedure is invoked from
>     the driver side instead of "inside" the database.
> Hello )))
> I set pageSize to 32768, but the result time is still around 11-12 sec.

Just to be sure, you did recreate the tables?
In any case, the page size would mostly help pull data in faster and 
that doesn't matter for this test.

> What is "Øystein's approach"? Can you describe the steps to get the 2.4
> second time?

Øystein's approach is using the query "insert into testtotals select 
client, sum(order_amount) from testbig group by client;".
As you state, this is not what you want in your case, and it might not be 
applicable to your problem.
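For completeness, here is a sketch of the schema that query assumes. Only the table and column names (testbig, testtotals, client, order_amount) come from this thread; the column types are illustrative guesses.

```sql
-- Assumed shapes of the two tables (types are guesses, not from the thread)
CREATE TABLE testbig (
    client INTEGER,
    order_amount DECIMAL(10,2)
);

CREATE TABLE testtotals (
    client INTEGER,
    total DECIMAL(12,2)
);

-- Øystein's approach: let the database do the grouping in one statement,
-- instead of updating the result table row by row from client code.
INSERT INTO testtotals
    SELECT client, SUM(order_amount)
    FROM testbig
    GROUP BY client;
```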

I could also get down to these times by using a HashMap to store the 
intermediate totals in 'calculateTotalCommon'. This does of course use 
more memory and might cause trouble if you don't know the number of 
clients in your table (i.e. whether you need 25 thousand or 100 million 
entries in the map).
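The HashMap idea can be sketched like this. The method name, class name, and sample data are made up for illustration; only the idea (a per-client running total kept in memory, written to the result table once at the end) comes from the mail.

```java
import java.util.HashMap;
import java.util.Map;

public class TotalsSketch {

    // Accumulate per-client totals in a HashMap in a single pass over the
    // rows, instead of updating the result table once per input row.
    static Map<String, Long> aggregate(String[] clients, long[] amounts) {
        Map<String, Long> totals = new HashMap<>();
        for (int i = 0; i < clients.length; i++) {
            // merge() adds amounts[i] to the existing total, or inserts it
            // if this client has not been seen yet.
            totals.merge(clients[i], amounts[i], Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] clients = {"A", "B", "A"};
        long[] amounts = {10, 5, 20};
        Map<String, Long> totals = aggregate(clients, amounts);
        System.out.println(totals.get("A")); // prints 30
        System.out.println(totals.get("B")); // prints 5
    }
}
```

With millions of distinct clients, the map itself becomes the memory cost mentioned above, so the JVM heap and the page cache have to be sized together.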

> Thanks.
> PS "To me it seems what takes most of the time is updating the result 
> table." But what is the problem there? I commit data at the end, so DBMS 
> should not do any writes ...

It seems what happens is that the log buffer fills up. By increasing 
the log buffer size, I was able to get slightly better performance. As 
always with tuning, it's about balance and tradeoffs. If your IO system 
is really good, maybe you can run with a big log buffer and get better 
performance. However, the effect you see also depends on how often you 
commit (the complete buffer is flushed on commit anyway, at least in 
Derby).

So, in short, experiment with the following, using either the "insert 
into..." query or your client code modified to somehow store the totals 
in memory:
  a) Log buffer size
  b) Page cache size (and JVM heap)
  c) Page size
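These knobs can also be set from code before the embedded engine boots, instead of on the command line. A sketch; the property names appear elsewhere in this mail, but the values are examples, not recommendations.

```java
// Sketch: set Derby tuning properties programmatically. These must be set
// before the embedded driver is loaded, since the engine reads them at boot.
public class DerbyTuning {
    public static void main(String[] args) {
        // Page size must be one of 4096, 8192, 16384 or 32768,
        // otherwise Derby silently falls back to the default.
        System.setProperty("derby.storage.pageSize", "32768");
        System.setProperty("derby.storage.logBufferSize", "524288");
        System.setProperty("derby.storage.pageCacheSize", "2500");
        // ...then load the embedded driver and open the connection as usual.
        System.out.println(System.getProperty("derby.storage.pageSize")); // prints 32768
    }
}
```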

One of my attempts looked like this:
java -Xmx512M -Dderby.storage.pageSize=32768 
-Dderby.storage.logBufferSize=524288 -Dderby.storage.pageCacheSize=2500 
-cp .:${JDB10413} derbytest.FatTest

Using your original test code I haven't been able to get lower than 
around 5 seconds (best), the average being somewhere around 6 seconds.

As always, you have to do your own tests on your own system to see if it 
is good enough for your use :)
Often there are other things to consider besides performance, for 
instance installation and ease of use.

Does anyone have any ideas on other possible tunings?

