db-derby-user mailing list archives

From Geoffrey Hendrey <geoff_hend...@yahoo.com>
Subject Re: Derby/HSQLDB major performance difference
Date Mon, 09 Mar 2009 05:07:59 GMT


I've used HSQLDB quite a bit. It's basically designed to read all the data into memory. In
that regard, while it does support a SQL interface, it's really lacking in scalability. It's
great for prototyping.

On Mar 8, 2009, at 9:58 PM, "Jeff Stuckman" <stuckman@umd.edu> wrote:

Hello,

Did you try setting derby.system.durability=test and rerunning your
benchmark?

From what I understand, Derby provides hard guarantees of durability -- if
there is a power outage, system crash, or disk failure anytime after your
commit() call has returned, the data is guaranteed to be available when your
system comes back up. To guarantee this durability, Derby needs to work
around the write caching that your OS will normally perform, which reduces
performance. This will cause the very long insertion period that you see.

(Derby even includes support for XA (distributed) transactions, which are
impossible to properly support without durability guarantees)

I couldn't find any information on durability on the HSQLDB website, and
from the performance results that you describe, I'm inclined to believe that
HSQLDB does not make this guarantee. If you set the above property, Derby
will reduce its durability guarantees and perform faster.
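
As a rough sketch of the suggestion above, the property can be set as a JVM system property before Derby boots (it can also go in derby.properties or be passed as -Dderby.system.durability=test). The class and method names here are made up for illustration, and the setting is only meant for benchmarks: with it, a crash can leave the database unrecoverable.

```java
// Illustrative helper (hypothetical name): relaxes Derby's durability for a test run.
final class DerbyDurability {
    // Property name as given in the message above.
    static final String PROPERTY = "derby.system.durability";

    // Call this before the first Derby connection is opened in this JVM;
    // Derby reads system-wide properties when the engine boots.
    static void relaxForBenchmark() {
        System.setProperty(PROPERTY, "test");
    }

    public static void main(String[] args) {
        relaxForBenchmark();
        System.out.println(PROPERTY + "=" + System.getProperty(PROPERTY));
        // Next step (not shown): open the embedded connection, e.g. via
        // DriverManager.getConnection with a jdbc:derby: URL of your own.
    }
}
```

Rerunning the same benchmark with and without this setting should show how much of the insert time is spent forcing writes to disk.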

Jeff

-----Original Message-----
From: DerbyNovice [mailto:clarsson@ureason.com] 
Sent: Friday, March 06, 2009 12:39 PM
To: derby-user@db.apache.org
Subject: Derby/HSQLDB major performance difference


I am using Derby as an embedded db in my Swing application. Recently I
decided to have a go at HSQLDB (cached tables, embedded) to see how it
coped. I have written a test program which
* inserts a number of records in my db with random keys
* makes an index on the keys
* runs a number of select statements
* updates a number of records with new random values.
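
The actual test program was not posted, so the following is only a guess at what the four steps above might look like over plain JDBC. The table name, column names, seed handling and the choice of one select and one update per key are all assumptions; for HSQLDB the table would also have to be created as a cached table to match the setup described.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Random;

// Hypothetical workload sketch; not the original test program.
final class DbWorkload {
    // Fixed seed so both databases are fed identical keys.
    static long[] randomKeys(int n, long seed) {
        Random rnd = new Random(seed);
        long[] keys = new long[n];
        for (int i = 0; i < n; i++) keys[i] = rnd.nextLong();
        return keys;
    }

    // Runs insert, index, select and update against an already open connection.
    static void run(Connection con, long[] keys, long seed) throws SQLException {
        con.setAutoCommit(false);
        try (Statement st = con.createStatement()) {
            st.executeUpdate("CREATE TABLE bench (k BIGINT, v BIGINT)");
        }
        // Step 1: insert records with random keys.
        try (PreparedStatement ins =
                 con.prepareStatement("INSERT INTO bench (k, v) VALUES (?, ?)")) {
            for (long k : keys) {
                ins.setLong(1, k);
                ins.setLong(2, 0L);
                ins.executeUpdate();
            }
        }
        con.commit();
        // Step 2: build the index after the data is loaded.
        try (Statement st = con.createStatement()) {
            st.executeUpdate("CREATE INDEX bench_k ON bench (k)");
        }
        con.commit();
        // Step 3: look each key up again.
        try (PreparedStatement sel =
                 con.prepareStatement("SELECT v FROM bench WHERE k = ?")) {
            for (long k : keys) {
                sel.setLong(1, k);
                try (ResultSet rs = sel.executeQuery()) {
                    while (rs.next()) rs.getLong(1);
                }
            }
        }
        // Step 4: overwrite each value with a new random one.
        Random rnd = new Random(seed + 1);
        try (PreparedStatement upd =
                 con.prepareStatement("UPDATE bench SET v = ? WHERE k = ?")) {
            for (long k : keys) {
                upd.setLong(1, rnd.nextLong());
                upd.setLong(2, k);
                upd.executeUpdate();
            }
        }
        con.commit();
    }
}
```

How often the program commits matters a great deal for Derby's insert time, so a comparison should state whether inserts are committed one by one or in a batch as here.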

At the same time I measure elapsed time and memory in a separate thread.
I made the same run with Derby and with HSQLDB, both with -Xmx1024m; see the
two uploaded charts.
Initially the idea was to see which db was faster, but as soon as I saw the
results I realised there are
other differences.
The scale on the x-axis is half seconds, i.e. 1000 is 500 seconds. The scale on
the y-axis is bytes as reported by gc.
Notice the difference in scale between HSQLDB and Derby. I have tried to
optimise the memory with HSQLDB options
but they make only a marginal difference and do not change the behaviour.
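
For readers who want to reproduce the charts, a sampler thread along these lines would give one memory reading per half second. This is a sketch, not the code used for the run above: the class name is invented, and it assumes "bytes as reported by gc" can be approximated by totalMemory() minus freeMemory() from java.lang.Runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sampler: records used heap at a fixed interval on its own thread.
final class MemorySampler implements Runnable {
    private final long intervalMillis;
    private final List<Long> samples = new ArrayList<>();
    private volatile boolean running = true;

    MemorySampler(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    @Override
    public void run() {
        Runtime rt = Runtime.getRuntime();
        while (running) {
            long used = rt.totalMemory() - rt.freeMemory(); // bytes in use
            synchronized (samples) {
                samples.add(used);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Asks the loop to finish after its current sleep.
    void stop() {
        running = false;
    }

    // Copy of the readings so far; index i is roughly i * intervalMillis ms in.
    List<Long> snapshot() {
        synchronized (samples) {
            return new ArrayList<>(samples);
        }
    }
}
```

Start it with new Thread(new MemorySampler(500)).start() before the database work begins and call stop() afterwards. Readings taken this way include garbage not yet collected, so peaks overstate live data.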
Observations:
* HSQLDB uses an order of magnitude more memory than Derby.
* HSQLDB does not seem to benefit from the indices.
* HSQLDB is faster in total, but not to the extent the memory usage
suggests.
* Derby has a very long insertion period, but the select statements are
very fast and memory lean.
* Derby manages the memory during the run; the total memory goes up AND
DOWN.
* Derby seems to struggle (timewise) with the inserts (the long slope
initially) but breezes through the select statements,
which all take less than a second.
The run shown uses 700000 records, but smaller runs show the same behaviour.
For me this makes HSQLDB useless, as it would gradually eat my application's
memory. Anyone trying to weigh performance benefits between databases should be
aware of these very different characteristics.

I'd be pleased if anyone would care to comment on the test run and maybe
shed some light on the totally different characteristics seen here.
I'd be happy to upload the timing tests and my program too if there is an
interest.

Regards,


DERBY RUN
http://www.nabble.com/file/p22377140/mem.gif 
HSQL RUN
http://www.nabble.com/file/p22377140/mem.gif 
-- 
View this message in context:
http://www.nabble.com/Derby-HSQLDB-major-performance-difference-tp22377140p22377140.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.



