db-derby-user mailing list archives

From Thomas Mueller <thomas.tom.muel...@gmail.com>
Subject Re: Inaccuracies in H2's claims: Autocounter/Sequence-Numbers
Date Wed, 25 May 2011 21:07:43 GMT
Hi,

The feature comparison on the H2 web site should be fixed now:
http://h2database.com/html/features.html#comparison (you may need to
refresh the page). When I wrote it, the features were not available in
Derby yet. However, in some cases it was also clearly my fault because
I wasn't clear about what I meant (for example, 'user defined data
types' was the wrong expression; what I meant was 'domains' as in the
'create domain' statement). Also, I should have added the database
versions; I have done that now. I hope the table is now more accurate.
Please tell me if not. It's not in my interest to mislead anybody, but
it's also clear that the list can never be 'complete' (that would be
very hard, and it would be a very long list). But I'm open to
discussing what should be on the list and what should not. I guess the
best way to discuss this is in the H2 Google Group.

Benchmarks: I know there is quite a controversy about benchmarks in
general and the benchmark results published on the H2 web site in
particular. I understand that Derby better supports multiple
connections accessing the same set of tables concurrently. However, I
don't agree that this is a very common use case (although it's getting
more common now that computers tend to have more and more cores). H2 is
mainly an embedded database, and for that you generally use very few
connections.
SQLite for example only very recently added support for multiple
concurrent connections. I do understand using multiple connections is
a use case for H2, and the benchmark does include a test with multiple
connections accessing the database concurrently. But the performance
comparison page also states that this is mostly a single connection
benchmark run on one computer.

By the way, I also have a problem with benchmark results published by others:

- HSQLDB: the benchmark result on the HSQLDB web site. Why does it
include McKoi (which is no longer maintained) but not H2? I asked Fred
Toussi but didn't get a meaningful answer. He basically said H2 is not
listed so I can't complain :-)

- Derby: at various conferences, for example JavaOne 2007 and Jazoon
2007, Derby included benchmark results for Derby and MySQL where Derby
won, but strangely the benchmark source code is not available. I asked
but I was told it's not possible to make it open source... hard to
believe if you ask me. The presentations are available at
http://home.online.no/~olmsan/publications/pres/

- db4o: I was told the PolePosition benchmark at
http://www.polepos.org/ was sponsored by db4o (which is relatively
easy to prove), but the page doesn't list that. There are some obvious
problems with the PolePosition benchmark (for example memory usage
isn't taken into account).

> the blanket statements made on the H2 site have
> led to a misperception on the part of the user community

Well, one solution is to not publish any numbers. But any meaningful
benchmark needs to publish them. You have to pick what you believe is a
good set of use cases, and you have to present the data in some way.

> It made
> it quite difficult for me to justify running our own independent benchmark
> to management when they were looking at the benchmark on the H2 site.

Well, it should be clear to your management that _your_ use case may
not match the benchmark _I_ wrote.

> For example, you could run Derby in-memory if you didn't care about
> durability... turn off synchronous logging to disk if database
> consistency after a crash isn't an issue. One can lower the default
> isolation mode to avoid lock contention etc...

Well... I did that and Derby was still slow. What database URL should
I use to get the highest possible performance?
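
For reference, the three options named in the quoted suggestion can be
sketched in Java roughly as follows. This is only a minimal sketch under
assumptions, not the configuration used for the published benchmark: it
assumes a Derby version with in-memory database support and derby.jar on
the classpath, and the class name, method names, and the database name
"benchdb" are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class; the names are illustrative and not part of Derby or H2.
class DerbyTuningSketch {

    // JDBC URL for a Derby in-memory database (nothing is persisted to disk).
    static String inMemoryUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    // JDBC URL for an ordinary on-disk Derby database.
    static String onDiskUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    // Asks Derby to skip synchronous log writes. The property has to be set
    // before the Derby engine boots, and a crash may leave an on-disk
    // database inconsistent.
    static void relaxDurability() {
        System.setProperty("derby.system.durability", "test");
    }

    // Opens a connection and lowers the isolation level to reduce lock contention.
    static Connection open(String url) throws SQLException {
        Connection conn = DriverManager.getConnection(url);
        conn.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
        return conn;
    }

    public static void main(String[] args) {
        relaxDurability();
        try (Connection conn = open(inMemoryUrl("benchdb"))) {
            System.out.println("isolation level = " + conn.getTransactionIsolation());
        } catch (SQLException e) {
            // Typically reached when no Derby driver is on the classpath.
            System.out.println("Could not connect: " + e.getMessage());
        }
    }
}
```

Whether any of these settings changes the outcome of a particular
benchmark is something only a measurement can show.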

> I thought blanket statements and whitewashing were the exclusive domain of
> commercial software!

Of course there is competition in the open source world as well (it
would be bad if there weren't). The difference between open source and
commercial is that commercial software tends to "disallow" publishing
any benchmark results. For example, it's not allowed to publish numbers
for Oracle, or for any other big commercial database I know of. When I
wrote Hypersonic SQL I also published benchmark results, and then I
got a mail from somebody from Cloudscape (now called Apache Derby)
which basically read "Did you not read our license? You are not
allowed to publish any results. Remove the benchmark results from your
web site or we will sue you." Unfortunately I don't have this email
any more :-) With open source, you have open mailing lists and you can
discuss it. Also the licenses tend to be more liberal.

Regards,
Thomas
