db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Performance issue of derby using JDBC
Date Wed, 17 Dec 2008 18:08:59 GMT
Hi Harshad,

Some comments inline...

Harshad wrote:
> Rick Hillegas wrote:
>> Depending on your data, the two queries could return different results,
>> so Bryan's faster query is not a transformation which Derby would
>> perform automatically. 
> Oh yes! Although my data ensures that the subquery would return distinct results, Derby
wouldn't know about it. Is there a way to say that the combination of two columns is unique?
> When I tried to specify DISTINCT explicitly like this:
> select name,version,release,time from PKG where PKG.id in (select DISTINCT PROVIDES.id
from PROVIDES where (PROVIDES.name = ?))
> ...I didn't get any change in the performance figures. That is probably because of the
"DISTINCT elimination in IN sub-queries" mentioned in "Tuning derby".
> I would have thought that the DISTINCT is not redundant and actually allows for the transformation
into a normal join.
I think that, for instance, a problem arises when there are M tuples in 
PROVIDES with name='Fred' but there are < M distinct ids in those 
tuples. In that case, you can imagine data sets on which the original 
query returns < M tuples but the transformed query returns M tuples.
>> Nevertheless, it looks to me as though your
>> original query almost qualifies for the EXISTS transformation documented
>> in the Derby Tuning Guide. 
> Why do you say "almost qualifies"; where is it not qualifying? And, will the EXISTS transformation
speed up the query as much as the transformation to a normal join?
I haven't looked under the hood here. The Tuning Guide doesn't 
completely describe the optimizer's reasoning. A smart optimizer won't 
unconditionally flatten subqueries.

Without having looked at the code (but knowing who wrote it), I suspect 
that the EXISTS transformation may be a normal join combined with a 
tuple filter which throws away duplicates on a sorted leg of the join.

It may be that Derby did apply the EXISTS transformation--and that 
degraded rather than improved the performance. If you are interested in 
seeing the plan which Derby selected, please consult the following wiki 


I realize that I'm raising more questions than I'm answering.

Hope this is a little useful,
> Thanks to all who have responded on this thread.
> I have learnt not to trust the execution time of Statement.executeQuery(). It looks like
the actual table access is made lazily (as and when the result set is accessed).

View raw message