db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Performance issue of derby using JDBC
Date Wed, 17 Dec 2008 18:08:59 GMT
Hi Harshad,

Some comments inline...

Harshad wrote:
> Rick Hillegas wrote:
>   
>> Depending on your data, the two queries could return different results,
>> so Bryan's faster query is not a transformation which Derby would
>> perform automatically. 
>>     
>
> Oh yes! Although my data ensures that the subquery would return distinct results, Derby
wouldn't know about it. Is there a way to say that the combination of two columns is unique?
>
> When I tried to specify DISTINCT explicitly like this:
> select name,version,release,time from PKG where PKG.id in (select DISTINCT PROVIDES.id
from PROVIDES where (PROVIDES.name = ?))
>
> ...I didn't get any change in the performance figures. That is probably because of the
"DISTINCT elimination in IN sub-queries" mentioned in "Tuning derby".
>
> I would have thought that the DISTINCT is not redundant and actually allows for the transformation
into a normal join.
>   
I think that, for instance, a problem arises when there are M tuples in 
PROVIDES with name='Fred' but there are < M distinct ids in those 
tuples. In that case, you can imagine data sets on which the original 
query returns < M tuples but the transformed query returns M tuples.
>   
>> Nevertheless, it looks to me as though your
>> original query almost qualifies for the EXISTS transformation documented
>> in the Derby Tuning Guide. 
>>     
>
> Why do you say "almost qualifies"; where is it not qualifying? And, will the EXISTS transformation
speed up the query as much as the transformation to a normal join?
>   
I haven't looked under the hood here. The Tuning Guide doesn't 
completely describe the optimizer's reasoning. A smart optimizer won't 
unconditionally flatten subqueries.

Without having looked at the code (but knowing who wrote it), I suspect 
that the EXISTS transformation may be a normal join combined with a 
tuple filter which throws away duplicates on a sorted leg of the join.

It may be that Derby did apply the EXISTS transformation--and that 
degraded rather than improved the performance. If you are interested in 
seeing the plan which Derby selected, please consult the following wiki 
pages:

  http://wiki.apache.org/db-derby/PerformanceDiagnosisTips
  http://wiki.apache.org/db-derby/StmtExecutionPlan

I realize that I'm raising more questions than I'm answering.

Hope this is a little useful,
-Rick
> Thanks to all who have responded on this thread.
>
> I have learnt not to trust the execution time of Statement.executeQuery(). It looks like
the actual table access is made lazily (as and when the result set is accessed).
>
>   


Mime
View raw message