db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: performance, memory consumption...and a big hello :)
Date Wed, 29 Jul 2009 13:43:02 GMT
Hi Tomislav,

Some more comments inline...

Tomi N/A wrote:
> Thank you both for the comments.
>
> Jan, I've kind of taken your advice even before I saw it :), but I
> tried PostgreSQL, unfortunately, with no obvious improvement.
>
> 2009/7/27 Rick Hillegas <Richard.Hillegas@sun.com>:
>
>   
>>> select id_group, count(id)
>>> group by id_group
>>>
>>>       
>> Not to be tiresome, but just to make sure we understand the query: I think
>> you mean id_group1 instead of id_group.
>>     
>
> Exactly right.
>
>   
>> It sounds as though you may have created an index on ( id_group1, id ). At
>> first blush, I would expect that index to improve the performance. Did that
>> not happen?
>>     
>
> That is exactly what I observed.
> As I told Jan, I've tried it out on PostgreSQL in the meantime because
> a) I wanted to see how a different RDBMS would behave and b) because
> derby (it seems) doesn't support the "explain" command to show the
> query plan and PostgreSQL does.
>   
Some great work has been put into improving Derby's "explain" 
capabilities in the next feature release. In the meantime, you can view 
Derby query plans by using the SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS() 
system procedure. Please see the Derby Tuning Guide: 
http://db.apache.org/derby/docs/10.5/tuning/tuning-single.html#ctundepth13055
> I've tried the following indexes:
>   
> (id)
>   
This index doesn't contain the id_group1 column, so a probe of the base 
row would always be needed. A sequential scan of the heap would be 
faster than a series of probes.
> (id_group1)
>   
Same situation for this index.
> (id, id_group1)
>   
This is a covering index for the query. That is, it carries both of the 
columns you need. That eliminates the need for probes of the heap. 
However, the index is not in the order you want (sorted by the grouping 
column). So an expensive sort would be needed.
> but postgresql announced a sequential table scan every time and, in
> fact, seemed to have done exactly that when I executed the query.
>   
Have you tried

( id_group1, id )

This index covers your query and is in a useful sort order. If Derby 
does not use this index on your large data set, then there may be a bug. 
It would be interesting to see the query plan which Derby follows when 
you have declared this index.

Hope this helps,
-Rick
> Either I'm doing something obviously wrong or am assuming something
> obviously impossible is possible, but I'm inclined to believe the
> former, rather than the latter which is why I'm writing. :-)
>
> Cheers,
> Tomislav
>   


Mime
View raw message