db-derby-user mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Index performance
Date Thu, 26 Mar 2009 00:17:05 GMT
A reproducible test case submitted to JIRA is the best way to get help 
from the list (a minimal skeleton of one is sketched at the end of this 
message).  It would answer all of the following questions:
1) what is row size?
2) what is the table size?  Assuming you have just one big table, the 
easiest might be to do a listing of seg0 and give the size of the biggest file.
3) could you post the JDBC code you use?
4) what is the query plan when you add the index?  (One way to capture 
it is shown in the sketch after this list.)
5) what is the exact DDL of the created table and what is the exact DDL 
of all the indexes?
6) what happens if you add an index on (double column, pk_column) 
instead?  (See the sketch after this list.)  A possible problem with 
the query is that even though you only look at exactly the 271136 rows 
out of 4M in the index, the query then has to do a separate probe into 
the base table for each of them to fetch pk_column.  Depending on row 
size and cache hits, the worst case could mean that each probe requires 
a real I/O, whereas the table scan is guaranteed to read each page only 
once, and it does so in order, which tends to get free big-block I/O on 
most OS's.
7) Do you create the index after or before loading the rows?
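
As a rough sketch of how 4) and 6) could be tried together -- the table 
and column names below are taken from your mail; the index name, the 
database name and everything else is just illustrative, not a statement 
about your actual code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CoveringIndexCheck {
    public static void main(String[] args) throws Exception {
        // Ask Derby to write query plans to derby.log.  This property
        // has to be set before the database is booted.
        System.setProperty("derby.language.logQueryPlan", "true");
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

        Connection conn = DriverManager.getConnection("jdbc:derby:testdb");
        Statement st = conn.createStatement();

        // 6) a "covering" index: it holds both columns the query needs,
        // so Derby never has to probe back into the base table.
        st.executeUpdate(
            "CREATE INDEX ix_double_pk ON the_table (the_double_column, pk_column)");

        long start = System.currentTimeMillis();
        ResultSet rs = st.executeQuery(
            "SELECT pk_column FROM the_table " +
            "WHERE the_double_column > 300 AND the_double_column < 320");
        int n = 0;
        while (rs.next()) {
            rs.getInt(1);   // or getLong/getString, depending on the pk type
            n++;
        }
        rs.close();
        System.out.println(n + " rows in "
            + (System.currentTimeMillis() - start) + " ms");

        st.close();
        conn.close();
    }
}

With derby.language.logQueryPlan=true the plan for each statement ends 
up in derby.log, so running the query once without and once with the 
extra index gives exactly the plans asked for in 4).  The point of 
putting pk_column into the index in 6) is that the index then covers 
the query, so the scan never has to leave the index pages.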

The following wiki may also help:

http://wiki.apache.org/db-derby/PerformanceDiagnosisTips?highlight=(performance)

Answers to the above would help.
Tim Dudgeon wrote:
> I found an interesting performance problem that I'd welcome some help in 
> understanding.
> 
> I have a table of 4,000,000 rows that has a DOUBLE column containing 
> floating point numbers.
> I run a query like this:
> 
> select pk_column from the_table where the_double_column > 300 and 
> the_double_column < 320
> 
> 271136 rows are returned.
> I then go through the ResultSet and extract all the pk_column values.
> All of this is done using standard JDBC.
> 
> When I do this it takes 23 seconds, which I thought was not unreasonable 
> as a full table scan was involved and the table was pretty big.
> 
> But of course I thought an index would help, so I added an index on 
> the_double_column and repeated the process. It took 720 seconds, 31x 
> slower! I thought this was strange, but thought it might be because I 
> was using standard settings and the 4MB page cache was much too small 
> to hold the index.
> 
> So I increased the page cache size (derby.storage.pageCacheSize 
> property) to 10x the default (10,000 pages) and repeated the process. 
> There was only a very minor improvement in speed.
> 
> In all cases the memory usage, as reported by:
> Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()
> was really no different, and the used memory was much less than the 
> maximum available specified by the -Xmx setting.
> 
> 
> Any ideas what to do?
> 
> Tim
> 
> 
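
P.S. For completeness, here is roughly what a minimal, self-contained 
repro along the lines you describe could look like.  Everything in it 
-- names, data, row counts, settings -- is just a placeholder standing 
in for your real setup, not a guess at what your code actually does:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Random;

public class DerbyRangeQueryRepro {
    public static void main(String[] args) throws Exception {
        // Placeholder setting; pageCacheSize must be set before the
        // database boots to take effect.
        System.setProperty("derby.storage.pageCacheSize", "10000");
        Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

        Connection conn =
            DriverManager.getConnection("jdbc:derby:repro;create=true");
        conn.setAutoCommit(false);

        Statement st = conn.createStatement();
        st.executeUpdate("CREATE TABLE the_table (pk_column INT PRIMARY KEY, "
            + "the_double_column DOUBLE)");

        // Load 4,000,000 rows of random data as a stand-in for the real set.
        PreparedStatement ins =
            conn.prepareStatement("INSERT INTO the_table VALUES (?, ?)");
        Random rnd = new Random(42);
        for (int i = 0; i < 4000000; i++) {
            ins.setInt(1, i);
            ins.setDouble(2, rnd.nextDouble() * 5000.0);
            ins.addBatch();
            if (i % 10000 == 0) {
                ins.executeBatch();
                conn.commit();
            }
        }
        ins.executeBatch();
        conn.commit();
        ins.close();

        // Point 7: try creating this index both before and after the load.
        st.executeUpdate(
            "CREATE INDEX ix_double ON the_table (the_double_column)");
        conn.commit();

        long start = System.currentTimeMillis();
        ResultSet rs = st.executeQuery("SELECT pk_column FROM the_table "
            + "WHERE the_double_column > 300 AND the_double_column < 320");
        int n = 0;
        while (rs.next()) {
            rs.getInt(1);
            n++;
        }
        rs.close();
        System.out.println(n + " rows in "
            + (System.currentTimeMillis() - start) + " ms");

        conn.commit();
        st.close();
        conn.close();
    }
}

Timed once with and once without the CREATE INDEX (and with different 
pageCacheSize settings), and attached to a JIRA issue, something like 
this would give the list everything needed to reproduce the 23 vs 720 
second difference.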

