db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <rick.hille...@oracle.com>
Subject Re: Selecting a few rows is really slow when ordering by primary key
Date Fri, 27 Jan 2012 15:05:49 GMT
Hi Simon,

I am not an expert in how the optimizer chooses query plans. Hopefully, 
other people will be able to provide more insight.

 From the second plan, it is clear that the optimizer understands that 
the timestamp restriction will qualify only 97 rows. But the optimizer 
is unable to use this information to predict that the join will produce 
a small number of rows. The optimizer estimates that, despite the 
timestamp restriction, the join will produce 2836394 rows. That is 
probably why the optimizer rejects putting a sort on top of the second 
plan when considering plans for the first query, the one with the ORDER 
BY clause.

Sometimes these problems are fixed by regenerating the statistics used 
by the optimizer. You can regenerate statistics using the 
SYSCS_UTIL.SYSCS_UPDATE_STATISTICS procedure (see the section dealing 
with this procedure in the Derby Reference Guide).

If that doesn't work, please let us know. Logging a bug against the 
optimizer will help build the case for improving the optimizer. The 
following other solutions may help:

1) Try adding the primary key to the end of the timestamp index signature:

CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON 
CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG, PK);

2) Force the first query to use the indexes chosen by the second query 
by specifying optimizer overrides. See the section on optimizer 
overrides in the Tuning Guide.

Hope this helps,
-Rick

On 1/27/12 12:59 AM, Simon Chatelain wrote:
> Hello,
>
> I am facing an issue where a select query returning a few hundred rows 
> take a really long time (about 8 seconds) when the result is ordered 
> by a primary key value, and take almost no time (about 0.03 seconds) 
> when ordered by another column.
>
> I traced the query plan for both queries, but as I am no expert I am 
> not sure to understand them very well and I cannot see any way of 
> optimizing the slow query to make it as fast as the fast one.
>
> So my question is: could anyone explain to me why the slow query is 
> slow and more even important: is there a way to make it fast ?
>
> I tried with Derby 10.5 and 10.8 and both exhibits the same behavior.
>
>
> Thank in advance for any input..
>
> schatela
>
>
> Here are the details:
>
> *The database schema looks like this*
>
> CREATE TABLE CERBERUS.UPDATE_63402512 (
> PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY (START WITH 
> 1, INCREMENT BY 1) ,
>     CATEGORY VARCHAR(30000) NOT NULL,
>     TIMESTAMP TIMESTAMP NOT NULL,
>     TIMESTAMP_LONG BIGINT NOT NULL);
>
> CREATE INDEX UPDATE_63402512_TIMESTAMP_IDX ON CERBERUS.UPDATE_63402512 
> (TIMESTAMP);
>
> CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON 
> CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG);
>
> CREATE TABLE CERBERUS.FIELD_63402512 (
>             PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY 
> (START WITH 1, INCREMENT BY 1),
>             NAME VARCHAR(30000) NOT NULL,
>             STR_VALUE VARCHAR(30000),
>             LONG_VALUE  INT,
>             FK_MACHINE_UPDATE INT NOT NULL,
>             CONSTRAINT fk_FIELD_UPDATE_63402512
>              FOREIGN KEY (FK_MACHINE_UPDATE)
>                 REFERENCES CERBERUS.UPDATE_63402512 (PK)
>                 ON DELETE NO ACTION
>                 ON UPDATE NO ACTION);
>
> *the UPDATE table contains about 1.4 million rows and the FIELD table 
> contains a little less than 3 million rows.*
>
> *Then the slow query is as follow:*
>
> select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,
>      MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
>      MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>      MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
> FIELD.NAME <http://FIELD.NAME> FIELD_NAME,
>      FIELD.STR_VALUE FIELD_STR_VALUE,
>      FIELD.LONG_VALUE FIELD_LONG_VALUE
>      fromUPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
>      where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK 
> <http://MACHINE_UPDATE.PK>
>      and TIMESTAMP_LONG <= 1313590989001
>      and TIMESTAMP_LONG >= 1313590969001
>      ORDER BY MACHINE_UPDATE_PK
>
> *And the fast query is:*
>
> select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,
>      MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
>      MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
>      MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
> FIELD.NAME <http://FIELD.NAME> FIELD_NAME,
>      FIELD.STR_VALUE FIELD_STR_VALUE,
>      FIELD.LONG_VALUE FIELD_LONG_VALUE
>  from UPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
>      where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK 
> <http://MACHINE_UPDATE.PK>
>      and TIMESTAMP_LONG <= 1313590989001
>      and TIMESTAMP_LONG >= 1313590969001
>      ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG
>
> *The explain plan of the the slow query is this:*
>
> Statement Name:
>     SQL_CURLH000C1
> Statement Text:
>     select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> 
> MACHINE_UPDATE_PK,MACHINE_UPDATE.CATEGORY 
> MACHINE_UPDATE_CATEGORY,MACHINE_UPDATE.TIMESTAMP 
> MACHINE_UPDATE_TIMESTAMP,MACHINE_UPDATE.TIMESTAMP_LONG 
> MACHINE_UPDATE_TIMESTAMP_LONG,FIELD.NAME <http://FIELD.NAME> 
> FIELD_NAME,FIELD.STR_VALUE FIELD_STR_VALUE,FIELD.LONG_VALUE 
> FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512 
> MACHINE_UPDATE,CERBERUS.FIELD_63402512 FIELD where 
> FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> 
> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 
> 1313590969001  ORDER BY MACHINE_UPDATE_PK
> Parse Time: 0
> Bind Time: 0
> Optimize Time: 15
> Generate Time: 0
> Compile Time: 15
> Execute Time: 8970
> Begin Compilation Timestamp : 2012-01-26 16 
> <tel:2012-01-26%2016>:51:49.838
> End Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:49.853
> Begin Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:49.853
> End Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:58.839
> Statement Execution Plan Text:
> Project-Restrict ResultSet (7):
> Number of opens = 1
> Rows seen = 300
> Rows filtered = 0
> restriction = false
> projection = true
>     constructor time (milliseconds) = 0
>     open time (milliseconds) = 0
>     next time (milliseconds) = 8970
>     close time (milliseconds) = 0
>     restriction time (milliseconds) = 0
>     projection time (milliseconds) = 0
>     optimizer estimated row count:      2836394.00
>     optimizer estimated cost:      8328596.88
>
> Source result set:
>     Nested Loop Join ResultSet:
>     Number of opens = 1
>     Rows seen from the left = 100
>     Rows seen from the right = 300
>     Rows filtered = 0
>     Rows returned = 300
>         constructor time (milliseconds) = 0
>         open time (milliseconds) = 0
>         next time (milliseconds) = 8970
>         close time (milliseconds) = 0
>         optimizer estimated row count:      2836394.00
>         optimizer estimated cost:      8328596.88
>
>     Left result set:
>         Project-Restrict ResultSet (4):
>         Number of opens = 1
>         Rows seen = 1408086
>         Rows filtered = 1407986
>         restriction = true
>         projection = true
>             constructor time (milliseconds) = 0
>             open time (milliseconds) = 0
>             next time (milliseconds) = 8954
>             close time (milliseconds) = 0
>             restriction time (milliseconds) = 314
>             projection time (milliseconds) = 0
>             optimizer estimated row count:       153341.11
>             optimizer estimated cost:      2826935.53
>
>         Source result set:
>             Index Row to Base Row ResultSet for UPDATE_63402512:
>             Number of opens = 1
>             Rows seen = 1408086
>             Columns accessed from heap = {0, 1, 2, 3}
>                 constructor time (milliseconds) = 0
>                 open time (milliseconds) = 0
>                 next time (milliseconds) = 8500
>                 close time (milliseconds) = 0
>                 optimizer estimated row count:       153341.11
>                 optimizer estimated cost:      2826935.53
>
>                 Index Scan ResultSet for UPDATE_63402512 using 
> constraint SQL111123145514350 at read committed isolation level using 
> share row locking chosen by the optimizer
>                 Number of opens = 1
>                 Rows seen = 1408086
>                 Rows filtered = 0
>                 Fetch Size = 1
>                     constructor time (milliseconds) = 0
>                     open time (milliseconds) = 0
>                     next time (milliseconds) = 4744
>                     close time (milliseconds) = 0
>                     next time in milliseconds/row = 0
>
>                 scan information:
>                     Bit set of columns fetched={1}
>                     Number of columns fetched=1
>                     Number of deleted rows visited=0
>                     Number of pages visited=7728
>                     Number of rows qualified=1408086
>                     Number of rows visited=1408086
>                     Scan type=btree
>                     Tree height=3
>                     start position:
>     None
>                     stop position:
>     None
>                     qualifiers:
> None
>                     optimizer estimated row count:       153341.11
>                     optimizer estimated cost:      2826935.53
>
>
>     Right result set:
>         Index Row to Base Row ResultSet for FIELD_63402512:
>         Number of opens = 100
>         Rows seen = 300
>         Columns accessed from heap = {1, 2, 3, 4}
>             constructor time (milliseconds) = 0
>             open time (milliseconds) = 0
>             next time (milliseconds) = 0
>             close time (milliseconds) = 16
>             optimizer estimated row count:      2836394.00
>             optimizer estimated cost:      5501661.34
>
>             Index Scan ResultSet for FIELD_63402512 using constraint 
> fk_FIELD_UPDATE_63402512 at read committed isolation level using share 
> row locking chosen by the optimizer
>             Number of opens = 100
>             Rows seen = 300
>             Rows filtered = 0
>             Fetch Size = 1
>                 constructor time (milliseconds) = 0
>                 open time (milliseconds) = 0
>                 next time (milliseconds) = 0
>                 close time (milliseconds) = 16
>                 next time in milliseconds/row = 0
>
>             scan information:
>                 Bit set of columns fetched=All
>                 Number of columns fetched=2
>                 Number of deleted rows visited=0
>                 Number of pages visited=401
>                 Number of rows qualified=300
>                 Number of rows visited=400
>                 Scan type=btree
>                 Tree height=4
>                 start position:
> >= on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 stop position:
> > on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 qualifiers:
> None
>                 optimizer estimated row count:      2836394.00
>                 optimizer estimated cost:      5501661.34
>
> *and the explain plan of the fast query is:*
>
> Statement Name:
>     SQL_CURLH000C1
> Statement Text:
>     select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> 
> MACHINE_UPDATE_PK,MACHINE_UPDATE.CATEGORY 
> MACHINE_UPDATE_CATEGORY,MACHINE_UPDATE.TIMESTAMP 
> MACHINE_UPDATE_TIMESTAMP,MACHINE_UPDATE.TIMESTAMP_LONG 
> MACHINE_UPDATE_TIMESTAMP_LONG,FIELD.NAME <http://FIELD.NAME> 
> FIELD_NAME,FIELD.STR_VALUE FIELD_STR_VALUE,FIELD.LONG_VALUE 
> FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512 
> MACHINE_UPDATE,CERBERUS.FIELD_63402512 FIELD where 
> FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> 
> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 
> 1313590969001  ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG
> Parse Time: 0
> Bind Time: 15
> Optimize Time: 78
> Generate Time: 0
> Compile Time: 93
> Execute Time: 32
> Begin Compilation Timestamp : 2012-01-26 16 
> <tel:2012-01-26%2016>:50:13.508
> End Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.601
> Begin Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.601
> End Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.648
> Statement Execution Plan Text:
> Project-Restrict ResultSet (6):
> Number of opens = 1
> Rows seen = 300
> Rows filtered = 0
> restriction = false
> projection = true
>     constructor time (milliseconds) = 0
>     open time (milliseconds) = 0
>     next time (milliseconds) = 32
>     close time (milliseconds) = 0
>     restriction time (milliseconds) = 0
>     projection time (milliseconds) = 0
>     optimizer estimated row count:      2836394.00
>     optimizer estimated cost:      5501851.37
>
> Source result set:
>     Nested Loop Join ResultSet:
>     Number of opens = 1
>     Rows seen from the left = 100
>     Rows seen from the right = 300
>     Rows filtered = 0
>     Rows returned = 300
>         constructor time (milliseconds) = 0
>         open time (milliseconds) = 0
>         next time (milliseconds) = 32
>         close time (milliseconds) = 0
>         optimizer estimated row count:      2836394.00
>         optimizer estimated cost:      5501851.37
>
>     Left result set:
>         Index Row to Base Row ResultSet for UPDATE_63402512:
>         Number of opens = 1
>         Rows seen = 100
>         Columns accessed from heap = {0, 1, 2, 3}
>             constructor time (milliseconds) = 0
>             open time (milliseconds) = 0
>             next time (milliseconds) = 0
>             close time (milliseconds) = 0
>             optimizer estimated row count:           97.00
>             optimizer estimated cost:          190.03
>
>             Index Scan ResultSet for UPDATE_63402512 using index 
> UPDATE_63402512_TIMESTAMP_LONG_IDX at read committed isolation level 
> using share row locking chosen by the optimizer
>             Number of opens = 1
>             Rows seen = 100
>             Rows filtered = 0
>             Fetch Size = 1
>                 constructor time (milliseconds) = 0
>                 open time (milliseconds) = 0
>                 next time (milliseconds) = 0
>                 close time (milliseconds) = 0
>                 next time in milliseconds/row = 0
>
>             scan information:
>                 Bit set of columns fetched=All
>                 Number of columns fetched=2
>                 Number of deleted rows visited=0
>                 Number of pages visited=3
>                 Number of rows qualified=100
>                 Number of rows visited=101
>                 Scan type=btree
>                 Tree height=3
>                 start position:
> >= on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 stop position:
> > on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 qualifiers:
> None
>                 optimizer estimated row count:           97.00
>                 optimizer estimated cost:          190.03
>
>
>     Right result set:
>         Index Row to Base Row ResultSet for FIELD_63402512:
>         Number of opens = 100
>         Rows seen = 300
>         Columns accessed from heap = {1, 2, 3, 4}
>             constructor time (milliseconds) = 0
>             open time (milliseconds) = 0
>             next time (milliseconds) = 16
>             close time (milliseconds) = 16
>             optimizer estimated row count:      2836394.00
>             optimizer estimated cost:      5501661.34
>
>             Index Scan ResultSet for FIELD_63402512 using constraint 
> fk_FIELD_UPDATE_63402512 at read committed isolation level using share 
> row locking chosen by the optimizer
>             Number of opens = 100
>             Rows seen = 300
>             Rows filtered = 0
>             Fetch Size = 1
>                 constructor time (milliseconds) = 0
>                 open time (milliseconds) = 0
>                 next time (milliseconds) = 16
>                 close time (milliseconds) = 16
>                 next time in milliseconds/row = 0
>
>             scan information:
>                 Bit set of columns fetched=All
>                 Number of columns fetched=2
>                 Number of deleted rows visited=0
>                 Number of pages visited=401
>                 Number of rows qualified=300
>                 Number of rows visited=400
>                 Scan type=btree
>                 Tree height=4
>                 start position:
> >= on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 stop position:
> > on first 1 column(s).
>     Ordered null semantics on the following columns:
> 0
>                 qualifiers:
> None
>                 optimizer estimated row count:      2836394.00
>                 optimizer estimated cost:      5501661.34
>
>


Mime
View raw message