Hello Rick,

Thank you very much for your answer.
Some time ago I already tried to regenerate the statistics of the UPDATE table only (as both the order by and the timestamp restriction was on that table, I thought that if there was a problem with the statistics, it should be on that table) but without effect.
BUT this time I also regenerate the statistics of the FIELD table, and that worked ! Now the slow query is as fast as the fast one.

I also tried to create the index on both columns (TIMESTAMP_LONG and PK) but without effect.

And finally adding an optimizer override as follow:

select MACHINE_UPDATE.PK MACHINE_UPDATE_PK,
     MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
     MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
     MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
     FIELD.NAME FIELD_NAME,
     FIELD.STR_VALUE FIELD_STR_VALUE,
     FIELD.LONG_VALUE FIELD_LONG_VALUE
     from UPDATE_63402512 MACHINE_UPDATE --DERBY-PROPERTIES index=
UPDATE_63402512_TIMESTAMP_LONG_IDX
             , FIELD_63402512 FIELD
     where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK
     and TIMESTAMP_LONG <= 1313590989001
     and TIMESTAMP_LONG >= 1313590969001
     ORDER BY MACHINE_UPDATE_PK


works as well, the query plan is the same that the plan of the fast query.

So for now my problem is solved. Than you very much.

But one small question remains, do you think that it is enough to run SYSCS_UPDATE_STATIS
TICS only once, or should I plan to run this procedure at regular interval (let's say once a week or once a month) in case the statistics are not correctly updated as new rows are inserted in the database ?

Simon


On Fri, Jan 27, 2012 at 4:05 PM, Rick Hillegas <rick.hillegas@oracle.com> wrote:
Hi Simon,

I am not an expert in how the optimizer chooses query plans. Hopefully, other people will be able to provide more insight.

>From the second plan, it is clear that the optimizer understands that the timestamp restriction will qualify only 97 rows. But the optimizer is unable to use this information to predict that the join will produce a small number of rows. The optimizer estimates that, despite the timestamp restriction, the join will produce 2836394 rows. That is probably why the optimizer rejects putting a sort on top of the second plan when considering plans for the first query, the one with the ORDER BY clause.

Sometimes these problems are fixed by regenerating the statistics used by the optimizer. You can regenerate statistics using the SYSCS_UTIL.SYSCS_UPDATE_STATISTICS procedure (see the section dealing with this procedure in the Derby Reference Guide).

If that doesn't work, please let us know. Logging a bug against the optimizer will help build the case for improving the optimizer. The following other solutions may help:

1) Try adding the primary key to the end of the timestamp index signature:

CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG, PK);

2) Force the first query to use the indexes chosen by the second query by specifying optimizer overrides. See the section on optimizer overrides in the Tuning Guide.

Hope this helps,
-Rick


On 1/27/12 12:59 AM, Simon Chatelain wrote:
Hello,

I am facing an issue where a select query returning a few hundred rows take a really long time (about 8 seconds) when the result is ordered by a primary key value, and take almost no time (about 0.03 seconds) when ordered by another column.

I traced the query plan for both queries, but as I am no expert I am not sure to understand them very well and I cannot see any way of optimizing the slow query to make it as fast as the fast one.

So my question is: could anyone explain to me why the slow query is slow and more even important: is there a way to make it fast ?

I tried with Derby 10.5 and 10.8 and both exhibits the same behavior.


Thank in advance for any input..

schatela


Here are the details:

*The database schema looks like this*


CREATE TABLE CERBERUS.UPDATE_63402512 (
PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) ,
   CATEGORY VARCHAR(30000) NOT NULL,
   TIMESTAMP TIMESTAMP NOT NULL,
   TIMESTAMP_LONG BIGINT NOT NULL);

CREATE INDEX UPDATE_63402512_TIMESTAMP_IDX ON CERBERUS.UPDATE_63402512 (TIMESTAMP);

CREATE INDEX UPDATE_63402512_TIMESTAMP_LONG_IDX ON CERBERUS.UPDATE_63402512 (TIMESTAMP_LONG);

CREATE TABLE CERBERUS.FIELD_63402512 (
           PK INT NOT NULL PRIMARY KEY GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1),
           NAME VARCHAR(30000) NOT NULL,
           STR_VALUE VARCHAR(30000),
           LONG_VALUE  INT,
           FK_MACHINE_UPDATE INT NOT NULL,
           CONSTRAINT fk_FIELD_UPDATE_63402512
            FOREIGN KEY (FK_MACHINE_UPDATE)
               REFERENCES CERBERUS.UPDATE_63402512 (PK)
               ON DELETE NO ACTION
               ON UPDATE NO ACTION);

*the UPDATE table contains about 1.4 million rows and the FIELD table contains a little less than 3 million rows.*

*Then the slow query is as follow:*

select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,

    MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
    MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
    MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
FIELD.NAME <http://FIELD.NAME> FIELD_NAME,

    FIELD.STR_VALUE FIELD_STR_VALUE,
    FIELD.LONG_VALUE FIELD_LONG_VALUE
    fromUPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
    where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK>

    and TIMESTAMP_LONG <= 1313590989001
    and TIMESTAMP_LONG >= 1313590969001
    ORDER BY MACHINE_UPDATE_PK

*And the fast query is:*

select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,

    MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,
    MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,
    MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,
FIELD.NAME <http://FIELD.NAME> FIELD_NAME,

    FIELD.STR_VALUE FIELD_STR_VALUE,
    FIELD.LONG_VALUE FIELD_LONG_VALUE
 from UPDATE_63402512 MACHINE_UPDATE, FIELD_63402512 FIELD
    where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK>

    and TIMESTAMP_LONG <= 1313590989001
    and TIMESTAMP_LONG >= 1313590969001
    ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG

*The explain plan of the the slow query is this:*


Statement Name:
   SQL_CURLH000C1
Statement Text:
   select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,FIELD.NAME <http://FIELD.NAME> FIELD_NAME,FIELD.STR_VALUE FIELD_STR_VALUE,FIELD.LONG_VALUE FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512 MACHINE_UPDATE,CERBERUS.FIELD_63402512 FIELD where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 1313590969001  ORDER BY MACHINE_UPDATE_PK

Parse Time: 0
Bind Time: 0
Optimize Time: 15
Generate Time: 0
Compile Time: 15
Execute Time: 8970
Begin Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:49.838
End Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:49.853
Begin Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:49.853
End Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:51:58.839

Statement Execution Plan Text:
Project-Restrict ResultSet (7):
Number of opens = 1
Rows seen = 300
Rows filtered = 0
restriction = false
projection = true
   constructor time (milliseconds) = 0
   open time (milliseconds) = 0
   next time (milliseconds) = 8970
   close time (milliseconds) = 0
   restriction time (milliseconds) = 0
   projection time (milliseconds) = 0
   optimizer estimated row count:      2836394.00
   optimizer estimated cost:      8328596.88

Source result set:
   Nested Loop Join ResultSet:
   Number of opens = 1
   Rows seen from the left = 100
   Rows seen from the right = 300
   Rows filtered = 0
   Rows returned = 300
       constructor time (milliseconds) = 0
       open time (milliseconds) = 0
       next time (milliseconds) = 8970
       close time (milliseconds) = 0
       optimizer estimated row count:      2836394.00
       optimizer estimated cost:      8328596.88

   Left result set:
       Project-Restrict ResultSet (4):
       Number of opens = 1
       Rows seen = 1408086
       Rows filtered = 1407986
       restriction = true
       projection = true
           constructor time (milliseconds) = 0
           open time (milliseconds) = 0
           next time (milliseconds) = 8954
           close time (milliseconds) = 0
           restriction time (milliseconds) = 314
           projection time (milliseconds) = 0
           optimizer estimated row count:       153341.11
           optimizer estimated cost:      2826935.53

       Source result set:
           Index Row to Base Row ResultSet for UPDATE_63402512:
           Number of opens = 1
           Rows seen = 1408086
           Columns accessed from heap = {0, 1, 2, 3}
               constructor time (milliseconds) = 0
               open time (milliseconds) = 0
               next time (milliseconds) = 8500
               close time (milliseconds) = 0
               optimizer estimated row count:       153341.11
               optimizer estimated cost:      2826935.53

               Index Scan ResultSet for UPDATE_63402512 using constraint SQL111123145514350 at read committed isolation level using share row locking chosen by the optimizer
               Number of opens = 1
               Rows seen = 1408086
               Rows filtered = 0
               Fetch Size = 1
                   constructor time (milliseconds) = 0
                   open time (milliseconds) = 0
                   next time (milliseconds) = 4744
                   close time (milliseconds) = 0
                   next time in milliseconds/row = 0

               scan information:
                   Bit set of columns fetched={1}
                   Number of columns fetched=1
                   Number of deleted rows visited=0
                   Number of pages visited=7728
                   Number of rows qualified=1408086
                   Number of rows visited=1408086
                   Scan type=btree
                   Tree height=3
                   start position:
   None
                   stop position:
   None
                   qualifiers:
None
                   optimizer estimated row count:       153341.11
                   optimizer estimated cost:      2826935.53


   Right result set:
       Index Row to Base Row ResultSet for FIELD_63402512:
       Number of opens = 100
       Rows seen = 300
       Columns accessed from heap = {1, 2, 3, 4}
           constructor time (milliseconds) = 0
           open time (milliseconds) = 0
           next time (milliseconds) = 0
           close time (milliseconds) = 16
           optimizer estimated row count:      2836394.00
           optimizer estimated cost:      5501661.34

           Index Scan ResultSet for FIELD_63402512 using constraint fk_FIELD_UPDATE_63402512 at read committed isolation level using share row locking chosen by the optimizer
           Number of opens = 100
           Rows seen = 300
           Rows filtered = 0
           Fetch Size = 1
               constructor time (milliseconds) = 0
               open time (milliseconds) = 0
               next time (milliseconds) = 0
               close time (milliseconds) = 16
               next time in milliseconds/row = 0

           scan information:
               Bit set of columns fetched=All
               Number of columns fetched=2
               Number of deleted rows visited=0
               Number of pages visited=401
               Number of rows qualified=300
               Number of rows visited=400
               Scan type=btree
               Tree height=4
               start position:
>= on first 1 column(s).
   Ordered null semantics on the following columns:
0
               stop position:
> on first 1 column(s).
   Ordered null semantics on the following columns:
0
               qualifiers:
None
               optimizer estimated row count:      2836394.00
               optimizer estimated cost:      5501661.34

*and the explain plan of the fast query is:*


Statement Name:
   SQL_CURLH000C1
Statement Text:
   select MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> MACHINE_UPDATE_PK,MACHINE_UPDATE.CATEGORY MACHINE_UPDATE_CATEGORY,MACHINE_UPDATE.TIMESTAMP MACHINE_UPDATE_TIMESTAMP,MACHINE_UPDATE.TIMESTAMP_LONG MACHINE_UPDATE_TIMESTAMP_LONG,FIELD.NAME <http://FIELD.NAME> FIELD_NAME,FIELD.STR_VALUE FIELD_STR_VALUE,FIELD.LONG_VALUE FIELD_LONG_VALUE from CERBERUS.UPDATE_63402512 MACHINE_UPDATE,CERBERUS.FIELD_63402512 FIELD where FIELD.FK_MACHINE_UPDATE = MACHINE_UPDATE.PK <http://MACHINE_UPDATE.PK> and TIMESTAMP_LONG <= 1313590989001 and TIMESTAMP_LONG >= 1313590969001  ORDER BY MACHINE_UPDATE_TIMESTAMP_LONG

Parse Time: 0
Bind Time: 15
Optimize Time: 78
Generate Time: 0
Compile Time: 93
Execute Time: 32
Begin Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.508
End Compilation Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.601
Begin Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.601
End Execution Timestamp : 2012-01-26 16 <tel:2012-01-26%2016>:50:13.648

Statement Execution Plan Text:
Project-Restrict ResultSet (6):
Number of opens = 1
Rows seen = 300
Rows filtered = 0
restriction = false
projection = true
   constructor time (milliseconds) = 0
   open time (milliseconds) = 0
   next time (milliseconds) = 32
   close time (milliseconds) = 0
   restriction time (milliseconds) = 0
   projection time (milliseconds) = 0
   optimizer estimated row count:      2836394.00
   optimizer estimated cost:      5501851.37

Source result set:
   Nested Loop Join ResultSet:
   Number of opens = 1
   Rows seen from the left = 100
   Rows seen from the right = 300
   Rows filtered = 0
   Rows returned = 300
       constructor time (milliseconds) = 0
       open time (milliseconds) = 0
       next time (milliseconds) = 32
       close time (milliseconds) = 0
       optimizer estimated row count:      2836394.00
       optimizer estimated cost:      5501851.37

   Left result set:
       Index Row to Base Row ResultSet for UPDATE_63402512:
       Number of opens = 1
       Rows seen = 100
       Columns accessed from heap = {0, 1, 2, 3}
           constructor time (milliseconds) = 0
           open time (milliseconds) = 0
           next time (milliseconds) = 0
           close time (milliseconds) = 0
           optimizer estimated row count:           97.00
           optimizer estimated cost:          190.03

           Index Scan ResultSet for UPDATE_63402512 using index UPDATE_63402512_TIMESTAMP_LONG_IDX at read committed isolation level using share row locking chosen by the optimizer
           Number of opens = 1
           Rows seen = 100
           Rows filtered = 0
           Fetch Size = 1
               constructor time (milliseconds) = 0
               open time (milliseconds) = 0
               next time (milliseconds) = 0
               close time (milliseconds) = 0
               next time in milliseconds/row = 0

           scan information:
               Bit set of columns fetched=All
               Number of columns fetched=2
               Number of deleted rows visited=0
               Number of pages visited=3
               Number of rows qualified=100
               Number of rows visited=101
               Scan type=btree
               Tree height=3
               start position:
>= on first 1 column(s).
   Ordered null semantics on the following columns:
0
               stop position:
> on first 1 column(s).
   Ordered null semantics on the following columns:
0
               qualifiers:
None
               optimizer estimated row count:           97.00
               optimizer estimated cost:          190.03


   Right result set:
       Index Row to Base Row ResultSet for FIELD_63402512:
       Number of opens = 100
       Rows seen = 300
       Columns accessed from heap = {1, 2, 3, 4}
           constructor time (milliseconds) = 0
           open time (milliseconds) = 0
           next time (milliseconds) = 16
           close time (milliseconds) = 16
           optimizer estimated row count:      2836394.00
           optimizer estimated cost:      5501661.34

           Index Scan ResultSet for FIELD_63402512 using constraint fk_FIELD_UPDATE_63402512 at read committed isolation level using share row locking chosen by the optimizer
           Number of opens = 100
           Rows seen = 300
           Rows filtered = 0
           Fetch Size = 1
               constructor time (milliseconds) = 0
               open time (milliseconds) = 0
               next time (milliseconds) = 16
               close time (milliseconds) = 16
               next time in milliseconds/row = 0

           scan information:
               Bit set of columns fetched=All
               Number of columns fetched=2
               Number of deleted rows visited=0
               Number of pages visited=401
               Number of rows qualified=300
               Number of rows visited=400
               Scan type=btree
               Tree height=4
               start position:
>= on first 1 column(s).
   Ordered null semantics on the following columns:
0
               stop position:
> on first 1 column(s).
   Ordered null semantics on the following columns:
0
               qualifiers:
None
               optimizer estimated row count:      2836394.00
               optimizer estimated cost:      5501661.34