db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Brett Wooldridge <brett.wooldri...@gmail.com>
Subject Re: SELECT query takes 5 secs, what can I do?
Date Wed, 16 Sep 2009 01:42:54 GMT
The original query:

select * from Message m where
   m.messageId='<7997716ED1AF3D47A35D74FA2CB610920255303F@somedomain.com>'
   and m.id != 933927 and m.archived=1

and the sub-select query:

SELECT * from Message where ID IN (SELECT ID FROM Message  where
    messageId = '<7997716ED1AF3D47A35D74FA2CB61092E2A619@somedomaion.local
>')
    and archived=1 and id!=987452

are equivalent with respect to the database.  My comment regarding the 'id
!= 987452' clause was only with respect to an index.  It's inclusion in the
query (original) does not impact performance in a negative way.

The optimizer is going to choose to use the index (on messageId), and will
perform that select first, then it will (likely) join that sub-result
against the other two clauses (archived=1 and id != 987452) -- making it
equivalent to your second query.

Do you know how to get a query plan?
http://wiki.apache.org/db-derby/PerformanceDiagnosisTips

The portion of the query with the messageId should have high-specificity due
to the relative uniqueness of the message within the context of all
messages.  For example, you say the table has 1 million rows, and a given
messsage in an organization of 200 users would consume 200 rows, so as you
can see the specificity is very high.  I would not expect that kind of
select to be taking 5 seconds, but rather milliseconds.  It is likely that
your index statistics are seriously out of date (see
http://issues.apache.org/jira/browse/DERBY-269).  One user reported that
updating statistics took a 22 minute query down to less than one second!

You can force Derby to update statistics with this command:

alter table <table-name> compress [sequential]

Note this command itself might take a long time -- dozens of minutes -- but
in a system like yours you could get away with running it once or month or
so at some off-peak time.

If you have a test database in which your query is taking 5 seconds, attempt
to run the above alter table, and then re-run your query.  If your query
times are still low, post the query plan for the query here and we can take
a look at it.

Brett

Mime
View raw message