db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Army <qoz...@gmail.com>
Subject Re: Question on query optimization
Date Fri, 04 May 2007 18:54:36 GMT
Paulo Jesus wrote:
> 
> I have this query
> 
> SELECT J.i_id,J.i_thumbnail from item J where J.i_id in ((SELECT
> i_related1 FROM item WHERE i_id=3556) union (SELECT i_related2 FROM
> item WHERE i_id=3556) union (SELECT i_related3 FROM item WHERE
> i_id=3556) union (SELECT i_related4 FROM item WHERE i_id=3556) union
> (SELECT i_related5 FROM item WHERE i_id=3556));
> 
> ... does anybody see any other possibility on rewriting this query
> besides those presented below ?

Can you indicate which version of Derby you are using?  Does Derby use the index 
for the rewrites that you show below?

> rs = SELECT i_related1,i_related2,i_related3,i_related4 FROM item
> WHERE i_id=3556
> SELECT J.i_id,J.i_thumbnail from item J where J.i_id in (rs)
> 
> or
> 
> tmp_table = SELECT i_related1 as col FROM item WHERE i_id=3556 union
> SELECT i_related2 as col FROM item WHERE i_id=3556 ...
> SELECT J.i_id,J.i_thumbnail from item, tmp_table where i_id = col;
> (Column i_id has a unique index thus guaranteeing that the inner joint
> does not change the result set cardinality.)

If the above queries lead to the desired behavior (i.e. Derby uses the index), 
then you might want to try something like:

SELECT J.i_id, J.i_thumbnail FROM
     item J,
     (SELECT i_related1 as col FROM item WHERE i_id=3556
         UNION SELECT i_related2 FROM item WHERE i_id=3556
         UNION SELECT i_related3 FROM item WHERE i_id=3556
         UNION SELECT i_related4 FROM item WHERE i_id=3556
         UNION SELECT i_related5 FROM item WHERE i_id=3556
     ) X (id)
   WHERE (J.i_id = X.id)

This is similar to the "tmp_table" approach, except that you are putting the 
"temp" query directly into the FROM list instead of using a separate statement. 
  With this statement the optimizer may attempt to materialize the ITEM table 
into memory and then do a hash join with it.  If you don't want that then you 
can use optimizer directives to force use of the index and/or nested loop join 
(assuming it doesn't choose that already).  For ex:

SELECT J.i_id, J.i_thumbnail FROM
     item J --DERBY-PROPERTIES index=I_ID
     , (SELECT i_related1 as col FROM item WHERE i_id=3556
         UNION SELECT i_related2 FROM item WHERE i_id=3556
         UNION SELECT i_related3 FROM item WHERE i_id=3556
         UNION SELECT i_related4 FROM item WHERE i_id=3556
         UNION SELECT i_related5 FROM item WHERE i_id=3556
     ) X (id)
   WHERE (J.i_id = X.id)

That said, though, I wonder if this type of query is an indication that the 
database schema should be re-examined?  I.e. would it make sense to store the 
"related" ids in a separate table somehow so that you can just join directly 
with them?

Army


Mime
View raw message