db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paulo Jesus" <paulsch...@gmail.com>
Subject Re: Question on query optimization
Date Sat, 05 May 2007 17:06:58 GMT
2007/5/4, Army <qozinx@gmail.com>:
> Paulo Jesus wrote:
> >
> > I have this query
> >
> > SELECT J.i_id,J.i_thumbnail from item Jdb-derby-10.2.2.0-bin t where J.i_id in ((SELECT
> > i_related1 FROM item WHERE i_id=3556) union (SELECT i_related2 FROM
> > item WHERE i_id=3556) union (SELECT i_related3 FROM item WHERE
> > i_id=3556) union (SELECT i_related4 FROM item WHERE i_id=3556) union
> > (SELECT i_related5 FROM item WHERE i_id=3556));
> >
> > ... does anybody see any other possibility on rewriting this query
> > besides those presented below ?
>
> Can you indicate which version of Derby you are using?

I'm using a derby network framework  version 10.2.2.0

> Does Derby use the index for the rewrites that you show below?

I can't say now. The rewrites suggestion had been from the person
witch I'm working it.

>
> > rs = SELECT i_related1,i_related2,i_related3,i_related4 FROM item
> > WHERE i_id=3556
> > SELECT J.i_id,J.i_thumbnail from item J where J.i_id in (rs)
> >
> > or
> >
> > tmp_table = SELECT i_related1 as col FROM item WHERE i_id=3556 union
> > SELECT i_related2 as col FROM item WHERE i_id=3556 ...
> > SELECT J.i_id,J.i_thumbnail from item, tmp_table where i_id = col;
> > (Column i_id has a unique index thus guaranteeing that the inner joint
> > does not change the result set cardinality.)
>
> If the above queries lead to the desired behavior (i.e. Derby uses the index),
> then you might want to try something like:
>
> SELECT J.i_id, J.i_thumbnail FROM
>      item J,
>      (SELECT i_related1 as col FROM item WHERE i_id=3556
>          UNION SELECT i_related2 FROM item WHERE i_id=3556
>          UNION SELECT i_related3 FROM item WHERE i_id=3556
>          UNION SELECT i_related4 FROM item WHERE i_id=3556
>          UNION SELECT i_related5 FROM item WHERE i_id=3556
>      ) X (id)
>    WHERE (J.i_id = X.id)
>
> This is similar to the "tmp_table" approach, except that you are putting the
> "temp" query directly into the FROM list instead of using a separate statement.
>   With this statement the optimizer may attempt to materialize the ITEM table
> into memory and then do a hash join with it.  If you don't want that then you
> can use optimizer directives to force use of the index and/or nested loop join
> (assuming it doesn't choose that already).  For ex:
>
> SELECT J.i_id, J.i_thumbnail FROM
>      item J --DERBY-PROPERTIES index=I_ID
>      , (SELECT i_related1 as col FROM item WHERE i_id=3556
>          UNION SELECT i_related2 FROM item WHERE i_id=3556
>          UNION SELECT i_related3 FROM item WHERE i_id=3556
>          UNION SELECT i_related4 FROM item WHERE i_id=3556
>          UNION SELECT i_related5 FROM item WHERE i_id=3556
>      ) X (id)
>    WHERE (J.i_id = X.id)

Your suggestion work without forcing optimizer.
But this optimizer directive will be very useful.

> That said, though, I wonder if this type of query is an indication that the
> database schema should be re-examined?  I.e. would it make sense to store the
> "related" ids in a separate table somehow so that you can just join directly
> with them?

This is from TPC-W test. We are trying to understand the performance
problems in derby.

PJ


> Army
>
>

Mime
View raw message