db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4422) Extremely slow subqueries when subquerying on strings
Date Mon, 08 Mar 2010 17:25:27 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842743#action_12842743
] 

Knut Anders Hatlen commented on DERBY-4422:
-------------------------------------------

I put a break point in SubqueryNode.preprocess() and checked what happened during compilation
of the query that used the IN operator. It turns out that it does actually flatten the query,
but only to an exists join and not to a normal join. When I hard-coded that method to pick
a normal join instead of an exists join the query ran just as fast as the manually written
join above. So I guess what we need to find out is why the exists join is so much slower than
the normal join in this case.

(I think the optimizer is correct here in picking an exists join instead of a normal join.
My understanding from reading the tuning guide and the comments in the code is that flattening
to a normal join would only work correctly if there were a unique constraint or similar on
OC_CONCEPT_STRINGS.CONCEPT_ID.)

> Extremely slow subqueries when subquerying on strings
> -----------------------------------------------------
>
>                 Key: DERBY-4422
>                 URL: https://issues.apache.org/jira/browse/DERBY-4422
>             Project: Derby
>          Issue Type: Bug
>    Affects Versions: 10.5.3.0
>         Environment: Suse Unix 10.3.
>            Reporter: Tony Brusseau
>         Attachments: badqp.txt, derby.log, fixed-time-derby.log, goodqp.txt, SQLFile1.sqlaa.gz,
SQLFile1.sqlab.gz, SQLFile2.sqlaa.gz, SQLFile2.sqlab.gz, SQLFile2.sqlac.gz, tmp2.sql, unlimited-time-derby.log
>
>
> SELECT DISTINCT C.INTERNAL_ID 
> FROM OC_CONCEPTS C
> WHERE
>    (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS WHERE
OCS.CONCEPT_ID=1))
> Both queries above are trivial and they are both acting on fully indexed primary key
columns. This query runs just fine. 
> However, when I make the small modification of searching on an indexed text column in
the sub query, it takes over 6 seconds to run, even though both queries independently take
< 1ms run.
> SELECT DISTINCT C.INTERNAL_ID 
> FROM OC_CONCEPTS C
> WHERE
>    (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS WHERE
OCS.NL_LC = 'dash'))

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message