db-derby-dev mailing list archives

From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4422) Extremely slow subqueries when subquerying on strings
Date Tue, 10 Nov 2009 16:05:30 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775923#action_12775923 ]

Bryan Pendleton commented on DERBY-4422:
----------------------------------------

Does this indicate a weakness in the Derby query-flattening algorithm? I read through
http://db.apache.org/derby/docs/10.5/tuning/ctuntransform36368.html
and was unsure whether Derby should have been able to flatten this query
automatically or not. It seems like the crucial bit is:

  "There is a uniqueness condition that ensures that the subquery does not introduce 
   any duplicates if it is flattened into the outer query block"

But I think that the DISTINCT in the *outer* query should have provided such a guarantee.

I think that the DISTINCT in the *subquery* is unnecessary, due to:
http://db.apache.org/derby/docs/10.5/tuning/ctuntransform867165.html
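To make the flattening question concrete, here is a minimal sketch of the equivalence being discussed. It uses Python's built-in sqlite3 module purely as a runnable stand-in, not Derby, so it says nothing about Derby's optimizer or timings. The table layout (the ID column, the column types, the index name) and the sample rows are assumptions made for illustration; only the table and column names in the queries come from the report. It compares the IN-subquery form with a hand-flattened join, with and without the outer DISTINCT.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OC_CONCEPTS (INTERNAL_ID INTEGER PRIMARY KEY);
CREATE TABLE OC_CONCEPT_STRINGS (
    ID INTEGER PRIMARY KEY,          -- assumed surrogate key
    CONCEPT_ID INTEGER NOT NULL,
    NL_LC TEXT NOT NULL
);
CREATE INDEX OCS_NL_LC ON OC_CONCEPT_STRINGS (NL_LC);
INSERT INTO OC_CONCEPTS VALUES (1), (2), (3);
-- Concept 1 has two matching strings, so a plain join yields a duplicate.
INSERT INTO OC_CONCEPT_STRINGS (CONCEPT_ID, NL_LC) VALUES
    (1, 'dash'), (1, 'dash'), (2, 'dash'), (3, 'dot');
""")

# The slow query from the report, unchanged.
subquery_form = """
SELECT DISTINCT C.INTERNAL_ID
FROM OC_CONCEPTS C
WHERE C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID
                        FROM OC_CONCEPT_STRINGS OCS
                        WHERE OCS.NL_LC = 'dash')
"""

# Hand-flattened join; the outer DISTINCT removes join duplicates.
join_form = """
SELECT DISTINCT C.INTERNAL_ID
FROM OC_CONCEPTS C, OC_CONCEPT_STRINGS OCS
WHERE C.INTERNAL_ID = OCS.CONCEPT_ID AND OCS.NL_LC = 'dash'
"""

# Same join without DISTINCT, to show the duplicates that flattening can introduce.
join_no_distinct = join_form.replace("SELECT DISTINCT", "SELECT")

subquery_rows = sorted(conn.execute(subquery_form).fetchall())
join_rows = sorted(conn.execute(join_form).fetchall())
dup_rows = sorted(conn.execute(join_no_distinct).fetchall())

print(subquery_rows, join_rows, dup_rows)
```

On this sample data the two DISTINCT forms return the same rows, while the join without DISTINCT repeats concept 1. That is the duplicate problem the uniqueness condition guards against, and the reason the outer DISTINCT looks like a sufficient guarantee here.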


> Extremely slow subqueries when subquerying on strings
> -----------------------------------------------------
>
>                 Key: DERBY-4422
>                 URL: https://issues.apache.org/jira/browse/DERBY-4422
>             Project: Derby
>          Issue Type: Bug
>    Affects Versions: 10.5.3.0
>         Environment: Suse Unix 10.3.
>            Reporter: Tony Brusseau
>         Attachments: badqp.txt, derby.log, fixed-time-derby.log, goodqp.txt, SQLFile1.sqlaa.gz,
>                      SQLFile1.sqlab.gz, SQLFile2.sqlaa.gz, SQLFile2.sqlab.gz, SQLFile2.sqlac.gz,
>                      tmp2.sql, unlimited-time-derby.log
>
>
> SELECT DISTINCT C.INTERNAL_ID 
> FROM OC_CONCEPTS C
> WHERE
>    (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS
>                       WHERE OCS.CONCEPT_ID=1))
> Both queries above are trivial, and both act on fully indexed primary key
> columns. This query runs just fine.
> However, when I make the small modification of searching on an indexed text column in
> the subquery, it takes over 6 seconds to run, even though both queries independently take
> < 1ms to run.
> SELECT DISTINCT C.INTERNAL_ID 
> FROM OC_CONCEPTS C
> WHERE
>    (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS
>                       WHERE OCS.NL_LC = 'dash'))

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

