Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 12844 invoked from network); 5 Mar 2010 17:57:03 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 5 Mar 2010 17:57:03 -0000 Received: (qmail 49051 invoked by uid 500); 5 Mar 2010 17:56:49 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 49015 invoked by uid 500); 5 Mar 2010 17:56:49 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 49008 invoked by uid 99); 5 Mar 2010 17:56:49 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 05 Mar 2010 17:56:49 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 05 Mar 2010 17:56:47 +0000 Received: from brutus.apache.org (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 96FBD234C4D4 for ; Fri, 5 Mar 2010 17:56:27 +0000 (UTC) Message-ID: <1131412396.102411267811787617.JavaMail.jira@brutus.apache.org> Date: Fri, 5 Mar 2010 17:56:27 +0000 (UTC) From: "Tony Brusseau (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Issue Comment Edited: (DERBY-4422) Extremely slow subqueries when subquerying on strings In-Reply-To: <1043357842.1256319539373.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/DERBY-4422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841949#action_12841949 ] Tony Brusseau edited comment on DERBY-4422 at 3/5/10 5:54 PM: -------------------------------------------------------------- Per the documentation on materialization: http://db.apache.org/derby/docs/10.5/tuning/ctuntransform25857.html I rewrote the query from: SELECT DISTINCT INTERNAL_ID FROM OC_CONCEPTS WHERE (INTERNAL_ID IN (SELECT DISTINCT CONCEPT_ID FROM OC_CONCEPT_STRINGS WHERE NL_LC = 'dash')) and transformed it into: SELECT DISTINCT INTERNAL_ID FROM OC_CONCEPTS, (SELECT DISTINCT CONCEPT_ID FROM OC_CONCEPT_STRINGS WHERE NL_LC = 'dash') X WHERE (OC_CONCEPTS.INTERNAL_ID = X.CONCEPT_ID) Now the query runs blindingly fast. I also made a similar change to the bigger query that I want to do, and got the same dramatic speedup. I guess the real bug here is that the optimizer is not automatically optimizing the queries in such a manner. There may be a secondary issue noted by Bryan where minor changes in the query yields dramatically different optimization strategy. was (Author: apb): Per the documentation on materialization: http://db.apache.org/derby/docs/10.5/tuning/ctuntransform25857.html I rewrote the query from: SELECT DISTINCT INTERNAL_ID FROM OC_CONCEPTS WHERE (INTERNAL_ID IN (SELECT DISTINCT CONCEPT_ID FROM OC_CONCEPT_STRINGS WHERE NL_LC = 'dash')) and transformed it into: SELECT DISTINCT INTERNAL_ID FROM OC_CONCEPTS, (SELECT DISTINCT CONCEPT_ID FROM OC_CONCEPT_STRINGS WHERE NL_LC = 'dash') X WHERE (OC_CONCEPTS.INTERNAL_ID = X.CONCEPT_ID) Now the query runs blindingly fast. I also made a similar change to the bigger query that I want to do, and got the same dramatic speedup. I guess the real bug here is that the optimizer is not automatically optimizing the queries in such a manner. There may be a secondary issue noted by Bryan where minor changes in the query yields dramatically different optimizations. > Extremely slow subqueries when subquerying on strings > ----------------------------------------------------- > > Key: DERBY-4422 > URL: https://issues.apache.org/jira/browse/DERBY-4422 > Project: Derby > Issue Type: Bug > Affects Versions: 10.5.3.0 > Environment: Suse Unix 10.3. > Reporter: Tony Brusseau > Attachments: badqp.txt, derby.log, fixed-time-derby.log, goodqp.txt, SQLFile1.sqlaa.gz, SQLFile1.sqlab.gz, SQLFile2.sqlaa.gz, SQLFile2.sqlab.gz, SQLFile2.sqlac.gz, tmp2.sql, unlimited-time-derby.log > > > SELECT DISTINCT C.INTERNAL_ID > FROM OC_CONCEPTS C > WHERE > (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS WHERE OCS.CONCEPT_ID=1)) > Both queries above are trivial and they are both acting on fully indexed primary key columns. This query runs just fine. > However, when I make the small modification of searching on an indexed text column in the sub query, it takes over 6 seconds to run, even though both queries independently take < 1ms run. > SELECT DISTINCT C.INTERNAL_ID > FROM OC_CONCEPTS C > WHERE > (C.INTERNAL_ID IN (SELECT DISTINCT OCS.CONCEPT_ID FROM OC_CONCEPT_STRINGS OCS WHERE OCS.NL_LC = 'dash')) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.