Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 34033 invoked from network); 4 Feb 2008 09:10:20 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 4 Feb 2008 09:10:20 -0000 Received: (qmail 8342 invoked by uid 500); 4 Feb 2008 09:10:11 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 8315 invoked by uid 500); 4 Feb 2008 09:10:11 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 8304 invoked by uid 99); 4 Feb 2008 09:10:11 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 04 Feb 2008 01:10:11 -0800 X-ASF-Spam-Status: No, hits=-1.0 required=10.0 tests=RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy) Received: from [192.18.6.21] (HELO gmp-eb-mail-1.sun.com) (192.18.6.21) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 04 Feb 2008 09:09:40 +0000 Received: from fe-emea-09.sun.com (gmp-eb-lb-2-fe1.eu.sun.com [192.18.6.10]) by gmp-eb-mail-1.sun.com (8.13.7+Sun/8.12.9) with ESMTP id m1499j8A020743 for ; Mon, 4 Feb 2008 09:09:45 GMT Received: from conversion-daemon.fe-emea-09.sun.com by fe-emea-09.sun.com (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) id <0JVP00301HW4NK00@fe-emea-09.sun.com> (original mail from Knut.Hatlen@Sun.COM) for derby-user@db.apache.org; Mon, 04 Feb 2008 09:09:45 +0000 (GMT) Received: from localhost ([129.159.112.134]) by fe-emea-09.sun.com (Sun Java System Messaging Server 6.2-8.04 (built Feb 28 2007)) with ESMTPSA id <0JVP0069RIS596G0@fe-emea-09.sun.com> for derby-user@db.apache.org; Mon, 04 Feb 2008 09:09:42 +0000 (GMT) Date: Mon, 04 Feb 2008 10:09:41 +0100 From: Knut Anders Hatlen Subject: Re: Nested query in conjunction with ordering - unreasonably slow In-reply-to: <200802041008.33862.daniel@nuix.com> Sender: Knut.Hatlen@Sun.COM To: Derby Discussion Message-id: Organization: Sun Microsystems MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-transfer-encoding: 7BIT References: <200802041008.33862.daniel@nuix.com> User-Agent: Gnus/5.110007 (No Gnus v0.7) Emacs/22.1 (usg-unix-v) X-Virus-Checked: Checked by ClamAV on apache.org Daniel Noll writes: > Hi all. > > I've been trying to determine the cause for a slowness in a certain query. > I've managed to strip everything away and get to the core of what's making it > slow, but I'd like to understand why. > > The query is of this form: > > SELECT jobs.id AS id, > jobs.name AS name, > (SELECT COUNT(1) FROM jobitems > WHERE jobitems.jobid = jobs.id) AS count > FROM jobs > ORDER BY jobs.name > > This takes around 3000ms to complete if there are 200,000 rows in jobitems. > > If I remove the "count" subquery, then it runs in 1ms. If I remove the ORDER > BY, it runs in 1ms. > > There is only one row in jobs, thus I would expect the same performance > whether I order it or not. The PK on jobitems is (jobid,itemid); prior posts > on this forum have led me to believe that this creates an implicit UNIQUE > INDEX on (jobid,itemid) and also an implicit INDEX on (jobid). Whether this > is the case or not, creating an additional INDEX on (jobid) itself doesn't > speed things up. > > If I remove the count part and then execute it by itself, I get a total query > time of 3ms for the two queries. No nested query should take longer than it > takes to execute its parts separately. > > I discovered this issue on version 10.3.1.4 but it still performs the > same way under version 10.3.2.1. It sounds like the optimizer is making a bad decision for some reason, but I don't know why (perhaps the statistics are out of date? See http://thread.gmane.org/gmane.comp.apache.db.derby.user/8100/focus=8101). Since you seem to have a small reproducible test case for this problem, it would be great if you could file a JIRA issue and upload one sql script that creates the tables/indices and insert enough sample data to show the problem, and one sql script that runs the different queries. > The obvious workaround is not to order by name and sort the results on the > Java side. But is there a way to rewrite this query that will make Derby > happier? I don't know if it makes Derby happier, but I think this query does the same thing without the nested query: SELECT jobs.id, jobs.name, count(*) FROM jobs, jobitems WHERE jobs.id = jobitems.jobid GROUP BY jobs.id, jobs.name ORDER BY jobs.name -- Knut Anders