Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 94745 invoked from network); 21 Apr 2010 16:12:12 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 21 Apr 2010 16:12:12 -0000 Received: (qmail 12386 invoked by uid 500); 21 Apr 2010 16:12:12 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 12364 invoked by uid 500); 21 Apr 2010 16:12:12 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 12357 invoked by uid 99); 21 Apr 2010 16:12:12 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 21 Apr 2010 16:12:12 +0000 X-ASF-Spam-Status: No, hits=-1324.9 required=10.0 tests=ALL_TRUSTED,AWL X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 21 Apr 2010 16:12:10 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o3LGBoce002934 for ; Wed, 21 Apr 2010 16:11:50 GMT Message-ID: <9674011.117581271866310485.JavaMail.jira@thor> Date: Wed, 21 Apr 2010 12:11:50 -0400 (EDT) From: "Rick Hillegas (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Commented: (DERBY-2130) Optimizer performance slowdown from 10.1 to 10.2 In-Reply-To: <896536.1164840923060.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/DERBY-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859413#action_12859413 ] Rick Hillegas commented on DERBY-2130: -------------------------------------- Hi Bryan, You contributed a patch for this issue more than 3 years ago but the patch probably needs to be updated before it can be applied now. Are you interested in pursuing this? Thanks, -Rick > Optimizer performance slowdown from 10.1 to 10.2 > ------------------------------------------------ > > Key: DERBY-2130 > URL: https://issues.apache.org/jira/browse/DERBY-2130 > Project: Derby > Issue Type: Bug > Components: SQL > Affects Versions: 10.1.3.1, 10.2.1.6, 10.3.1.4 > Reporter: Bryan Pendleton > Attachments: jumpReset.patch, plan10_1_2_1.txt, plan10_2.txt, plans.diff, repro.sql > > > Attached is 'repro.sql', an IJ script which demonstrates what I > believe to be a serious performance issue in the Optimizer. > I have run this script in a number of configurations: > - 10.1.2.1: the script runs successfully. The 'prepare' statement > takes about 90 seconds, on a fairly powerful Windows machine > - 10.1.3.1: the script produces a NPE. I believe this is DERBY-1777 > - 10.2.1.8/trunk: the script runs successfully. The 'prepare' statement > often takes about 220 seconds, on the same Windows machine > Intermittently, on 10.2 and on the trunk, the prepare statement takes > 15+ minutes. I cannot reliably reproduce this; I run the same script > several times in a row and I cannot predict whether it will take 220 > seconds or whether it will take 15+ minutes. > I am quite motivated to work on this problem, as this is blocking me from > using Derby for a project that I'm quite keen on, but I need some > suggestions and ideas about how to attack it. From my perspective > there are 3 primary topics: > 1) Why did optimizer performance for this query degrade so significantly > from 10.1.2.1 to 10.2? The optimizer seems to be at least 2.5 times slower, > for this particular query at least, in 10.2. Sometimes it is 10x slower. > 2) What is the source of the non-determinism? Why does the optimizer > often take 4 minutes to optimize this query on the trunk, but sometimes > take 15+ minutes? I don't believe that I'm changing anything from > run to run. > 3) Can we improve the optimizer performance even beyond what it was > for 10.1.2? I realize that this is an ugly query, but I was hoping to > see an optimization time of 5-10 seconds, not 90 seconds (and certainly > not 220 seconds). > I have attempted to start answering some of these questions, with > limited success. Here is some of what I think I've discovered so far: > - the optimizer changes in 10.2 seem to have given the optimizer many > more choices of possible query plans to consider. I think this means > that, if the optimizer does not time out, it will spend substantially > more time optimizing because there are more choices to evaluate. Does > this by itself mean that the optimizer will take 2.5 times longer in > 10.2 than it did in 10.1? > - something about this query seems to make the costing mechanism go > haywire, and produce extreme costs. While stepping through the > optimization of this query in the debugger I have seen it compute > costs like 1e63 and 1e200. This might be very closely related to > DERBY-1905, although I don't think I'm doing any subqueries here. > But maybe I'm misunderstanding the term "subquery" in DERBY-1905. > At any rate, due to the enormous estimated costs, timeout does not > occur. > - the WHERE clause in this query is converted during compilation to > an equivalent IN clause, I believe, which then causes me to run into > a number of the problems described in DERBY-47 and DERBY-713. > Specifically, rather than constructing a plan which involves 4 > index probes for the 4 WHERE clause values, the optimizer decides > that an index scan must be performed and that it will have to process > the entire index (because the query uses parameter markers, not > literal values). So perhaps solving DERBY-47 would help me > - the optimizer in fact comes up with a "decent" query plan quite quickly. > I have experimented with placing a hard limit into the optimizer > timeout code, so that I can force optimization to stop after an > arbitrary fixed period of time. Then I have been able to set that > value to as low as 1 second, and the optimizer has produced plans > that then execute in a few milliseconds. Of course, I have only tried > this with a trivial amount of data in my database, so it's possible > that the plan produced by the optimizer after just a second of > optimizing is in fact poor, and I'm just not noticing it because my > data sizes are so small. > At this point, what would be really helpful to me would be some suggestions > about some general approaches or techniques to try to start breaking down > and analyzing this problem. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.