db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A B (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-2986) Query involving CASE statement significantly slower in 10.3.1.4 than in 10.2.2.0
Date Tue, 07 Aug 2007 17:04:04 GMT

     [ https://issues.apache.org/jira/browse/DERBY-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

A B updated DERBY-2986:
-----------------------

    Attachment: d2986_notTested_v1.patch

Thank you for filing this issue, James, and for investigating the cause.  I did some quick
tracing through the code added with DERBY-1620 and it it looks like the problem is that the
code re-binds nested ConditionalNode expressions in an exponential way.

As a simple example, I traced the following query:

  values CASE WHEN 10 = 1 THEN 'a'
              WHEN 10 = 2 THEN 'b'
         END;

Let "CN(1)" denote one ConditionalNode and "CN(2)" denote another ConditionalNode, which is
the "else" part of CN(1).  Then we see the following calls to bind the "then" and "else" expressions:

  -- CN(1) : findType() ==> thenElseList.elementAt(0).bindExpression()
  -- CN(1) : findType() ==> thenElseList.elementAt(1).bindExpression()
    ++ CN(2) : findType() ==> thenElseList.elementAt(0).bindExpression()
    ++ CN(2) : findType() ==> thenElseList.elementAt(1).bindExpression()
    ++ CN(2) : bindExpression() ==> thenElseList.bindExpression() -- 2 DUPLICATE BINDS
  -- CN(1) : bindExpression() ==> thenElseList.bindExpression() -- 2 DUPLICATE BINDS
    ++ CN(2) : findType() ==> thenElseList.elementAt(0).bindExpression() -- DUPLICATE BIND
    ++ CN(2) : findType() ==> thenElseList.elementAt(1).bindExpression() -- DUPLICATE BIND
    ++ CN(2) : bindExpression() ==> thenElseList.bindExpression() -- 2 DUPLICATE BINDS

>From this it's clear that we are unnecessarily rebinding ConditionalNodes that appear
in the "THEN" or "ELSE" clause of outer conditional nodes.  In this case each expression of
CN(2) is bound 2 * 2 = 4 times when we should only be binding it once.  If CN(2) in turn had
another sub-conditional CN(3), then CN(3)'s expressions would be bound 2 * 2 * 2 = 8 times.
 Hence the exponential rebinding.  

I made a quick change to the code to remove the call to thenElseList.bindExpression(...) from
ConditionalNode.bindExpression(...) in cases where we call "findType()", and that seems to
have brought the times back to something more reasonable.  I'm attaching that change as d2986_notTested_v1.patch
as I have not run the regression tests with this change.  I did run lang/CaseExpressionTest,
which was added for DERBY-1620, and that still passes with my quick change.  So if the patch
isn't entirely complete (I won't know that until the full regression suites are run), it should
hopefully be a good starting point... 

> Query involving CASE statement significantly slower in 10.3.1.4 than in 10.2.2.0
> --------------------------------------------------------------------------------
>
>                 Key: DERBY-2986
>                 URL: https://issues.apache.org/jira/browse/DERBY-2986
>             Project: Derby
>          Issue Type: Bug
>          Components: Performance
>    Affects Versions: 10.3.1.4
>         Environment: Windows XP
>            Reporter: James F. Adams
>         Attachments: d2986_notTested_v1.patch
>
>
> A select of a CASE statement that performed acceptably in 10.2.2.0 is very slow in 10.3.1.4
the first time it is executed.
> The following example ij script:
> ELAPSEDTIME ON;
> CREATE table test1(id integer);
> CREATE table test2(id varchar(10));
> SELECT CASE  WHEN t.id = 1 THEN 'a'
>              WHEN t.id = 2 THEN 'b'
>              WHEN t.id = 3 THEN 'c'
>              WHEN t.id = 4 THEN 'd'
>              WHEN t.id = 5 THEN 'e'
>              WHEN t.id = 6 THEN 'f'
>              WHEN t.id = 7 THEN 'g'
>              WHEN t.id = 8 THEN 'h'
>              WHEN t.id = 11 THEN 'i'
>              WHEN t.id = 12 THEN 'j' 
>              WHEN t.id = 15 THEN 'k'
>              WHEN t.id = 16 THEN 'l'
>              WHEN t.id = 23 THEN 'm' 
>              WHEN t.id = 24 THEN 'n'
>              WHEN t.id = 27 THEN 'o'
>              WHEN t.id = 31 THEN 'p'
>              WHEN t.id = 41 THEN 'q'
>              WHEN t.id = 42 THEN 'r'
>              WHEN t.id = 50 THEN 's'
>              ELSE (SELECT t2.id
>                      FROM test2 t2
>                   )
>        END
> FROM test1 t;
> When run on 10.2.2.0 the select results in ELAPSED TIME = 187 milliseconds.
> When run on 10.3.1.4 the select results in ELAPSED TIME = 62281 milliseconds.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message