db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-4416) Handle comparison of two constants as a boolean constant
Date Fri, 13 Nov 2009 16:49:39 GMT

     [ https://issues.apache.org/jira/browse/DERBY-4416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Knut Anders Hatlen updated DERBY-4416:

    Attachment: d4416-1a.stat

Here's an updated patch with the visitor renamed to ConstantExpressionVisitor, the method
in ValueNode renamed to evaluateConstantExpression(), and with updated comments.

Description of the other changes from the previous patch:

1) The invocation of the visitor is moved to DMLStatementNode.optimizeStatement(). The previous
patch called it from SelectNode and JoinNode on WHERE/HAVING/ON clauses. As mentioned by Bryan,
this transformation can be useful in other parts of the tree as well (as seen in VALUES 1+3,
although evaluation of arithmetic operations hasn't implemented). I think it's cleaner to
have the invocation at one single location. Also, since there might be sub-queries inside
WHERE clauses, the old patch might traverse some parts of the tree many times.

The visitor now walks the entire query tree between the preprocessing phase and the optimization
phase. Doing it as late as possible is advantageous because then other simplifications (like
the NOT elimination performed in preprocess()) increase the chances of finding expressions
with a know value, but it should be performed before optimization so that the optimizer can
take advantage of the more accurate selectivity predictions.

2) JoinNode: Made acceptChildren() call accept() on joinPredicates. The previous patch visited
the JoinNode a little earlier, when the ON clause was still represented by joinClause, whereas
this patch visits the JoinNode after the predicates have been moved to joinPredicates. There
are also other children of JoinNode that are not visited, but I didn't do anything with them
for now.

3) BinaryRelationalOperatorNode: Added rewriting for less-than and less-equals, which were
forgotten in the previous patch.

4) logop.out: Accept that a statement that used to fail with arithmetic overflow now succeeds
because the arithmetic isn't performed at runtime. The statement that fails is
> select x from s where 2147483647 + 10 = 2 and (1=2);
whereas this almost identical one is expected to succeed
> select x from s where (1=2) and 2147483647 + 10 = 2;
I'm not aware of anything in the SQL standard that requires us to evaluate the arithmetic
expression first, so I think it's equally OK to pass the first statement as it is to pass
the second statement. The comments in the test also indicate this.

4) specjPlans.out: Removed unneeded ProjectRestrictResultSet from two of the query plans (they
were there to enforce the restriction 1=1).

5) outerjoin.out: Updated query plan from nested loop join to hash join in a join with 1=1
in the ON clause. New plan looks OK and the results are still correct. The plan probably changed
because 1=1 and TRUE have different (estimated) selectivity.

I haven't run all the regression tests on the latest revision of the patch. Will report back
when I have the results.

> Handle comparison of two constants as a boolean constant
> --------------------------------------------------------
>                 Key: DERBY-4416
>                 URL: https://issues.apache.org/jira/browse/DERBY-4416
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d4416-1a.diff, d4416-1a.stat, replaceExpressions.diff
> In the lack of the boolean data type, Derby forces you to use expressions like 1=1 and
1<>1 to express true and false. Generated SQL statements also tend to use such expressions,
and so does Derby in its own meta-data queries.
> Derby has many useful optimizations for boolean true/false. For instance, ProjectRestrictNode
and PredicateList are able to eliminate predicates, and in some cases the entire ProjectRestrictNode,
if the predicate contains constant true or false values. However, during parsing and compilation,
expressions like 1=1 are not rewritten to TRUE, and we don't get any benefit from the boolean
optimization code. This leads to more complex, and possibly less efficient, byte code being
generated for the statements.
> Also, boolean constants are assigned a selectivity of 0.0 (false) or 1.0 (true), since
they will always match no rows when false and all rows when true. The expression 1=1 does
however get it's selectivity from the = operator, which means that it'll be 0.1. The same
selectivity is assigned to 1=0. Other operators have different selectivity, so 2<3 has
the selectivity 0.33, even though the actual selectivity of the expression is the same as
1=1 and TRUE, namely 1.0.
> This leads to oddities like the optimizer choosing a different plan when you change 2<3
to 1=1 in a WHERE clause. See http://mail-archives.apache.org/mod_mbox/db-derby-user/200909.mbox/%3c25531166.post@talk.nabble.com%3e
for an example of that.
> If we could go through the query tree and replace occurrences of comparisons between
constant values with a boolean constant at bind time, such queries would end up with simpler
byte code, and the selectivity passed to the optimizer would be more accurate, possibly resulting
in a better plan being chosen.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message