impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5003: Constant propagation in scan nodes
Date Wed, 05 Apr 2017 05:19:20 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-5003: Constant propagation in scan nodes
......................................................................


Patch Set 15:

(14 comments)

Code is looking good, will take another pass over the tests

http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

Line 923:    * Propagates constant expressions of the form <slot ref> = <constant>
to
Explain candidate param


Line 924:    * other uses of slot ref in all expressions in conjuncts; returns a BitSet with
in all expressions in conjuncts -> in the given conjnucts


Line 942:         // Don't rewrite with our own substitution!  We may have already processed
Explanation seems more complicated than necessary. Clearly it would be wrong to propagate
a constant to the originating binary predicate.

Your function comment is quite accurate "to other uses of slot ref" :)


Line 960:    * Returns true if the conjuncts may be true and false if a contradiction has
Shorter: Returns false if a contradiction has been implied, true otherwise.


Line 968:       BitSet candidates = new BitSet(conjuncts.size());
> Alex, thanks for the BitSet suggest - this worked out far cleaner than any 
Yea, this looks pretty good to me!


Line 980:           // applied by the rewriter.  We must also re-analyze result exprs to
I think we can remove everything after "We must also re-analyze  ..." to keep the comment
short.

Also, I think constant propagation should fall under the "enable_expr_rewrites" query option,
i.e., if false we should only remove duplicates in here. Otherwise, it's hard to say what
that query option means at a high level.


Line 985:           if (index < 0) continue;
> This can be Preconditions.checkState(index >= 0) now.  In the prior form of
Agree. Makes sense.


Line 988:           // Reset Expr to force updating cost
Re-analyze expr to add implicit casts, resolve function signatures, and compute cost


Line 998:           // because they can't be evaluated if expr rewriting is turned off.
> I can't see a good way to test this with the fe tests.  Should we write a c
You should be able to do this with a PlannerTest. For example, look at PlannerTest.testParquetFiltering()
which sets a query option and then runs a .test file.

But the easier solution, imo, is to not propagate constants when expr rewriting is disabled.


http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1266: 
> Optimization should still be done in the form of eliminating duplicates, ev
Looks like the constant propagation is applied even when expr rewrites are disabled. Imo,
it would be preferable not to propagate constants in that case. Otherwise, it's kind of hard
to reason about the meaning of that query option.


http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 1273:     // Allow constant propagation within conjuncts.
This actually has two purposes:
1. constant propagation
2. expr optimization on fully-resolved conjuncts


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

Line 101: # XXX - this seems correct, but is this a legal transformation?
> I'm fairly convinced at this point that considering slotrefs through implic
I agree that implicit casts on both sides seem ok.

It does not seem fine if there is an explicit cast on the slotref.
In your example, the 'cast(10000 as tinyint)' is constant and will be folded (truncated),
so this case seems fine.


http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/joins.test
File testdata/workloads/functional-planner/queries/PlannerTest/joins.test:

Line 381: # hbase-hdfs join with scan filtering (bogus)
Can you be more specific than "(bogus)"? What behavior are we testing here?


Line 394: 02:HASH JOIN [INNER JOIN]
> This is a very sad plan.
Agree that this can be improved.


-- 
To view, visit http://gerrit.cloudera.org:8080/6389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message