impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vuk Ercegovac (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-1422: support a constant on LHS of IN predicates.
Date Tue, 07 Nov 2017 08:42:34 GMT
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8322 )

Change subject: IMPALA-1422: support a constant on LHS of IN predicates.
......................................................................


Patch Set 6:

(12 comments)

main change is to simplify the current version to not handle correlated subqueries + "not
in". adding this support so that the rewrite here is simpler requires additional support for
inline view references to outer blocks that are more than one nesting level away.

http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@94
PS2, Line 94:     // TODO: derived slot refs (e.g., star-expanded) will not have rawPath set.
> Why can't this be addressed in this patch?
Not needed any longer... I think its useful to keep the comment around in case another rewrite
hits the issue.


http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@287
PS2, Line 287:    * rewritten, null is returned. If 'inPred' is rewritten, the resulting expression
> ... and the RHS is a subquery.
Done


http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@299
PS2, Line 299:    * 2) Predicate is NOT IN and RHS returns a single row.
> What if the RHS subquery is correlated?
correlation in the RHS is currently checked for and banned for limit queries. for an IN query
with an aggregate, we're currently banning this, but for NOT IN and aggregation (interestingly)
we're grouping by the join condition and pushing the NOT IN check to the having clause. I
don't see a correctness issue with this. We may want to look into the IN handling to see if
we're overly restrictive there.

the comment about correlation on L303 is invalid with regard to the change in this implementation,
so I've removed it.


http://gerrit.cloudera.org:8080/#/c/8322/2/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@319
PS2, Line 319:    *     NOT EXISTS (RHS AND (C IS NULL OR (CASE WHEN e IS NULL THEN C ELSE
e END) = C)
> Is the transformation even correct if RHS is correlated?
I think its correct, but I've removed this case since it would be more cleanly done using
the inline view approach of 3b. I'm not using that approach here since we don't handle references
from inline view to outer blocks that are more than one level away.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
File fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java:

http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@312
PS6, Line 312:    *    a) No group by or analytic function in subquery.
> Ideally, we should not have to distinguish between cases 3a and 3b, and we 
Simplified all this by removing the case for correlated not in. We can bring it back pending
additional support for multi-block references.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@316
PS6, Line 316:    *    REWRITE:
> Use IFNULL() instead of CASE to simplify.
done, thanks for the suggestion.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@319
PS6, Line 319:    *     NOT EXISTS (RHS AND (C IS NULL OR (CASE WHEN e IS NULL THEN C ELSE
e END) = C)
> I think there's another case where this rewrite is not correct: when the su
good catch-- I forgot to handle order/limit in subqueries. that said, its easy to change the
condition to handle limit in 3b (correlation is not supported for those cases anyways). however,
we're duplicating a bit of push-down logic which is cleaner *not* to do here, so these issues
do not come up anymore.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@321
PS6, Line 321:    *    Example:
> The following query gives me an IllegalStateException:
this change-- thanks for finding it. its handled now.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@328
PS6, Line 328:    *    Note: it useful to think of C NOT IN (RHS) as the boolean expansion:
> it is useful
Done


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@340
PS6, Line 340:    *    b) Group by  or analytic function in subquery.
> extra space after "by"
Done


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@345
PS6, Line 345:    *      NOT EXISTS (SELECT x FROM (RHS) t WHERE C = t or t IS NOT NULL)
> What does C=t mean here?
goofed on the use of "t". re'orgd.


http://gerrit.cloudera.org:8080/#/c/8322/6/fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java@354
PS6, Line 354:    *     - Assumes that subquery is uncorrelated, which is currently a requirement.
> What does this note refer to? Above you state that "All cases apply to both
reorgd all this.



-- 
To view, visit http://gerrit.cloudera.org:8080/8322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d69889a3c72e90be9d4ccf47d2816819ae32acb
Gerrit-Change-Number: 8322
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercegovac@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Nov 2017 08:42:34 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message