Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 48AD4200C5B for ; Wed, 5 Apr 2017 07:19:26 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 471F3160BA1; Wed, 5 Apr 2017 05:19:26 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 67F15160B90 for ; Wed, 5 Apr 2017 07:19:25 +0200 (CEST) Received: (qmail 14116 invoked by uid 500); 5 Apr 2017 05:19:24 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 14093 invoked by uid 99); 5 Apr 2017 05:19:24 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 05 Apr 2017 05:19:24 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id D38091813C0 for ; Wed, 5 Apr 2017 05:19:23 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.363 X-Spam-Level: X-Spam-Status: No, score=0.363 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id PSdFSqEQOytC for ; Wed, 5 Apr 2017 05:19:22 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id C10C75FCB9 for ; Wed, 5 Apr 2017 05:19:21 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v355JKZd001248; Wed, 5 Apr 2017 05:19:20 GMT Message-Id: <201704050519.v355JKZd001248@ip-10-146-233-104.ec2.internal> Date: Wed, 5 Apr 2017 05:19:20 +0000 From: "Alex Behm (Code Review)" To: Zach Amsden , impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Dan Hecht , Marcel Kornacker Reply-To: alex.behm@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-5003=3A_Constant_propagation_in_scan_nodes=0A?= X-Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 X-Gerrit-ChangeURL: X-Gerrit-Commit: 6eb1671ec7fd329d63fd693b2deccb8c5d3822c0 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Wed, 05 Apr 2017 05:19:26 -0000 Alex Behm has posted comments on this change. Change subject: IMPALA-5003: Constant propagation in scan nodes ...................................................................... Patch Set 15: (14 comments) Code is looking good, will take another pass over the tests http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: Line 923: * Propagates constant expressions of the form = to Explain candidate param Line 924: * other uses of slot ref in all expressions in conjuncts; returns a BitSet with in all expressions in conjuncts -> in the given conjnucts Line 942: // Don't rewrite with our own substitution! We may have already processed Explanation seems more complicated than necessary. Clearly it would be wrong to propagate a constant to the originating binary predicate. Your function comment is quite accurate "to other uses of slot ref" :) Line 960: * Returns true if the conjuncts may be true and false if a contradiction has Shorter: Returns false if a contradiction has been implied, true otherwise. Line 968: BitSet candidates = new BitSet(conjuncts.size()); > Alex, thanks for the BitSet suggest - this worked out far cleaner than any Yea, this looks pretty good to me! Line 980: // applied by the rewriter. We must also re-analyze result exprs to I think we can remove everything after "We must also re-analyze ..." to keep the comment short. Also, I think constant propagation should fall under the "enable_expr_rewrites" query option, i.e., if false we should only remove duplicates in here. Otherwise, it's hard to say what that query option means at a high level. Line 985: if (index < 0) continue; > This can be Preconditions.checkState(index >= 0) now. In the prior form of Agree. Makes sense. Line 988: // Reset Expr to force updating cost Re-analyze expr to add implicit casts, resolve function signatures, and compute cost Line 998: // because they can't be evaluated if expr rewriting is turned off. > I can't see a good way to test this with the fe tests. Should we write a c You should be able to do this with a PlannerTest. For example, look at PlannerTest.testParquetFiltering() which sets a query option and then runs a .test file. But the easier solution, imo, is to not propagate constants when expr rewriting is disabled. http://gerrit.cloudera.org:8080/#/c/6389/11/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1266: > Optimization should still be done in the form of eliminating duplicates, ev Looks like the constant propagation is applied even when expr rewrites are disabled. Imo, it would be preferable not to propagate constants in that case. Otherwise, it's kind of hard to reason about the meaning of that query option. http://gerrit.cloudera.org:8080/#/c/6389/15/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java: Line 1273: // Allow constant propagation within conjuncts. This actually has two purposes: 1. constant propagation 2. expr optimization on fully-resolved conjuncts http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test: Line 101: # XXX - this seems correct, but is this a legal transformation? > I'm fairly convinced at this point that considering slotrefs through implic I agree that implicit casts on both sides seem ok. It does not seem fine if there is an explicit cast on the slotref. In your example, the 'cast(10000 as tinyint)' is constant and will be folded (truncated), so this case seems fine. http://gerrit.cloudera.org:8080/#/c/6389/15/testdata/workloads/functional-planner/queries/PlannerTest/joins.test File testdata/workloads/functional-planner/queries/PlannerTest/joins.test: Line 381: # hbase-hdfs join with scan filtering (bogus) Can you be more specific than "(bogus)"? What behavior are we testing here? Line 394: 02:HASH JOIN [INNER JOIN] > This is a very sad plan. Agree that this can be improved. -- To view, visit http://gerrit.cloudera.org:8080/6389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I79750a8edb945effee2a519fa3b8192b77042cb4 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes