Message-Id: <201611071916.uA7JG8jh011482@ip-10-146-233-104.ec2.internal>
Date: Mon, 7 Nov 2016 19:16:07 +0000
From: "Alex Behm (Code Review)"
To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org
Reply-To: alex.behm@cloudera.com
Subject: [Impala-ASF-CR] IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.
X-Gerrit-MessageType: newpatchset
X-Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb
X-Gerrit-Commit: 6f14f15461b1a1fe078292bf4598cc6f0ffd0559
User-Agent: Gerrit/2.12.2

Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.
......................................................................

IMPALA-3167: Fix assignment of WHERE-clause predicate through grouping agg + outer join.

Background:
We generally allow the assignment of predicates below the nullable side of a left/right outer join, explained as follows using an example:

SELECT * FROM t1
LEFT OUTER JOIN t2
  ON t1.id = t2.id
WHERE t2.int_col < 10

The scan of 't2' picks up 't2.int_col < 10' via Analyzer.getBoundPredicates() and recognizes that the predicate must also be evaluated by a join later, so the predicate is not marked as assigned. The join then picks up the unassigned predicate via Analyzer.getUnassignedConjuncts().
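
As a rough illustration of why the predicate in the example query has to be evaluated at the join as well as at the scan, here is a minimal, self-contained Java sketch. It is a toy model, not Impala code: the class OuterJoinPredicateSketch, the row types, and the reapplyAtJoin flag are all hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the example query; none of these names exist in Impala.
final class OuterJoinPredicateSketch {
    /** Hypothetical row of table t2. */
    static final class T2Row {
        final int id;
        final int intCol;
        T2Row(int id, int intCol) { this.id = id; this.intCol = intCol; }
    }

    /** Join output row; t2 is null when the left outer join found no match. */
    static final class JoinedRow {
        final int t1Id;
        final T2Row t2;
        JoinedRow(int t1Id, T2Row t2) { this.t1Id = t1Id; this.t2 = t2; }
    }

    /** The predicate t2.int_col < 10; it is never true for a NULL (missing) t2 row. */
    static boolean intColLessThan10(T2Row row) {
        return row != null && row.intCol < 10;
    }

    /**
     * Evaluates: SELECT * FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id
     *            WHERE t2.int_col < 10
     * The predicate is always applied at the scan of t2. The reapplyAtJoin flag
     * models whether the join evaluates it again on its own output rows.
     */
    static List<JoinedRow> run(List<Integer> t1Ids, List<T2Row> t2, boolean reapplyAtJoin) {
        // Scan of t2 with the bound predicate applied early.
        List<T2Row> scanned = new ArrayList<>();
        for (T2Row r : t2) {
            if (intColLessThan10(r)) scanned.add(r);
        }
        // Left outer join: unmatched t1 rows are kept and null-extended.
        List<JoinedRow> joined = new ArrayList<>();
        for (int id : t1Ids) {
            boolean matched = false;
            for (T2Row r : scanned) {
                if (r.id == id) {
                    joined.add(new JoinedRow(id, r));
                    matched = true;
                }
            }
            if (!matched) joined.add(new JoinedRow(id, null));
        }
        if (!reapplyAtJoin) return joined;
        // Evaluating the predicate again removes the null-extended rows.
        List<JoinedRow> result = new ArrayList<>();
        for (JoinedRow j : joined) {
            if (intColLessThan10(j.t2)) result.add(j);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> t1 = Arrays.asList(1, 2);
        List<T2Row> t2 = Arrays.asList(new T2Row(1, 5), new T2Row(2, 50));
        System.out.println(run(t1, t2, false).size()); // prints 2 (one row too many)
        System.out.println(run(t1, t2, true).size());  // prints 1 (the expected result)
    }
}
```

In this sketch the scan filters out the t2 row with int_col = 50, so the t1 row with id = 2 no longer finds a match and comes out of the join null-extended. If the predicate were treated as fully assigned at the scan, that row would wrongly survive; evaluating it again at the join drops it.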

The bug was that our logic for detecting whether a bound predicate must also be evaluated at a join node was flawed because it only considered whether the tuples of the source or destination predicate were outer joined (plus other conditions). The underlying assumption is that the source or destination tuple is bound by a tuple produced by a TableRef, but in the buggy query the source predicate is bound by an aggregation tuple, so we incorrectly marked the bound predicate as assigned in Analyzer.getBoundPredicates().

The fix is to conservatively not mark bound predicates as assigned if there exist equivalent tuples that are outer joined. This conservative fix leads to some duplicate assignments of predicates. Those are simply deduped now.

Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
6 files changed, 48 insertions(+), 27 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/4960/2
--
To view, visit http://gerrit.cloudera.org:8080/4960
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I774d13a13ad1e8fe82512df98dc29983bdd232eb
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Anonymous Coward #27