Date: Wed, 22 Nov 2017 18:04:28 +0000
From: "Vuk Ercegovac (Code Review)"
To: Lars Volker, Alex Behm, impala-cr@cloudera.com, reviews@impala.incubator.apache.org
Reply-To: vercegovac@cloudera.com, impala-cr@cloudera.com, lv@cloudera.com, marcelk@gmail.com, alex.behm@cloudera.com, reviews@impala.incubator.apache.org
Subject: [Impala-ASF-CR] IMPALA-4985: use parquet stats of nested types for dynamic pruning

Hello Lars Volker, Alex Behm, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/8480

to look at the new patch set (#10).

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning
......................................................................

IMPALA-4985: use parquet stats of nested types for dynamic pruning

Currently, parquet row-groups can be pruned at run-time using min/max
stats when predicates (in, binary) are specified for column scalar types.
This patch extends pruning to nested types for the same class of
predicates. A nested value is an instance of a nested type (struct,
array, map). A nested value consists of other nested and scalar values
(as declared by its type). Predicates that can be used for row-group
pruning must be applied to nested scalar values. In addition, the parent
of the nested scalar must also be required, that is, not empty. The
latter requirement is conservative: some filters that could be used for
pruning are not used for correctness reasons.

Testing:
- extended nested-types-parquet-stats e2e test cases.

Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
---
M be/src/exec/hdfs-parquet-scanner.h
M fe/src/main/java/org/apache/impala/analysis/CollectionStructType.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
M tests/query_test/test_nested_types.py
10 files changed, 649 insertions(+), 38 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/80/8480/10
--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 10
Gerrit-Owner: Vuk Ercegovac
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Vuk Ercegovac
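
For readers unfamiliar with min/max pruning, the rule described in the commit message above can be illustrated with a small sketch. This is not the code from the patch and not Impala's API: the names `ColumnStats`, `may_match`, `can_skip_row_group`, and the `parent_required` flag are all hypothetical, and the set of operators is an assumption about what "binary" predicates cover. It only shows the general idea that a row group may be skipped when its statistics prove that no value can satisfy the predicate, and that pruning is declined whenever the enclosing collection is not required to be non-empty.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ColumnStats:
    """Min/max statistics of one scalar leaf column in one row group (hypothetical type)."""
    min_value: Any
    max_value: Any


def may_match(stats: ColumnStats, op: str, operand: Any) -> bool:
    """Return False only if no value in [min, max] can satisfy the predicate."""
    lo, hi = stats.min_value, stats.max_value
    if op == "=":
        return lo <= operand <= hi
    if op == "<":
        return lo < operand
    if op == "<=":
        return lo <= operand
    if op == ">":
        return hi > operand
    if op == ">=":
        return hi >= operand
    if op == "in":
        # operand is a collection of candidate values
        return any(lo <= v <= hi for v in operand)
    # Unknown predicate kind: be conservative and never prune on it.
    return True


def can_skip_row_group(stats: ColumnStats, op: str, operand: Any,
                       parent_required: bool) -> bool:
    """Decide whether a row group can be skipped for a predicate on a nested scalar.

    parent_required is an illustrative flag standing in for the condition that
    the parent collection of the nested scalar must be non-empty. When it is
    False the sketch refuses to prune, mirroring the conservative choice
    described in the commit message.
    """
    if not parent_required:
        return False
    return not may_match(stats, op, operand)
```

For example, with stats `ColumnStats(10, 20)` a predicate `item > 25` on a required collection allows the row group to be skipped, while the same predicate on a collection that may be empty does not.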