From: "Thomas Tauber-Marshall (Code Review)"
To: Michael Ho, Lars Volker, Matthew Jacobs, Mostafa Mokhtar, impala-cr@cloudera.com, reviews@impala.incubator.apache.org
Reply-To: tmarshall@cloudera.com, impala-cr@cloudera.com, lv@cloudera.com, marcelk@gmail.com, kwho@cloudera.com, mmokhtar@cloudera.com, reviews@impala.incubator.apache.org, mjacobs@apache.org
Date: Wed, 27 Sep 2017 02:13:28 +0000
Subject: [Impala-ASF-CR] DRAFT - IMPALA-4252: Min-max runtime filters for Kudu
Message-Id: <201709270213.v8R2DSfS015961@ip-10-146-233-104.ec2.internal>
X-Gerrit-MessageType: newpatchset
X-Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
X-Gerrit-Change-Number: 7793
X-Gerrit-PatchSet: 4
X-Gerrit-Commit: c5ee26164855d171f1fa94cf89dea65aebf6b423
User-Agent: Gerrit/2.14.2

Hello Michael Ho, Lars Volker, Matthew Jacobs, Mostafa Mokhtar,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/7793

to look at the new patch set (#4).

Change subject: DRAFT - IMPALA-4252: Min-max runtime filters for Kudu
......................................................................

DRAFT - IMPALA-4252: Min-max runtime filters for Kudu

This patch implements min-max filters for runtime filters. Each
runtime filter generates a bloom filter and/or a min-max filter,
depending on if it has HDFS and/or Kudu targets, respectively.

Min-max filters are generated by the PartitionedHashJoinBuilder.
For now, min-max filters are only applied at the KuduScanner,
which passes them into the Kudu client.

Because the Kudu client doesn't provide a way to specify generic
filter exprs, min-max filters are only generated when the target
expr is a bare Kudu column ref.

Future work will address applying min-max filters at HDFS scan
nodes and applying bloom filters at Kudu scan nodes.

Codegen is used to eliminate branching on the type of the min-max
filter.

Testing:
- Updated planner tests.
- Ran existing runtime filter tests.
- Ran preliminary perf tests to demonstrate that it works. Will
  update with more specific results.
- Still needs more e2e tests.

Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/filter-context.cc
M be/src/exec/filter-context.h
M be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-filter-bank.h
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/service/impala-internal-service.cc
M be/src/util/CMakeLists.txt
A be/src/util/min-max-filter-ir.cc
A be/src/util/min-max-filter.cc
A be/src/util/min-max-filter.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
37 files changed, 1,464 insertions(+), 146 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/7793/4
--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Tauber-Marshall
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Matthew Jacobs
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar
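
For readers unfamiliar with the technique, the min-max filter idea in the
commit message can be sketched as below. This is only an illustration in
Python, not the patch's C++ code: the names MinMaxFilter, build_filter and
scan_with_filter are hypothetical, integer keys are assumed, and treating
an empty filter as "nothing matches" is an assumption rather than
something the message states. The build side of the join records the
smallest and largest key it sees, and the scan side keeps only rows whose
value in a bare column falls inside that range.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class MinMaxFilter:
    """Hypothetical filter tracking the smallest and largest build-side key."""
    min_val: Optional[int] = None
    max_val: Optional[int] = None

    def insert(self, value: Optional[int]) -> None:
        # NULL keys cannot match in an equi-join, so they are ignored.
        if value is None:
            return
        if self.min_val is None or value < self.min_val:
            self.min_val = value
        if self.max_val is None or value > self.max_val:
            self.max_val = value

    def may_contain(self, value: Optional[int]) -> bool:
        # Assumption: an empty filter (no build rows) rejects every value.
        if value is None or self.min_val is None or self.max_val is None:
            return False
        return self.min_val <= value <= self.max_val


def build_filter(build_keys: Iterable[Optional[int]]) -> MinMaxFilter:
    """Build side: fold every join key into the filter."""
    f = MinMaxFilter()
    for key in build_keys:
        f.insert(key)
    return f


def scan_with_filter(column_values: Iterable[Optional[int]],
                     f: MinMaxFilter) -> List[int]:
    """Scan side: keep only values inside [min_val, max_val]."""
    return [v for v in column_values if f.may_contain(v)]


if __name__ == "__main__":
    f = build_filter([5, None, 12, 7])
    print(f.min_val, f.max_val)
    print(scan_with_filter([1, 5, 9, 12, 30, None], f))
```

In this sketch the value 9 passes the filter even though it never appeared
on the build side. A min-max filter is conservative: it can only discard
rows that cannot match, and the join itself still performs the exact
comparison. The range form is also what makes the filter expressible as a
pair of lower-bound and upper-bound comparisons on a single column, which
fits the message's restriction to bare Kudu column refs.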