From: "Thomas Tauber-Marshall (Code Review)"
To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org
CC: Matthew Jacobs, Michael Ho, Mostafa Mokhtar, Lars Volker, Tim Armstrong, Todd Lipcon
Date: Thu, 26 Oct 2017 20:54:59 +0000
Subject: [Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu

Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 )

Change subject: IMPALA-4252: Min-max runtime filters for Kudu
......................................................................

Patch Set 8:

> > Patch Set 7:
> >
> > > Patch Set 7:
> > >
> > > > Perf results:
> > > > ...
> > >
> > > I'm surprised that only a few queries saw significant speedups. Is
> > > this in line with what you saw with Parquet runtime filters on
> > > TPC-H? Or are we losing a lot by using min/max instead of bloom or
> > > in-list style filters?
> >
> > Not sure about bloom filters perf, though I can run those numbers
> > for comparison.
>
> I haven't looked at this patch, but had a question about the design:
>
> Are we still pushing blooms across a join to prevent shuffling of
> data? Or are we now pushing _only_ min/max?
>
> It seems there is value in pushing both: the bloom for evaluation
> on the other side of the join to prevent shuffling, and the min/max
> to push all the way to the scanner to reduce I/O.
>
> Not sure if the patch is already doing this.

Impala only evaluates runtime filters in the scan. Even prior to this patch, the Kudu scanner was not evaluating bloom filters (and hash joins with Kudu scan targets don't build bloom filters).

It certainly could be useful to evaluate bloom filters on the Impala side of a Kudu scan, but I believe our thinking was that it wasn't worth it to implement that - better to just wait until bloom filters can be pushed all the way down into Kudu. If bloom filters in Kudu are a long way off, though, we should maybe reevaluate that.

--
To view, visit http://gerrit.cloudera.org:8080/7793
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c
Gerrit-Change-Number: 7793
Gerrit-PatchSet: 8
Gerrit-Owner: Thomas Tauber-Marshall
Gerrit-Reviewer: Anonymous Coward #345
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Matthew Jacobs
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Mostafa Mokhtar
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: Todd Lipcon
Gerrit-Comment-Date: Thu, 26 Oct 2017 20:54:59 +0000
Gerrit-HasComments: No