Message-Id: <201603232215.u2NMFDNA029893@ip-10-146-233-104.ec2.internal>
Date: Wed, 23 Mar 2016 22:15:10 +0000
From: "Henry Robinson (Code Review)"
To: Marcel Kornacker, impala-cr@cloudera.com, dev@impala.incubator.apache.org
Reply-To: henry@cloudera.com
Subject: [Impala-CR](cdh5-trunk) IMPALA-3141: Send dummy filters when filter production is disabled
X-Gerrit-MessageType: newpatchset
X-Gerrit-Change-Id: I04b3e6542651c1e7b77a9bab01d0e3d9506af42f
X-Gerrit-Commit: a22d188d1085b57ae427044eca5dd62dcbae6855
User-Agent: Gerrit/2.10-rc0

Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/2475

to look at the new patch set (#11).

Change subject: IMPALA-3141: Send dummy filters when filter production is disabled
......................................................................

IMPALA-3141: Send dummy filters when filter production is disabled

The PHJ may disable runtime filter production for one of several
reasons, including a predicted high false-positive rate. If the filters
are not produced, any scans will wait for their entire timeout before
continuing.

This patch changes the filter logic to always send a filter, even if one
wasn't actually produced by the PHJ. To preserve correctness, that
filter must contain every element of the set. Such a filter is
represented by (BloomFilter*)NULL. This allows us to make no changes to
RuntimeFilter::Eval(), which already returns true if the member Bloom
filter is NULL.

In RPCs, a new field is added to TBloomFilter to identify filters that
are always true.

The HdfsParquetScanner checks to see if filters would always return true
for any element, and disables them if so.
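
The always-true convention described above can be sketched in a few lines of C++. This is an illustrative sketch only, not the code in the patch: ToyBloomFilter, ToyRuntimeFilter, BuildFilter and SelectUsefulFilters are hypothetical names, the Bloom filter is reduced to a single hash over 1024 bits, and the RPC field is left out. It assumes only what the commit message states: a null Bloom filter means "contains every element", Eval() returns true for a null filter, and the scanner drops filters that can never reject a row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical single-hash Bloom filter; not Impala's BloomFilter class.
class ToyBloomFilter {
 public:
  explicit ToyBloomFilter(std::size_t num_bits) : bits_(num_bits, false) {
    assert(num_bits > 0);
  }
  void Insert(std::uint32_t hash) { bits_[hash % bits_.size()] = true; }
  bool Find(std::uint32_t hash) const { return bits_[hash % bits_.size()]; }

 private:
  std::vector<bool> bits_;
};

// Hypothetical consumer-side wrapper. Once a filter has arrived, a null
// Bloom filter stands for the set that contains every element.
class ToyRuntimeFilter {
 public:
  // Called when the producer's filter arrives; nullptr is the dummy filter.
  void SetBloomFilter(std::unique_ptr<ToyBloomFilter> filter) {
    bloom_ = std::move(filter);
    arrived_ = true;
  }
  bool HasArrived() const { return arrived_; }
  bool AlwaysTrue() const { return arrived_ && bloom_ == nullptr; }
  // A null Bloom filter never rejects a row, so dummy filters stay correct.
  bool Eval(std::uint32_t hash) const {
    return bloom_ == nullptr || bloom_->Find(hash);
  }

 private:
  std::unique_ptr<ToyBloomFilter> bloom_;
  bool arrived_ = false;
};

// Producer side: always returns something to publish. When production is
// disabled the result is nullptr, i.e. the always-true dummy filter.
std::unique_ptr<ToyBloomFilter> BuildFilter(
    bool production_enabled, const std::vector<std::uint32_t>& build_hashes) {
  if (!production_enabled) return nullptr;
  auto filter = std::make_unique<ToyBloomFilter>(1024);
  for (std::uint32_t h : build_hashes) filter->Insert(h);
  return filter;
}

// Scanner side: skip filters that have arrived and are always true. Filters
// that have not arrived yet are kept, since they may still become useful.
std::vector<const ToyRuntimeFilter*> SelectUsefulFilters(
    const std::vector<const ToyRuntimeFilter*>& filters) {
  std::vector<const ToyRuntimeFilter*> useful;
  for (const ToyRuntimeFilter* f : filters) {
    if (!f->AlwaysTrue()) useful.push_back(f);
  }
  return useful;
}
```

In this sketch a consumer that receives the dummy filter sees HasArrived() become true immediately, so it has no reason to wait out its timeout, and Eval() keeps every row.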

There is some miscellaneous cleanup in this patch, particularly the
removal of unused members in BloomFilter.

This patch has been manually tested on queries that would otherwise take
a long time to time out. A unit test was added to ensure that queries do
not wait.

Change-Id: I04b3e6542651c1e7b77a9bab01d0e3d9506af42f
---
M be/src/benchmarks/bloom-filter-benchmark.cc
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/runtime/coordinator.cc
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/util/bloom-filter-test.cc
M be/src/util/bloom-filter.cc
M be/src/util/bloom-filter.h
M be/src/util/cpu-info.cc
M be/src/util/cpu-info.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/planner/HashJoinNode.java
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test
21 files changed, 226 insertions(+), 178 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/75/2475/11
--
To view, visit http://gerrit.cloudera.org:8080/2475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I04b3e6542651c1e7b77a9bab01d0e3d9506af42f
Gerrit-PatchSet: 11
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson
Gerrit-Reviewer: Henry Robinson
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker