impala-dev mailing list archives

From "Henry Robinson (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3007: Adjust Bloom Filter size according to NDV estimate
Date Fri, 29 Apr 2016 05:20:00 GMT
Henry Robinson has uploaded a new patch set (#7).

Change subject: IMPALA-3007: Adjust Bloom Filter size according to NDV estimate
......................................................................

IMPALA-3007: Adjust Bloom Filter size according to NDV estimate

Instead of using one default Bloom filter size for all runtime filters,
adjust the filter size according to the desired false-positive (FP) rate
and the expected NDV (number of distinct values) of the join's build
side. The filter size N is still clipped to the range 4KB < N < 16MB.
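
For illustration only, a minimal sketch of sizing a filter from an NDV
estimate and a target FP rate. It assumes the textbook Bloom filter
formula m = -n * ln(p) / (ln 2)^2 bits; the patch itself may compute the
size differently. The function and constant names are hypothetical, and
the 4KB and 16MB bounds (applied inclusively here) are read off the
range quoted in the commit message.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed bounds, taken from the "4k < N < 16MB" range in the commit message.
constexpr int64_t kMinFilterBytes = 4 * 1024;
constexpr int64_t kMaxFilterBytes = 16 * 1024 * 1024;

// Hypothetical helper: returns a filter size in bytes for `ndv` distinct
// values and a target false-positive rate `fp_rate`, clipped to the bounds.
int64_t FilterBytesForNdv(int64_t ndv, double fp_rate) {
  assert(ndv >= 0);
  assert(fp_rate > 0.0 && fp_rate < 1.0);
  const double ln2 = std::log(2.0);
  // Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits.
  const double bits = -static_cast<double>(ndv) * std::log(fp_rate) / (ln2 * ln2);
  const double bytes = std::ceil(bits / 8.0);
  // Clip while still in floating point so very large NDVs cannot overflow.
  if (bytes <= static_cast<double>(kMinFilterBytes)) return kMinFilterBytes;
  if (bytes >= static_cast<double>(kMaxFilterBytes)) return kMaxFilterBytes;
  return static_cast<int64_t>(bytes);
}
```

Under these assumptions, a very small NDV yields the 4KB floor, a very
large NDV yields the 16MB ceiling, and a lower target FP rate yields a
larger filter for the same NDV.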

If the planner's NDV estimate is -1 (i.e. no stats are available), the
default filter size is used.
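
The fallback can be sketched as follows. This is a hedged illustration,
not code from the patch: ChooseFilterBytes, kNoStatsNdv and the sizing
callback are hypothetical names, and the only behavior taken from the
commit message is that an NDV of -1 selects the default size.

```cpp
#include <cassert>
#include <cstdint>

// The planner reports -1 when it has no stats (per the commit message).
constexpr int64_t kNoStatsNdv = -1;

// Hypothetical signature for whatever routine maps an NDV to a size in bytes.
using SizeForNdvFn = int64_t (*)(int64_t ndv);

// Returns the default size when there is no NDV estimate, otherwise defers
// to the NDV-based sizing routine supplied by the caller.
int64_t ChooseFilterBytes(int64_t ndv_estimate, int64_t default_bytes,
                          SizeForNdvFn size_for_ndv) {
  assert(size_for_ndv != nullptr);
  if (ndv_estimate == kNoStatsNdv) return default_bytes;
  return size_for_ndv(ndv_estimate);
}
```

Passing the sizing routine as a parameter keeps the sketch independent
of any particular sizing formula.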

The NDV of all filters produced by the same join is currently the same
because the NDV is estimated from the cardinality of the input. In the
future, the NDV should be estimated for each filter source expr. The
backend (BE) changes anticipate this and can enable or disable
individual filters when their FP rates differ.
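
One way to picture per-filter enabling is the sketch below. It is an
assumption-laden illustration, not the patch's logic: FilterDesc,
ExpectedFpRate, ApplyFpRateLimit, the threshold and the hash count are
all hypothetical, and the FP estimate is the classic approximation
p = (1 - exp(-k * n / m))^k for k hash functions, n values and m bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-filter description; the real BE structures may differ.
struct FilterDesc {
  int64_t ndv;         // expected distinct values for this filter's source expr
  int64_t size_bytes;  // filter size after clipping
  bool enabled;
};

// Classic Bloom filter estimate: p = (1 - exp(-k * n / m))^k.
double ExpectedFpRate(int64_t ndv, int64_t size_bytes, int num_hashes) {
  assert(size_bytes > 0 && num_hashes > 0);
  const double m = 8.0 * static_cast<double>(size_bytes);
  const double k = static_cast<double>(num_hashes);
  return std::pow(1.0 - std::exp(-k * static_cast<double>(ndv) / m), k);
}

// Enables each filter independently: a filter stays on only if its own
// expected FP rate is at or below the (assumed) threshold.
void ApplyFpRateLimit(std::vector<FilterDesc>* filters, double max_fp_rate,
                      int num_hashes) {
  for (FilterDesc& f : *filters) {
    f.enabled = ExpectedFpRate(f.ndv, f.size_bytes, num_hashes) <= max_fp_rate;
  }
}
```

With per-expr NDVs, two filters from the same join could land on
opposite sides of such a threshold, which is why the decision is made
per filter rather than per join.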

Change-Id: I1fe37b8d4cfb3c52bb8e8cf0ca55e92665b87803
---
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/old-hash-table.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/runtime/runtime-filter.cc
M be/src/runtime/runtime-filter.h
M be/src/runtime/runtime-filter.inline.h
M be/src/util/bloom-filter-test.cc
M be/src/util/bloom-filter.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/com/cloudera/impala/planner/DistributedPlanner.java
M fe/src/main/java/com/cloudera/impala/planner/RuntimeFilterGenerator.java
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters_phj.test
16 files changed, 230 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/2812/7
-- 
To view, visit http://gerrit.cloudera.org:8080/2812
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1fe37b8d4cfb3c52bb8e8cf0ca55e92665b87803
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
