impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5789: Add always false flag in bloom filter
Date Tue, 03 Oct 2017 23:32:29 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
......................................................................


Patch Set 1:

(2 comments)

Will wait for the new PS but had a couple of high-level bits of feedback first. It may be
worth looking at Thomas' patches if you haven't already to make sure your changes are compatible.

http://gerrit.cloudera.org:8080/#/c/8170/1/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8170/1/be/src/exec/hdfs-scanner.cc@104
PS1, Line 104:     if (FilterContext::HasAlwaysFalseInList(FilterStats::SPLITS_KEY,
Not sure if this is relevant but there is a known problem with filtering out splits with some
scanners: IMPALA-3804


http://gerrit.cloudera.org:8080/#/c/8170/1/be/src/util/bloom-filter.h
File be/src/util/bloom-filter.h:

http://gerrit.cloudera.org:8080/#/c/8170/1/be/src/util/bloom-filter.h@106
PS1, Line 106:     if (always_false_) return 0;
I think this change will mess up some accounting in RuntimeFilterBank.

My preference is to continue to allocate the memory when the BloomFilter is constructed instead
of allocating it lazily. It would waste memory in the always_false case but I think we can
live with that because it doesn't increase worst-case memory consumption.



-- 
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Tue, 03 Oct 2017 23:32:29 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message